Dataset
- class Dataset
Bases: ExtraReprMixin
The base dataset class.
Attributes Summary
create_inverse_triples: Return whether inverse triples are created for the training factory.
entity_to_id: The mapping of entity labels to IDs.
factory_dict: Return a dictionary of the three factories.
The dataset's name.
num_entities: The number of entities.
num_relations: The number of relations.
relation_to_id: The mapping of relation labels to IDs.
Methods Summary
cli(): Run the CLI.
deteriorate(n[, random_state]): Deteriorate n triples from the dataset's training with pykeen.triples.deteriorate.deteriorate().
docdata(*parts): Get docdata for this class.
from_directory_binary(path): Load a dataset from a directory.
from_path(path[, ratios]): Create a dataset from a single triples factory by splitting it in 3.
from_tf(tf[, ratios]): Create a dataset from a single triples factory by splitting it in 3.
Get the normalized name of the dataset.
Yield extra entries for the instance's string representation.
remix([random_state]): Remix a dataset using pykeen.triples.remix.remix().
similarity(other[, metric]): Compute the similarity between two shuffles of the same dataset.
summarize([title, show_examples, file]): Print a summary of the dataset.
summary_str([title, show_examples, end]): Make a summary string of all of the factories.
to_directory_binary(path): Store a dataset to a path in binary format.
triples_pair_sort_key(pair): Get the number of triples for sorting in an iterator context.
triples_sort_key(cls): Get the number of triples for sorting.
Attributes Documentation
- create_inverse_triples
Return whether inverse triples are created for the training factory.
- entity_to_id
The mapping of entity labels to IDs.
- factory_dict
Return a dictionary of the three factories.
- num_entities
The number of entities.
- num_relations
The number of relations.
- relation_to_id
The mapping of relation labels to IDs.
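The attributes above can be read from any dataset instance. The following is a minimal sketch rather than PyKEEN code: `describe_dataset` is a hypothetical helper, and `toy_dataset` is a stand-in object that only mimics the documented attribute names so the example runs without PyKEEN installed. The `factory_dict` keys in the stand-in are placeholders.

```python
from types import SimpleNamespace


def describe_dataset(dataset):
    """Collect the documented summary attributes of a dataset-like object."""
    return {
        "num_entities": dataset.num_entities,
        "num_relations": dataset.num_relations,
        "create_inverse_triples": dataset.create_inverse_triples,
        "factories": list(dataset.factory_dict),
    }


# Hypothetical stand-in so the sketch runs without PyKEEN installed.
entity_to_id = {"berlin": 0, "germany": 1, "paris": 2, "france": 3}
relation_to_id = {"capital_of": 0}
toy_dataset = SimpleNamespace(
    entity_to_id=entity_to_id,
    relation_to_id=relation_to_id,
    num_entities=len(entity_to_id),
    num_relations=len(relation_to_id),
    create_inverse_triples=False,
    # Placeholder keys; the real keys of factory_dict are not shown above.
    factory_dict={"part_1": None, "part_2": None, "part_3": None},
)

summary = describe_dataset(toy_dataset)
print(summary)
```

Passing a real Dataset instance to the same helper should work, assuming its attributes behave as documented above.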
Methods Documentation
- deteriorate(n, random_state=None)
Deteriorate n triples from the dataset’s training with pykeen.triples.deteriorate.deteriorate().
- classmethod from_path(path, ratios=None)
Create a dataset from a single triples factory by splitting it in 3.
- static from_tf(tf, ratios=None)
Create a dataset from a single triples factory by splitting it in 3.
- Parameters:
tf (TriplesFactory)
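To illustrate what splitting a single collection of triples in 3 by ratios can look like, here is a plain-Python sketch. It is not the PyKEEN implementation: `split_in_three` is a hypothetical helper, and the default ratios `(0.8, 0.1, 0.1)` and the shuffle-then-cut strategy are assumptions made for the example.

```python
import random


def split_in_three(triples, ratios=(0.8, 0.1, 0.1), random_state=0):
    """Shuffle triples and cut them into three parts sized by ratios."""
    if len(ratios) != 3 or abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must be three fractions summing to 1")
    shuffled = list(triples)
    # A seeded generator makes the split reproducible.
    random.Random(random_state).shuffle(shuffled)
    first_cut = int(ratios[0] * len(shuffled))
    second_cut = first_cut + int(ratios[1] * len(shuffled))
    # The last part receives whatever remains after the first two cuts.
    return shuffled[:first_cut], shuffled[first_cut:second_cut], shuffled[second_cut:]


# Ten toy (head, relation, tail) triples.
triples = [(f"e{i}", "related_to", f"e{i + 1}") for i in range(10)]
parts = split_in_three(triples)
print([len(part) for part in parts])
```

Every triple ends up in exactly one part, so the three parts together always contain the original triples.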
- remix(random_state=None, **kwargs)
Remix a dataset using pykeen.triples.remix.remix().
- similarity(other, metric=None)
Compute the similarity between two shuffles of the same dataset.
- Return type:
float
- Returns:
A float of the similarity
See also
pykeen.triples.triples_factory.splits_similarity().
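As an intuition for comparing two shuffles of the same dataset, one plausible measure is the fraction of triples that land in the same split in both shuffles. The sketch below implements only that measure. It is an assumption for illustration and not necessarily the metric that splits_similarity() uses; `same_split_fraction` and the split names are hypothetical.

```python
def same_split_fraction(assignment_a, assignment_b):
    """Return the fraction of triples assigned to the same split in both shuffles.

    Each argument maps a triple to the name of the split it was placed in.
    """
    if assignment_a.keys() != assignment_b.keys():
        raise ValueError("both assignments must cover the same triples")
    if not assignment_a:
        raise ValueError("assignments must not be empty")
    same = sum(1 for triple, split in assignment_a.items() if assignment_b[triple] == split)
    return same / len(assignment_a)


# Two hypothetical shuffles of the same four triples.
shuffle_a = {
    ("a", "r", "b"): "training",
    ("b", "r", "c"): "training",
    ("c", "r", "d"): "testing",
    ("d", "r", "e"): "validation",
}
shuffle_b = {
    ("a", "r", "b"): "training",
    ("b", "r", "c"): "testing",
    ("c", "r", "d"): "testing",
    ("d", "r", "e"): "validation",
}
score = same_split_fraction(shuffle_a, shuffle_b)
print(score)
```

Under this measure, identical shuffles score 1.0 and the score drops as more triples change splits.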
- summary_str(title=None, show_examples=5, end='\n')
Make a summary string of all of the factories.