Dataset
- class Dataset
Bases: ExtraReprMixin
The base dataset class.
Attributes Summary
create_inverse_triples: Return whether inverse triples are created for the training factory.
entity_to_id: The mapping of entity labels to IDs.
factory_dict: Return a dictionary of the three factories.
The dataset's name.
num_entities: The number of entities.
num_relations: The number of relations.
relation_to_id: The mapping of relation labels to IDs.
Methods Summary
cli(): Run the CLI.
deteriorate(n[, random_state]): Deteriorate n triples from the dataset's training set with pykeen.triples.deteriorate.deteriorate().
docdata(*parts): Get docdata for this class.
from_directory_binary(path): Load a dataset from a directory.
from_path(path[, ratios]): Create a dataset from a single triples factory by splitting it in 3.
from_tf(tf[, ratios]): Create a dataset from a single triples factory by splitting it in 3.
Get the normalized name of the dataset.
Yield extra entries for the instance's string representation.
remix([random_state]): Remix a dataset using pykeen.triples.remix.remix().
similarity(other[, metric]): Compute the similarity between two shuffles of the same dataset.
summarize([title, show_examples, file]): Print a summary of the dataset.
summary_str([title, show_examples, end]): Make a summary string of all of the factories.
to_directory_binary(path): Store a dataset to a path in binary format.
triples_pair_sort_key(pair): Get the number of triples for sorting in an iterator context.
triples_sort_key(cls): Get the number of triples for sorting.
Attributes Documentation
- create_inverse_triples
Return whether inverse triples are created for the training factory.
- entity_to_id
The mapping of entity labels to IDs.
- factory_dict
Return a dictionary of the three factories.
- num_entities
The number of entities.
- num_relations
The number of relations.
- relation_to_id
The mapping of relation labels to IDs.
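To make the relationship between the mapping and count attributes concrete, here is a minimal plain-Python sketch. It does not use PyKEEN; the toy triples and the rule of assigning IDs by enumerating sorted labels are assumptions made for illustration only.

```python
# Hypothetical toy triples (head, relation, tail); not taken from any real dataset.
triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "works_at", "acme"),
]

# Assumption: IDs are assigned by enumerating the sorted, de-duplicated labels.
entities = sorted({label for h, _, t in triples for label in (h, t)})
relations = sorted({r for _, r, _ in triples})

# Analogues of entity_to_id and relation_to_id: label -> integer ID.
entity_to_id = {label: idx for idx, label in enumerate(entities)}
relation_to_id = {label: idx for idx, label in enumerate(relations)}

# Analogues of num_entities and num_relations: the sizes of the two mappings.
num_entities = len(entity_to_id)
num_relations = len(relation_to_id)
```

In this sketch the counts are simply the sizes of the two mappings, which is the relationship the attribute descriptions above suggest.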
Methods Documentation
- deteriorate(n, random_state=None)
Deteriorate n triples from the dataset's training set with pykeen.triples.deteriorate.deteriorate().
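The idea of deteriorating a training set can be sketched in plain Python. This is not PyKEEN's implementation: the helper name `deteriorate_sketch` is hypothetical, and the choice to hand the removed triples to the testing and validation splits is an assumption about the behavior, not something stated in this reference.

```python
import random


def deteriorate_sketch(training, testing, validation, n, random_state=0):
    """Remove n random triples from training and hand them to the other splits.

    Assumption: removed triples are redistributed (roughly evenly) to the
    testing and validation splits rather than discarded.
    """
    if not 0 <= n <= len(training):
        raise ValueError("n must be between 0 and the number of training triples")
    shuffled = list(training)
    random.Random(random_state).shuffle(shuffled)
    removed, kept = shuffled[:n], shuffled[n:]
    half = (n + 1) // 2  # testing gets the extra triple when n is odd
    return kept, list(testing) + removed[:half], list(validation) + removed[half:]


# Hypothetical toy data: 10 training, 2 testing, 2 validation triples.
training = [(f"e{i}", "r", f"e{i + 1}") for i in range(10)]
testing = [("a", "r", "b"), ("b", "r", "c")]
validation = [("c", "r", "d"), ("d", "r", "e")]

new_training, new_testing, new_validation = deteriorate_sketch(
    training, testing, validation, n=4
)
```

The total number of triples is unchanged in this sketch; only their assignment to splits differs, which makes the training set smaller and the evaluation sets larger.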
- classmethod from_path(path, ratios=None)
Create a dataset from a single triples factory by splitting it in 3.
- static from_tf(tf, ratios=None)
Create a dataset from a single triples factory by splitting it in 3.
- Parameters:
tf (TriplesFactory)
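Splitting a single collection of triples in 3, as from_path and from_tf describe, can be sketched in plain Python as follows. The helper `split_three` is hypothetical, and both the default ratios of (0.8, 0.1, 0.1) and the training/testing/validation order are assumptions for illustration rather than documented behavior.

```python
import random


def split_three(triples, ratios=(0.8, 0.1, 0.1), random_state=0):
    """Shuffle triples and cut them into training, testing, and validation lists."""
    if len(ratios) != 3 or abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must be three numbers that sum to 1")
    shuffled = list(triples)
    random.Random(random_state).shuffle(shuffled)
    n_train = round(ratios[0] * len(shuffled))
    n_test = round(ratios[1] * len(shuffled))
    training = shuffled[:n_train]
    testing = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # whatever remains
    return training, testing, validation


# Hypothetical toy data: 20 chained triples.
triples = [(f"e{i}", "r", f"e{i + 1}") for i in range(20)]
training, testing, validation = split_three(triples)
```

Giving the validation split "whatever remains" keeps every triple in exactly one split even when the ratios do not divide the triple count evenly.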
- remix(random_state=None, **kwargs)
Remix a dataset using pykeen.triples.remix.remix().
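As a rough illustration of what remixing means, the plain-Python sketch below pools all splits, shuffles them, and cuts the pool back into splits of the original sizes. The helper `remix_sketch` is hypothetical, and preserving the original split sizes is an assumption; the actual behavior is defined by pykeen.triples.remix.remix().

```python
import random


def remix_sketch(*splits, random_state=0):
    """Pool all splits, shuffle, and re-cut into splits of the original sizes."""
    pooled = [triple for split in splits for triple in split]
    random.Random(random_state).shuffle(pooled)
    remixed, start = [], 0
    for split in splits:
        stop = start + len(split)
        remixed.append(pooled[start:stop])
        start = stop
    return remixed


# Hypothetical toy data: 6 training, 2 testing, 2 validation triples.
all_triples = [(f"e{i}", "r", f"e{i + 1}") for i in range(10)]
training, testing, validation = all_triples[:6], all_triples[6:8], all_triples[8:]

new_training, new_testing, new_validation = remix_sketch(
    training, testing, validation, random_state=42
)
```

A different random_state yields a different assignment of the same triples, which is what makes it meaningful to compare two shuffles of one dataset.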
- similarity(other, metric=None)
Compute the similarity between two shuffles of the same dataset.
- Returns:
A float of the similarity.
See also pykeen.triples.triples_factory.splits_similarity().
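One simple way to quantify how similar two shuffles are is the fraction of triples that land in the same split in both. The sketch below uses that measure; the helper `split_similarity_sketch` is hypothetical, and this measure is an assumption chosen for illustration, not necessarily the metric PyKEEN applies when metric is None.

```python
def split_similarity_sketch(a, b):
    """Return the fraction of triples assigned to the same split in a and b.

    a and b are sequences of splits (each a collection of hashable triples)
    over the same underlying set of triples.
    """
    if len(a) != len(b):
        raise ValueError("both shuffles must have the same number of splits")
    total = sum(len(split) for split in a)
    shared = sum(len(set(x) & set(y)) for x, y in zip(a, b))
    return shared / total


# Hypothetical toy triples.
t1, t2, t3, t4 = ("a", "r", "b"), ("b", "r", "c"), ("c", "r", "d"), ("d", "r", "e")

shuffle_a = ([t1, t2, t3], [t4])
shuffle_b = ([t1, t2, t4], [t3])

same = split_similarity_sketch(shuffle_a, shuffle_a)
different = split_similarity_sketch(shuffle_a, shuffle_b)
```

Under this measure, identical shuffles score 1.0, and the score falls as more triples change splits.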
- summary_str(title=None, show_examples=5, end='\n')
Make a summary string of all of the factories.