Dataset
- class Dataset[source]
Bases: object
The base dataset class.
Attributes Summary
entity_to_id: The mapping of entity labels to IDs.
factory_dict: Return a dictionary of the three factories.
the dataset's name
num_entities: The number of entities.
num_relations: The number of relations.
relation_to_id: The mapping of relation labels to IDs.
Methods Summary
cli(): Run the CLI.
deteriorate(n[, random_state]): Deteriorate n triples from the dataset's training with pykeen.triples.deteriorate.deteriorate().
docdata(*parts): Get docdata for this class.
from_directory_binary(path): Load a dataset from a directory.
from_path(path[, ratios]): Create a dataset from a single triples factory by splitting it in 3.
from_tf(tf[, ratios]): Create a dataset from a single triples factory by splitting it in 3.
Get the normalized name of the dataset.
remix([random_state]): Remix a dataset using pykeen.triples.remix.remix().
similarity(other[, metric]): Compute the similarity between two shuffles of the same dataset.
summarize([title, show_examples, file]): Print a summary of the dataset.
summary_str([title, show_examples, end]): Make a summary string of all of the factories.
to_directory_binary(path): Store a dataset to a path in binary format.
triples_pair_sort_key(pair): Get the number of triples for sorting in an iterator context.
triples_sort_key(cls): Get the number of triples for sorting.
Attributes Documentation
- entity_to_id
The mapping of entity labels to IDs.
- factory_dict
Return a dictionary of the three factories.
- Return type
- num_entities
The number of entities.
- num_relations
The number of relations.
- relation_to_id
The mapping of relation labels to IDs.
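The attributes above can be illustrated with a minimal, library-free sketch. It assumes that IDs are consecutive integers assigned in sorted label order; that ordering, the helper `build_label_to_id`, and the example triples are hypothetical and are not taken from the PyKEEN source.

```python
def build_label_to_id(labels):
    """Map each distinct label to a consecutive integer ID (sorted order is an assumption)."""
    return {label: idx for idx, label in enumerate(sorted(set(labels)))}


# Hypothetical (head, relation, tail) triples used only for illustration.
triples = [
    ("brazil", "exports_to", "uk"),
    ("uk", "exports_to", "usa"),
    ("usa", "allied_with", "uk"),
]

# Entities are collected from both the head and the tail position.
entity_to_id = build_label_to_id(e for h, _, t in triples for e in (h, t))
relation_to_id = build_label_to_id(r for _, r, _ in triples)

# The counts are simply the sizes of the two mappings.
num_entities = len(entity_to_id)
num_relations = len(relation_to_id)
```

On a real `Dataset` instance, these values would be read from the attributes documented above instead of being rebuilt by hand.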
Methods Documentation
- deteriorate(n, random_state=None)[source]
Deteriorate n triples from the dataset’s training with pykeen.triples.deteriorate.deteriorate().
- Return type
- classmethod from_path(path, ratios=None)[source]
Create a dataset from a single triples factory by splitting it in 3.
- Return type
- static from_tf(tf, ratios=None)[source]
Create a dataset from a single triples factory by splitting it in 3.
- Return type
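What "splitting it in 3" by `ratios` could mean is sketched below in plain Python. This is not PyKEEN's implementation: the function `split_in_three`, the default ratios, the split order (training, testing, validation), and the rounding rule are all assumptions made for illustration.

```python
import random


def split_in_three(triples, ratios=(0.8, 0.1, 0.1), random_state=0):
    """Shuffle triples and cut them into three parts according to ratios (illustrative only)."""
    if len(ratios) != 3 or abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must be three numbers that sum to 1")
    shuffled = list(triples)
    # A seeded generator makes the split reproducible.
    random.Random(random_state).shuffle(shuffled)
    n_first = round(ratios[0] * len(shuffled))
    n_second = round(ratios[1] * len(shuffled))
    first = shuffled[:n_first]
    second = shuffled[n_first:n_first + n_second]
    third = shuffled[n_first + n_second:]  # the remainder goes to the last part
    return first, second, third


# Ten hypothetical triples, split 8 / 1 / 1.
triples = [(f"e{i}", "r", f"e{i + 1}") for i in range(10)]
training, testing, validation = split_in_three(triples)
```

Every triple lands in exactly one part, so the three parts together always reproduce the input.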
- remix(random_state=None, **kwargs)[source]
Remix a dataset using pykeen.triples.remix.remix().
- Return type
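One plausible reading of "remixing" is to pool all splits, reshuffle, and cut the pool back into parts of the original sizes. The sketch below follows that reading only; the helper `remix_splits` is hypothetical, and the behavior of `pykeen.triples.remix.remix()` should be checked in its own documentation.

```python
import random


def remix_splits(splits, random_state=0):
    """Pool all splits, shuffle, and re-cut into parts with the original sizes (assumed behavior)."""
    sizes = [len(split) for split in splits]
    pooled = [triple for split in splits for triple in split]
    random.Random(random_state).shuffle(pooled)
    remixed, start = [], 0
    for size in sizes:
        remixed.append(pooled[start:start + size])
        start += size
    return remixed


# Hypothetical splits used only for illustration.
training = [("a", "r", "b"), ("b", "r", "c"), ("c", "r", "d")]
testing = [("d", "r", "e")]
validation = [("e", "r", "f")]

new_training, new_testing, new_validation = remix_splits(
    [training, testing, validation], random_state=42
)
```

The split sizes and the overall set of triples are unchanged; only the assignment of triples to splits differs.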
- similarity(other, metric=None)[source]
Compute the similarity between two shuffles of the same dataset.
- Parameters
- Return type
- Returns
A float of the similarity
See also
pykeen.triples.triples_factory.splits_similarity().
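As a rough intuition for comparing two shuffles of the same dataset, the sketch below scores the overlap of their training sets. The measure and the function `training_overlap` are assumptions chosen for illustration; they are not claimed to be the default metric behind `similarity()` or `splits_similarity()`.

```python
def training_overlap(a_training, b_training):
    """Return the share of training triples that two shuffles have in common (illustrative metric)."""
    a, b = set(a_training), set(b_training)
    if not a and not b:
        return 1.0  # two empty training sets are treated as identical
    return len(a & b) / max(len(a), len(b))


# Two hypothetical shuffles that differ in one training triple.
shuffle_a = [("a", "r", "b"), ("b", "r", "c"), ("c", "r", "d"), ("d", "r", "e")]
shuffle_b = [("a", "r", "b"), ("b", "r", "c"), ("c", "r", "d"), ("e", "r", "f")]

score = training_overlap(shuffle_a, shuffle_b)
```

A score of 1.0 means the training sets are identical, while lower values indicate that the two shuffles assigned more triples differently.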
- summarize(title=None, show_examples=5, file=None)[source]
Print a summary of the dataset.
- Return type
- summary_str(title=None, show_examples=5, end='\n')[source]
Make a summary string of all of the factories.
- Return type