Dataset

class Dataset

Contains a lazy reference to a training, testing, and validation dataset.

Attributes Summary

`entity_to_id`	The mapping of entity labels to IDs.
`num_entities`	The number of entities.
`num_relations`	The number of relations.
`relation_to_id`	The mapping of relation labels to IDs.
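
As a rough illustration of how these four attributes relate, the following minimal sketch builds label-to-ID mappings from a handful of labeled triples. It is not PyKEEN's implementation: the helper `build_mappings` and the example triples are hypothetical, and assigning IDs in sorted label order is an assumption made for determinism.

```python
from typing import Dict, Iterable, Tuple

Triple = Tuple[str, str, str]  # (head label, relation label, tail label)


def build_mappings(triples: Iterable[Triple]) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Hypothetical helper: map entity and relation labels to consecutive integer IDs."""
    triples = list(triples)
    # Entities appear as heads or tails; relations appear in the middle position.
    entities = sorted({label for h, _, t in triples for label in (h, t)})
    relations = sorted({r for _, r, _ in triples})
    entity_to_id = {label: i for i, label in enumerate(entities)}
    relation_to_id = {label: i for i, label in enumerate(relations)}
    return entity_to_id, relation_to_id


# Made-up example triples, for illustration only.
triples = [
    ("brazil", "exports_to", "uk"),
    ("uk", "exports_to", "usa"),
    ("usa", "allied_with", "uk"),
]
entity_to_id, relation_to_id = build_mappings(triples)
num_entities = len(entity_to_id)      # analogous to Dataset.num_entities
num_relations = len(relation_to_id)   # analogous to Dataset.num_relations
```

In this sketch the counts are simply the sizes of the two mappings, which is how the summary above describes them.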

Methods Summary

`cli`()	Run the CLI.
`deteriorate`(n[, random_state])	Deteriorate n triples from the dataset’s training triples with `pykeen.triples.deteriorate.deteriorate()`.
`from_path`(path[, ratios])	Create a dataset from a triples file at the given path by loading it into a single triples factory and splitting it in 3.
`from_tf`(tf[, ratios])	Create a dataset from a single triples factory by splitting it in 3.
`get_normalized_name`()	Get the normalized name of the dataset.
`remix`([random_state])	Remix a dataset using `pykeen.triples.remix.remix()`.
`similarity`(other[, metric])	Compute the similarity between two shuffles of the same dataset.
`summarize`([title, file])	Print a summary of the dataset.
`summary_str`([title, show_examples, end])	Make a summary string of all of the factories.

Attributes Documentation

Methods Documentation

classmethod cli()

Run the CLI.

deteriorate(n, random_state=None)

Deteriorate n triples from the dataset’s training triples with pykeen.triples.deteriorate.deteriorate().
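
To make the idea concrete, here is a minimal sketch under the assumption that deteriorating means removing n triples from training and redistributing them among the other splits, so that the total number of triples is unchanged. The function `deteriorate_sketch` and its round-robin redistribution are hypothetical and use only the standard library; consult pykeen.triples.deteriorate.deteriorate() for the actual behavior.

```python
import random
from typing import List, Optional, Sequence, Tuple

Triple = Tuple[str, str, str]


def deteriorate_sketch(
    training: Sequence[Triple],
    others: Sequence[Sequence[Triple]],
    n: int,
    random_state: Optional[int] = None,
) -> Tuple[List[Triple], List[List[Triple]]]:
    """Hypothetical sketch: move n random training triples into the other splits."""
    if not others:
        raise ValueError("at least one other split is required")
    if not 0 <= n <= len(training):
        raise ValueError("n must be between 0 and the number of training triples")
    rng = random.Random(random_state)
    shuffled = list(training)
    rng.shuffle(shuffled)
    moved, kept = shuffled[:n], shuffled[n:]
    new_others = [list(split) for split in others]
    # Assumption: distribute the removed triples round-robin over the other splits.
    for i, triple in enumerate(moved):
        new_others[i % len(new_others)].append(triple)
    return kept, new_others


# Made-up splits, for illustration only.
training = [(f"e{i}", "r", f"e{i + 1}") for i in range(10)]
testing = [("a", "r", "b")]
validation = [("c", "r", "d")]
kept, (new_testing, new_validation) = deteriorate_sketch(
    training, [testing, validation], n=4, random_state=0
)
```

Passing a fixed `random_state` makes the selection reproducible, which mirrors the role the parameter plays in the signature above.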

classmethod from_path(path, ratios=None)

Create a dataset from a triples file at the given path by loading it into a single triples factory and splitting it in 3.

static from_tf(tf, ratios=None)

Create a dataset from a single triples factory by splitting it in 3.
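
The three-way split can be pictured with the following sketch, which shuffles a list of triples and cuts it by ratio. It is an illustration only: `split_in_three` is a hypothetical helper, the 80/10/10 default is an assumption (this page does not state what `ratios=None` resolves to), and PyKEEN's own splitting logic may differ, for example in how it treats entities that would otherwise be missing from training.

```python
import random
from typing import List, Optional, Sequence, Tuple

Triple = Tuple[str, str, str]


def split_in_three(
    triples: Sequence[Triple],
    ratios: Tuple[float, float, float] = (0.8, 0.1, 0.1),  # assumed default
    random_state: Optional[int] = None,
) -> Tuple[List[Triple], List[Triple], List[Triple]]:
    """Hypothetical sketch: shuffle triples and cut them into train/test/validation."""
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    shuffled = list(triples)
    random.Random(random_state).shuffle(shuffled)
    n_train = int(ratios[0] * len(shuffled))
    n_test = int(ratios[1] * len(shuffled))
    training = shuffled[:n_train]
    testing = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # the remainder absorbs rounding
    return training, testing, validation


# Made-up triples, for illustration only.
triples = [(f"e{i}", "r", f"e{i + 1}") for i in range(20)]
training, testing, validation = split_in_three(triples, random_state=42)
```

Every input triple ends up in exactly one of the three splits, so the splits are disjoint as long as the input contains no duplicates.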

classmethod get_normalized_name()

Get the normalized name of the dataset.

remix(random_state=None, **kwargs)

Remix a dataset using pykeen.triples.remix.remix().

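One plausible reading of remixing a dataset, sketched below, is to pool the triples of all splits, shuffle them, and cut the pool back into splits of the original sizes. This is an assumption about the behavior, not a description of pykeen.triples.remix.remix(); `remix_sketch` is a hypothetical standard-library helper.

```python
import random
from typing import List, Optional, Sequence, Tuple

Triple = Tuple[str, str, str]


def remix_sketch(
    splits: Sequence[Sequence[Triple]],
    random_state: Optional[int] = None,
) -> List[List[Triple]]:
    """Hypothetical sketch: reshuffle triples across splits, preserving split sizes."""
    pool = [triple for split in splits for triple in split]
    random.Random(random_state).shuffle(pool)
    remixed, start = [], 0
    for split in splits:
        # Each new split takes as many triples from the pool as the old one held.
        remixed.append(pool[start:start + len(split)])
        start += len(split)
    return remixed


# Made-up splits, for illustration only.
all_triples = [(f"e{i}", "r", f"e{i + 1}") for i in range(10)]
original = [all_triples[:6], all_triples[6:8], all_triples[8:]]
new_training, new_testing, new_validation = remix_sketch(original, random_state=1)
```
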
similarity(other, metric=None)

Compute the similarity between two shuffles of the same dataset.

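Parameters

other – The other shuffle of the same dataset to compare against.

metric – The similarity metric to use (optional; defaults to None).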

Return type

float

Returns

The similarity between the two shuffles, as a float.
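
As an illustration of comparing two shuffles, the sketch below scores each pair of corresponding splits by set overlap. The choice of the Jaccard (Tanimoto) coefficient and of averaging over the splits is an assumption made for this example; this page does not say which metric `metric=None` selects or how per-split scores are combined, and `splits_similarity_sketch` is a hypothetical helper.

```python
from typing import Sequence, Set, Tuple

Triple = Tuple[str, str, str]


def jaccard(a: Set[Triple], b: Set[Triple]) -> float:
    """Jaccard (Tanimoto) coefficient of two sets; defined as 1.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def splits_similarity_sketch(
    a: Sequence[Sequence[Triple]],
    b: Sequence[Sequence[Triple]],
) -> float:
    """Hypothetical sketch: mean Jaccard overlap of corresponding splits."""
    if not a or len(a) != len(b):
        raise ValueError("both shuffles need the same, non-zero number of splits")
    scores = [jaccard(set(x), set(y)) for x, y in zip(a, b)]
    return sum(scores) / len(scores)


# Two made-up shuffles of the same six triples (training, testing, validation).
t = [(f"e{i}", "r", f"e{i + 1}") for i in range(6)]
shuffle_a = [t[:4], t[4:5], t[5:]]
shuffle_b = [t[:3] + t[4:5], t[3:4], t[5:]]
identical = splits_similarity_sketch(shuffle_a, shuffle_a)
partial = splits_similarity_sketch(shuffle_a, shuffle_b)
```

Under these assumptions a shuffle compared with itself scores 1.0, and the score drops as triples move between splits.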