TarFileSingleDataset

class TarFileSingleDataset(url, relative_path, name=None, cache_root=None, eager=False, create_inverse_triples=False, delimiter=None, random_state=None, randomize_cleanup=False)

Bases: pykeen.datasets.base.LazyDataset

Loads a dataset that’s a single file inside a tar.gz archive.

Initialize dataset.

Parameters
  • url (str) – The URL from which to download the dataset.

  • name (Optional[str]) – The name of the file. If not given, the name is inferred from the end of the URL.

  • cache_root (Optional[str]) – An optional directory to store the extracted files. If none is given, the default PyKEEN directory is used. This is defined either by the environment variable PYKEEN_HOME or defaults to ~/.pykeen.

  • relative_path (str) – The path inside the archive to the contained dataset.

  • random_state (Union[None, int, Generator]) – An optional random state to make the training/testing/validation split reproducible.

  • delimiter (Optional[str]) – The delimiter for the contained dataset.
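
The roles of cache_root, relative_path and delimiter can be illustrated with a minimal standard-library sketch. It is not PyKEEN's implementation: the helper names resolve_cache_root and read_member are hypothetical, and the tab default for the delimiter is an assumption made only for this example.

```python
import csv
import io
import os
import tarfile
from pathlib import Path


def resolve_cache_root(cache_root=None):
    """Hypothetical helper: choose the cache directory as the cache_root entry describes."""
    if cache_root is not None:
        return Path(cache_root)
    # Fall back to the PYKEEN_HOME environment variable, then to ~/.pykeen.
    return Path(os.environ.get("PYKEEN_HOME", "~/.pykeen")).expanduser()


def read_member(archive_path, relative_path, delimiter="\t"):
    """Hypothetical helper: read one delimited file from inside a tar.gz archive."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        # relative_path is the path of the dataset file inside the archive.
        member = archive.extractfile(relative_path)
        if member is None:  # the member exists but is not a regular file
            raise FileNotFoundError(relative_path)
        text = io.TextIOWrapper(member, encoding="utf-8")
        # Each row becomes a list of fields split on the delimiter.
        return list(csv.reader(text, delimiter=delimiter))
```

For an archive containing a tab-separated file at data/triples.tsv, read_member(archive_path, "data/triples.tsv") would return one list of fields per line of that file.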

Attributes Summary

ratios

Attributes Documentation

ratios = (0.8, 0.1, 0.1)
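
As a rough illustration of how ratios and random_state interact, the sketch below splits a list of triples 80/10/10 with a seeded shuffle. It assumes the three values correspond to training, testing and validation in that order, following the order given in the random_state description. The function split_triples is hypothetical and uses Python's random module rather than the NumPy Generator the real parameter accepts.

```python
import random

RATIOS = (0.8, 0.1, 0.1)  # assumed order: training, testing, validation


def split_triples(triples, ratios=RATIOS, random_state=None):
    """Hypothetical helper: shuffle triples and cut them into three parts."""
    rng = random.Random(random_state)  # an int seed makes the split reproducible
    shuffled = list(triples)
    rng.shuffle(shuffled)
    n_train = int(ratios[0] * len(shuffled))
    n_test = int(ratios[1] * len(shuffled))
    training = shuffled[:n_train]
    testing = shuffled[n_train:n_train + n_test]
    # Whatever remains after the first two cuts goes to validation.
    validation = shuffled[n_train + n_test:]
    return training, testing, validation
```

Calling split_triples twice with the same integer random_state yields the same three lists, which is the reproducibility the random_state parameter is meant to provide.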