SingleTabbedDataset

class SingleTabbedDataset(url, name=None, cache_root=None, eager=False, create_inverse_triples=False, random_state=None, download_kwargs=None, read_csv_kwargs=None)

Bases: TabbedDataset

Use this class when you have a single TSV file of edges and want it split automatically into training, testing, and validation sets.

Initialize dataset.

Parameters:
  • url (str) – The URL from which to download the dataset.

  • name (Optional[str]) – The name of the file. If not given, the name is inferred from the end of the URL.

  • cache_root (Optional[str]) – An optional directory in which to store the extracted files. If none is given, the default PyKEEN directory is used. This is defined either by the environment variable PYKEEN_HOME or defaults to ~/.pykeen.

  • eager (bool) – Should the data be loaded eagerly? Defaults to False.

  • create_inverse_triples (bool) – Should inverse triples be created? Defaults to False.

  • random_state (Union[None, int, Generator]) – An optional random state to make the training/testing/validation split reproducible.

  • download_kwargs (Optional[Dict[str, Any]]) – Keyword arguments to pass through to pystow.utils.download().

  • read_csv_kwargs (Optional[Dict[str, Any]]) – Keyword arguments to pass through to pandas.read_csv().
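
The defaults described for name and cache_root can be illustrated with a small standard-library sketch. This is not PyKEEN's own code: the helper names infer_name and resolve_cache_root are hypothetical, and the only behaviour taken from the parameter descriptions is "name from the end of the URL" and "PYKEEN_HOME, else ~/.pykeen".

```python
import os
from pathlib import Path
from typing import Optional
from urllib.parse import urlparse


def infer_name(url: str) -> str:
    """Hypothetical helper: use the last path segment of the URL as the file name."""
    path = urlparse(url).path  # drops any query string or fragment
    return path.rstrip("/").rsplit("/", 1)[-1]


def resolve_cache_root(cache_root: Optional[str] = None) -> Path:
    """Hypothetical helper: explicit directory first, then PYKEEN_HOME, then ~/.pykeen."""
    if cache_root is not None:
        return Path(cache_root)
    return Path(os.environ.get("PYKEEN_HOME", "~/.pykeen")).expanduser()
```

For instance, under these assumptions a URL ending in /data/edges.tsv would yield the name edges.tsv, and calling resolve_cache_root() with PYKEEN_HOME unset would point at the .pykeen directory in the user's home.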

Raises:

ValueError – if there’s no URL specified and there is no data already at the calculated path

Attributes Summary

ratios

Attributes Documentation

ratios: ClassVar[Sequence[float]] = (0.8, 0.1, 0.1)
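
To make the role of ratios and random_state concrete, here is a minimal sketch of a seeded shuffle-and-slice split. It is an illustration only, not the splitting procedure PyKEEN actually uses. The function name split_triples is hypothetical, and the order of the ratios (training, testing, validation) is assumed from the description of random_state.

```python
import random
from typing import List, Optional, Sequence, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


def split_triples(
    triples: Sequence[Triple],
    ratios: Sequence[float] = (0.8, 0.1, 0.1),
    seed: Optional[int] = None,
) -> Tuple[List[Triple], List[Triple], List[Triple]]:
    """Hypothetical helper: shuffle with a seeded RNG, then slice by the ratios."""
    rng = random.Random(seed)  # a fixed seed makes the split reproducible
    shuffled = list(triples)
    rng.shuffle(shuffled)

    n_train = int(ratios[0] * len(shuffled))
    n_test = int(ratios[1] * len(shuffled))

    training = shuffled[:n_train]
    testing = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # whatever remains
    return training, testing, validation
```

With the same seed, two calls on the same triples return identical partitions, which is the property random_state is meant to provide.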