Datasets¶
Sample datasets for use with PyKEEN, borrowed from https://github.com/ZhenfengLei/KGDatasets.
New datasets (inheriting from pykeen.datasets.base.DataSet
) can be registered with PyKEEN using the
pykeen.datasets group in Python entrypoints in your own setup.py or setup.cfg package configuration.
They are loaded automatically with pkg_resources.iter_entry_points()
.
Functions¶
|
Get the dataset. |
|
Return if the dataset is registered in PyKEEN. |
Classes¶
|
The Hetionet dataset is a large biological network. |
|
The Kinships data set. |
|
The Nations data set. |
|
The OpenBioLink dataset. |
|
The PyKEEN First Filtered OpenBioLink 2020 Dataset. |
|
The PyKEEN Second Filtered OpenBioLink 2020 Dataset. |
|
The low-quality variant of the OpenBioLink dataset. |
|
The UMLS data set. |
|
The FB15k data set. |
|
The FB15k-237 data set. |
|
The WN18 data set. |
|
The WN18-RR data set. |
|
The YAGO3-10 data set is a subset of YAGO3 that only contains entities with at least 10 relations. |
Class Inheritance Diagram¶
Utility classes for constructing datasets.
Classes¶
|
Contains a lazy reference to a training, testing, and validation data set. |
|
A dataset that has already been loaded. |
A data set that has lazy loading. |
|
|
Contains a lazy reference to a training, testing, and validation data set. |
|
Contains a lazy reference to a remote dataset that is loaded if needed. |
|
A remote dataset stored as a tar file. |
|
A remote dataset stored as a zip file. |
|
Contains a lazy reference to a remote dataset that is loaded if needed. |
|
This class is for when you’ve got a single TSV of edges and want them to get auto-split. |