Datasets
pykeen.datasets Package
Sample datasets for use with PyKEEN, borrowed from https://github.com/ZhenfengLei/KGDatasets.
New datasets (inheriting from pykeen.datasets.base.Dataset) can be registered with PyKEEN using the pykeen.datasets group in Python entry points in your own setup.py or setup.cfg package configuration. They are loaded automatically with pkg_resources.iter_entry_points().
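As a rough illustration of the registration described above, the sketch below embeds a setup.cfg fragment as a string and parses it with the standard library to show the shape of an entry in the pykeen.datasets group. The names my_dataset, my_package.datasets, and MyDataset are hypothetical placeholders, not part of PyKEEN, and the parsing only mimics the usual name = module:attribute convention; it is not how PyKEEN itself loads entry points.

```python
import configparser

# Hypothetical setup.cfg fragment; "my_dataset", "my_package.datasets" and
# "MyDataset" are placeholders chosen for illustration only.
SETUP_CFG = """\
[options.entry_points]
pykeen.datasets =
    my_dataset = my_package.datasets:MyDataset
"""


def parse_dataset_entry_points(cfg_text):
    """Return {name: (module, attribute)} for the pykeen.datasets group."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    raw = parser["options.entry_points"]["pykeen.datasets"]
    entries = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue  # skip the empty first line of the multi-line value
        name, target = (part.strip() for part in line.split("=", 1))
        module, attribute = target.split(":", 1)
        entries[name] = (module, attribute)
    return entries


entries = parse_dataset_entry_points(SETUP_CFG)
print(entries)
```

In a real package, MyDataset would be a class inheriting from pykeen.datasets.base.Dataset, and the fragment would live in the package's own setup.cfg rather than in a string.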
Functions
|
Get the dataset. |
|
Return whether the dataset is registered in PyKEEN. |
Classes
|
The Hetionet dataset is a large biological network. |
|
The Kinships dataset. |
|
The Nations dataset. |
|
The OpenBioLink dataset. |
|
The low-quality variant of the OpenBioLink dataset. |
|
The CoDEx small dataset. |
|
The CoDEx medium dataset. |
|
The CoDEx large dataset. |
|
The OGB BioKG dataset. |
|
The OGB WikiKG dataset. |
|
The UMLS dataset. |
|
The FB15k dataset. |
|
The FB15k-237 dataset. |
|
The WK3l-15k dataset family. |
|
The WN18 dataset. |
|
The WN18-RR dataset. |
|
The YAGO3-10 dataset is a subset of YAGO3 that only contains entities with at least 10 relations. |
|
The DRKG dataset. |
|
The ConceptNet dataset from [speer2017]. |
|
The Clinical Knowledge Graph (CKG) dataset from [santos2020]. |
|
The CSKG dataset. |
|
The DBpedia50 dataset. |
|
The DB100K dataset from [ding2018]. |
|
The Countries dataset. |
Class Inheritance Diagram
pykeen.datasets.base Module
Utility classes for constructing datasets.
Functions
|
Calculate the similarity between two datasets. |
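The summary above does not say which similarity measure is used. Purely as an assumption for illustration, the sketch below compares two datasets by the Jaccard overlap of their (head, relation, tail) triple sets. The function name triple_jaccard and the toy triples are hypothetical and do not describe PyKEEN's implementation.

```python
def triple_jaccard(triples_a, triples_b):
    """Jaccard overlap of two collections of (head, relation, tail) triples.

    Assumption: similarity is measured on the sets of triples; other choices
    (entity overlap, relation overlap, ...) are equally plausible.
    """
    set_a, set_b = set(triples_a), set(triples_b)
    union = set_a | set_b
    if not union:
        return 1.0  # convention chosen here: two empty datasets are identical
    return len(set_a & set_b) / len(union)


# Toy data, invented for this example.
dataset_a = [("brazil", "borders", "peru"), ("peru", "borders", "chile")]
dataset_b = [("brazil", "borders", "peru"), ("chile", "borders", "argentina")]

similarity = triple_jaccard(dataset_a, dataset_b)
print(similarity)
```

The two toy datasets share one triple out of three distinct triples overall, so the score is one third.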
Classes
|
Contains a lazy reference to a training, testing, and validation dataset. |
|
A dataset that has already been loaded. |
|
A dataset that has lazy loading. |
|
Contains a lazy reference to a training, testing, and validation dataset. |
|
Contains a lazy reference to a remote dataset that is loaded if needed. |
|
A dataset with all three of train, test, and validation sets as URLs. |
|
A remote dataset stored as a tar file. |
|
Contains a lazy reference to a remote dataset that is loaded if needed. |
|
Loads a dataset that’s a single file inside a tar.gz archive. |
|
This class is for when you’ve got a single TSV of edges and want them to get auto-split. |
|
This class is for when you’ve got a single TSV of edges and want them to get auto-split. |
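To make the ideas of lazy loading and auto-splitting a single TSV of edges more concrete, here is a minimal sketch. LazySplitTriples is a hypothetical stand-in, not a PyKEEN class; the 80/10/10 ratios, the fixed seed, and the plain random shuffle are assumptions, and a real splitter may apply extra constraints that this one ignores.

```python
import csv
import io
import random
from functools import cached_property


class LazySplitTriples:
    """Hypothetical example: parse and split a TSV of edges on first access."""

    def __init__(self, tsv_text, ratios=(0.8, 0.1, 0.1), seed=0):
        self.tsv_text = tsv_text
        self.ratios = ratios  # assumed train/test/validation proportions
        self.seed = seed

    @cached_property
    def _splits(self):
        # Runs only once, the first time any of the three subsets is requested.
        reader = csv.reader(io.StringIO(self.tsv_text), delimiter="\t")
        rows = [tuple(row) for row in reader if row]
        random.Random(self.seed).shuffle(rows)
        n_train = int(self.ratios[0] * len(rows))
        n_test = int(self.ratios[1] * len(rows))
        return (
            rows[:n_train],
            rows[n_train:n_train + n_test],
            rows[n_train + n_test:],
        )

    @property
    def training(self):
        return self._splits[0]

    @property
    def testing(self):
        return self._splits[1]

    @property
    def validation(self):
        return self._splits[2]


# Ten toy edges of the form head <TAB> relation <TAB> tail.
tsv_text = "\n".join(f"e{i}\tr\te{i + 1}" for i in range(10))
dataset = LazySplitTriples(tsv_text)
loaded_before_access = "_splits" in vars(dataset)
sizes = (len(dataset.training), len(dataset.testing), len(dataset.validation))
print(loaded_before_access, sizes)
```

Nothing is parsed when the object is constructed; the work happens when training, testing, or validation is first read, and the result is cached for later accesses.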
Class Inheritance Diagram
pykeen.datasets.analysis Module
Dataset analysis utilities.
Functions
|
Create a dataframe with relation counts. |
|
Create a dataframe with entity counts. |
|
Create a dataframe of entity/relation co-occurrence. |
|
Calculate the functionality and inverse functionality score per relation. |
|
Categorize relations based on patterns from RotatE [sun2019]. |
|
Determine the relation cardinality types. |
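As an illustration of the per-relation functionality and inverse functionality scores listed above, the sketch below uses one common definition: functionality is the number of distinct heads divided by the number of distinct (head, tail) pairs for a relation, and inverse functionality is the same ratio computed with distinct tails. Whether PyKEEN uses exactly this definition is an assumption; the function name relation_functionality and the toy triples are hypothetical.

```python
from collections import defaultdict


def relation_functionality(triples):
    """Return {relation: (functionality, inverse_functionality)}.

    Assumed definition: distinct heads (or tails) divided by the number of
    distinct (head, tail) pairs observed for the relation.
    """
    heads = defaultdict(set)
    tails = defaultdict(set)
    pairs = defaultdict(set)
    for head, relation, tail in triples:
        heads[relation].add(head)
        tails[relation].add(tail)
        pairs[relation].add((head, tail))
    return {
        relation: (
            len(heads[relation]) / len(pairs[relation]),
            len(tails[relation]) / len(pairs[relation]),
        )
        for relation in pairs
    }


# Toy triples, invented for this example.
triples = [
    ("alice", "born_in", "paris"),
    ("bob", "born_in", "paris"),
    ("carol", "born_in", "rome"),
    ("alice", "has_child", "dave"),
    ("alice", "has_child", "erin"),
]

scores = relation_functionality(triples)
print(scores)
```

Here born_in has functionality 1.0 because every head has exactly one tail, while has_child has functionality 0.5 and inverse functionality 1.0 because one head points to two distinct tails. Scores close to 1.0 in both directions suggest a one-to-one relation, which is the intuition behind the cardinality types in the last row.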