Datasets
pykeen.datasets Package
Built-in datasets for PyKEEN.
New datasets (inheriting from pykeen.datasets.Dataset) can be registered with PyKEEN using the pykeen.datasets group in Python entrypoints in your own setup.py or setup.cfg package configuration. They are loaded automatically with pkg_resources.iter_entry_points().
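As a rough illustration of the discovery step, the sketch below lists whatever is registered under the pykeen.datasets group. It uses the standard-library importlib.metadata as a stand-in for pkg_resources.iter_entry_points(), and the helper name iter_dataset_entry_points is an assumption made for this example. The setup.cfg fragment in the comment uses hypothetical package and class names.

```python
from importlib import metadata

# A hypothetical registration in setup.cfg could look like this
# (my_package and MyDataset are placeholder names):
#
#   [options.entry_points]
#   pykeen.datasets =
#       my_dataset = my_package.datasets:MyDataset

GROUP = "pykeen.datasets"  # entry point group named in the text above


def iter_dataset_entry_points(group=GROUP):
    """Return the entry points registered under ``group`` as a list."""
    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ exposes a selectable collection
        return list(eps.select(group=group))
    # Python 3.8/3.9 return a mapping from group name to entry points
    return list(eps.get(group, []))


registered = iter_dataset_entry_points()
for ep in registered:
    # ep.name is the registered key, ep.value the "module:attribute" target
    print(f"{ep.name} -> {ep.value}")
```

If no installed package declares the group, the list is simply empty.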
Functions
Get a dataset, cached based on the given kwargs.
Return if the dataset is registered in PyKEEN.
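The idea of caching "based on the given kwargs" can be sketched as follows. This is not PyKEEN's implementation: the names kwargs_digest, get_cached, and toy_loader are hypothetical, and the cache here lives in memory, which is an assumption made to keep the example self-contained.

```python
import hashlib
import json

_CACHE = {}


def kwargs_digest(**kwargs):
    """Build a deterministic key from keyword arguments."""
    # sort_keys makes the key independent of argument order
    payload = json.dumps(kwargs, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def get_cached(loader, **kwargs):
    """Call ``loader(**kwargs)`` once per distinct kwargs and reuse the result."""
    key = kwargs_digest(**kwargs)
    if key not in _CACHE:
        _CACHE[key] = loader(**kwargs)
    return _CACHE[key]


calls = []


def toy_loader(name, seed=0):
    """Stand-in for an expensive dataset load; records each real call."""
    calls.append((name, seed))
    return {"name": name, "seed": seed}


first = get_cached(toy_loader, name="nations", seed=1)
second = get_cached(toy_loader, seed=1, name="nations")  # same kwargs, other order
third = get_cached(toy_loader, name="nations", seed=2)   # different kwargs
```

Only two real loads happen here: the second call hits the cache because its kwargs hash to the same key as the first.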
Classes
The base dataset class.
The Aristo-v4 dataset from [chen2021].
The Hetionet dataset from [himmelstein2017].
The Kinships dataset.
The Nations dataset.
The OpenBioLink dataset.
The low-quality variant of the OpenBioLink dataset.
The CoDEx small dataset.
The CoDEx medium dataset.
The CoDEx large dataset.
The CN3l dataset family.
The OGB BioKG dataset.
The OGB WikiKG2 dataset.
The UMLS dataset.
The FB15k dataset.
The FB15k-237 dataset.
The WK3l-15k dataset family.
The WK3l-120k dataset family.
The WN18 dataset.
The WN18-RR dataset.
The YAGO3-10 dataset is a subset of YAGO3 that only contains entities with at least 10 relations.
The DRKG dataset.
The BioKG dataset from [walsh2020].
The ConceptNet dataset from [speer2017].
The Clinical Knowledge Graph (CKG) dataset from [santos2020].
The CSKG dataset.
The DBpedia50 dataset.
The DB100K dataset from [ding2018].
The OpenEA dataset family.
The Countries dataset.
The triples-only version of WD50K.
The Wikidata5M dataset from [wang2019].
The PharmKG8k dataset from [zheng2020].
The PharmKGFull dataset from [zheng2020].
The Precision Medicine Knowledge Graph (PrimeKG) dataset from [chandak2022].
The Global Biotic Interactions (GloBI) dataset.

pykeen.datasets.base Module
Utility classes for constructing datasets.
Functions
Calculate the similarity between two datasets.
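The similarity measure is not spelled out here. One plausible choice, shown purely as an assumption, is the Jaccard index over the two sets of (head, relation, tail) triples. The function name jaccard_similarity and the toy triples are made up for this sketch.

```python
def jaccard_similarity(triples_a, triples_b):
    """Return |A & B| / |A | B| for two collections of triples."""
    a, b = set(triples_a), set(triples_b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty datasets are identical
        return 1.0
    return len(a & b) / len(union)


dataset_a = [("a", "likes", "b"), ("b", "likes", "c"), ("c", "knows", "a")]
dataset_b = [("a", "likes", "b"), ("c", "knows", "a"), ("d", "knows", "a")]

# Two shared triples out of four distinct ones
similarity = jaccard_similarity(dataset_a, dataset_b)
print(similarity)
```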
Classes
The base dataset class.
A dataset whose training, testing, and optional validation factories are pre-loaded.
A dataset whose training, testing, and optional validation factories are lazily loaded.
Contains a lazy reference to a training, testing, and validation dataset.
Contains a lazy reference to a remote dataset that is loaded if needed.
A dataset with all three of train, test, and validation sets as URLs.
A remote dataset stored as a tar file.
Contains a lazy reference to a remote dataset that is loaded if needed.
Loads a dataset that's a single file inside an archive.
Loads a dataset that's a single file inside a tar.gz archive.
Loads a dataset that's a single file inside a zip archive.
This class is for when you've got a single TSV of edges and want them to get auto-split.
This class is for when you've got a single TSV of edges and want them to get auto-split.
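To make "a single TSV of edges" that gets "auto-split" more concrete, here is a minimal sketch of a seeded random split into training, testing, and validation parts. The 80/10/10 ratios, the function name split_triples, and the generated toy edges are all assumptions. A production split would typically also make sure that every entity and relation occurs in the training part, which this sketch does not attempt.

```python
import csv
import io
import random


def split_triples(triples, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle triples reproducibly and cut them into train/test/validation."""
    triples = list(triples)
    random.Random(seed).shuffle(triples)
    n_train = int(ratios[0] * len(triples))
    n_test = int(ratios[1] * len(triples))
    train = triples[:n_train]
    test = triples[n_train:n_train + n_test]
    valid = triples[n_train + n_test:]  # remainder goes to validation
    return train, test, valid


# Toy stand-in for a TSV file with head, relation, tail columns
tsv_text = "\n".join(f"e{i}\tr{i % 3}\te{i + 1}" for i in range(20))
rows = [tuple(row) for row in csv.reader(io.StringIO(tsv_text), delimiter="\t")]

train, test, valid = split_triples(rows)
print(len(train), len(test), len(valid))
```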

pykeen.datasets.analysis Module
Dataset analysis utilities.
Functions
Create a dataframe with relation counts.
Create a dataframe with entity counts.
Create a dataframe of entity/relation co-occurrence.
Calculate the functionality and inverse functionality score per relation.
Categorize relations based on patterns from RotatE [sun2019].
Determine the relation cardinality types.
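The functionality scores can be illustrated with one common definition, assumed here rather than taken from the text above: for a relation, functionality is the number of distinct heads divided by the number of distinct (head, tail) pairs, and inverse functionality uses distinct tails instead. The function name relation_functionality and the toy triples are hypothetical.

```python
from collections import defaultdict


def relation_functionality(triples):
    """Map each relation to (functionality, inverse functionality)."""
    heads = defaultdict(set)
    tails = defaultdict(set)
    pairs = defaultdict(set)
    for h, r, t in triples:
        heads[r].add(h)
        tails[r].add(t)
        pairs[r].add((h, t))  # a set, so duplicate triples count once
    return {
        r: (len(heads[r]) / len(pairs[r]), len(tails[r]) / len(pairs[r]))
        for r in pairs
    }


triples = [
    ("alice", "born_in", "paris"),
    ("bob", "born_in", "paris"),
    ("carol", "born_in", "rome"),
    ("alice", "parent_of", "dan"),
    ("alice", "parent_of", "eve"),
]

scores = relation_functionality(triples)
for relation, (func, inv_func) in sorted(scores.items()):
    print(relation, round(func, 3), round(inv_func, 3))
```

Under this definition, a functionality of 1.0 means every head has exactly one tail for that relation, as with born_in in the toy data.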
Inductive Datasets
pykeen.datasets.inductive Package
Inductive datasets in PyKEEN.
Classes
Contains transductive train and inductive inference/validation/test datasets.
An eager inductive dataset.
An inductive dataset that has lazy loading.
A disjoint inductive dataset specified by paths.
A dataset with all four of train, inductive_inference, inductive test, and inductive validation sets as URLs.
The inductive FB15k-237 dataset in 4 versions.
The inductive WN18RR dataset in 4 versions.
The inductive NELL dataset in 4 versions.
An inductive link prediction dataset for the ILPC 2022 Challenge.
An inductive link prediction dataset for the ILPC 2022 Challenge.
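Assuming "disjoint" refers to the entity sets of the transductive training graph and the inductive inference graph, a quick sanity check could look like the sketch below. The helper names and the toy triples are hypothetical. The check that relations are shared between the two graphs is a further assumption about the inductive setting.

```python
def entities(triples):
    """Collect every entity that occurs as a head or a tail."""
    return {h for h, _, _ in triples} | {t for _, _, t in triples}


def relations(triples):
    """Collect every relation type."""
    return {r for _, r, _ in triples}


transductive_training = [("a", "knows", "b"), ("b", "likes", "c")]
inductive_inference = [("x", "knows", "y"), ("y", "likes", "z")]

# Entities must not overlap between the two graphs
shared = entities(transductive_training) & entities(inductive_inference)

# Relations seen at inference time should already be known from training
unseen_relations = relations(inductive_inference) - relations(transductive_training)

print(shared, unseen_relations)
```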

Entity Alignment
pykeen.datasets.ea.combination Module
Combination strategies for entity alignment datasets.
Classes
A base class for combination of a graph pair into a single graph.
This combinator keeps both graphs as disconnected components.
Add extra triples by swapping aligned entities.
This combinator keeps all entities, but introduces a novel alignment relation.
This combinator merges all matching entity pairs into a single ID.
The result of processing a pair of triples factories.
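Two of the strategies described above can be sketched on label-based triples. These are illustrations under assumptions, not the library's classes: the function names, the "same_as" relation label, and the toy graphs are made up, and real implementations would operate on integer IDs rather than strings.

```python
def combine_with_alignment_relation(left, right, alignment, relation="same_as"):
    """Keep all entities and link aligned pairs through an extra relation."""
    return list(left) + list(right) + [(a, relation, b) for a, b in alignment]


def combine_by_collapsing(left, right, alignment):
    """Rewrite each aligned right-hand entity to its left-hand partner."""
    mapping = {b: a for a, b in alignment}
    rewritten = [(mapping.get(h, h), r, mapping.get(t, t)) for h, r, t in right]
    return list(left) + rewritten


left = [("L:berlin", "capital_of", "L:germany")]
right = [("R:berlin", "located_in", "R:europe")]
alignment = [("L:berlin", "R:berlin")]  # pairs of matching entities

linked = combine_with_alignment_relation(left, right, alignment)
collapsed = combine_by_collapsing(left, right, alignment)

print(linked)
print(collapsed)
```

The first variant grows the graph by one triple per aligned pair, while the second removes the duplicated entity so that both graphs share a node.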
