WikidataTextRepresentation
- class WikidataTextRepresentation(identifiers: Sequence[str], cache: TextCache | None = None, **kwargs)
Bases: CachedTextRepresentation
Textual representations for datasets grounded in Wikidata.
The label and description for each entity are obtained from Wikidata using WikidataTextCache and encoded with TextRepresentation.
Example usage:

    """Example for using WikidataTextRepresentation."""

    from pykeen.datasets import get_dataset
    from pykeen.models import ERModel
    from pykeen.nn import WikidataTextRepresentation
    from pykeen.pipeline import pipeline

    dataset = get_dataset(dataset="codexsmall")
    entity_representations = WikidataTextRepresentation.from_dataset(
        dataset=dataset,
        encoder="transformer",
    )
    result = pipeline(
        dataset=dataset,
        model=ERModel,
        model_kwargs={
            "interaction": "distmult",
            "entity_representations": entity_representations,
            "relation_representation_kwargs": {
                "shape": entity_representations.shape,
            },
        },
    )

Initialize the representation.
- Parameters:
identifiers (Sequence[str]) – the IDs to be resolved by the class, e.g., Wikidata IDs for WikidataTextRepresentation, or biomedical entities represented as compact URIs (CURIEs) for BiomedicalCURIERepresentation
cache (TextCache | None) – a pre-instantiated text cache. If None, cache_cls is used to instantiate one.
kwargs – additional keyword-based parameters passed to TextRepresentation.__init__()