TextRepresentation
- class TextRepresentation(labels: Sequence[str | None], max_id: int | None = None, shape: int | Sequence[int] | None = None, encoder: str | TextEncoder | type[TextEncoder] | None = None, encoder_kwargs: Mapping[str, Any] | None = None, missing_action: Literal['blank', 'error'] = 'error', **kwargs: Any)
Bases: Representation
Textual representations using a text encoder on labels.
Example Usage:
Entity representations are obtained by encoding the labels with a Transformer model. The transformer model becomes part of the KGE model, and its parameters are trained jointly.
    from pykeen.datasets import get_dataset
    from pykeen.nn.representation import TextRepresentation
    from pykeen.models import ERModel

    dataset = get_dataset(dataset="nations")
    # encode the entity labels with a Transformer-based text encoder
    entity_representations = TextRepresentation.from_dataset(
        dataset=dataset,
        encoder="transformer",
    )
    # relation representations use the same shape as the entity representations
    model = ERModel(
        interaction="ermlp",
        entity_representations=entity_representations,
        relation_representations_kwargs=dict(shape=entity_representations.shape),
    )
Initialize the representation.
- Parameters:
labels (Sequence[str | None]) – an ordered, finite collection of labels
max_id (int | None) – the number of representations. If provided, has to match the number of labels
shape (OneOrSequence[int] | None) – The shape of an individual representation.
encoder (HintOrType[TextEncoder]) – the text encoder, or a hint thereof. This can be one of:
- 'characterembedding' for pykeen.nn.text.CharacterEmbeddingTextEncoder
- 'transformer' for pykeen.nn.text.TransformerTextEncoder
- or any other loaded via pykeen.nn.text.text_encoder_resolver
encoder_kwargs (OptionalKwargs) – keyword-based parameters used to instantiate the text encoder
missing_action (Literal['blank', 'error']) – the policy for handling None values in the given labels. If 'error', raises an error if any label is None. If 'blank', replaces each None with an empty string.
kwargs (Any) – additional keyword-based parameters passed to Representation.__init__()
- Raises:
ValueError – if max_id is provided and does not match the number of labels
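The argument checks described above can be sketched in plain Python. This is only an illustration of the documented behavior, not PyKEEN's implementation: the helper name prepare_labels is hypothetical, and raising ValueError for None labels is an assumption, since this page does not name the exception type for that case.

```python
from __future__ import annotations

from typing import Literal, Optional, Sequence


def prepare_labels(
    labels: Sequence[Optional[str]],
    max_id: Optional[int] = None,
    missing_action: Literal["blank", "error"] = "error",
) -> list[str]:
    """Hypothetical helper mirroring the documented checks on labels and max_id."""
    # max_id is optional, but if it is given it has to equal the number of labels
    if max_id is not None and max_id != len(labels):
        raise ValueError(f"max_id={max_id} does not match the {len(labels)} labels")
    if missing_action == "error":
        # assumption: the exception type for missing labels is not stated in the docs
        if any(label is None for label in labels):
            raise ValueError("labels contain None and missing_action='error'")
    elif missing_action != "blank":
        raise ValueError(f"unknown missing_action: {missing_action!r}")
    # with 'blank', every missing label becomes an empty string
    return ["" if label is None else label for label in labels]


cleaned = prepare_labels(["brazil", None, "china"], max_id=3, missing_action="blank")
```

With missing_action='blank' the missing label is encoded as an empty string; with the default 'error' the same input is rejected before any encoding takes place.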
Note
The parameter pair (encoder, encoder_kwargs) is used for pykeen.nn.text.text_encoder_resolver. An explanation of resolvers and how to use them is given in https://class-resolver.readthedocs.io/en/latest/.
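To make the (encoder, encoder_kwargs) pair more concrete, here is a stripped-down sketch of the resolver idea: a hint may be a string, a class, or a ready-made instance, and the keyword arguments are only used when a new instance has to be created. The classes below are stand-ins defined for this example, not the real PyKEEN encoders; the name make_encoder and the suffix-stripping normalization are assumptions about how such a lookup could work, not the class-resolver implementation. The None case (library default) is left out because this page does not state what the default is.

```python
from __future__ import annotations

from typing import Any, Mapping, Optional, Union


class TextEncoder:
    """Stand-in base class; the real one lives in pykeen.nn.text."""

    def __init__(self, **kwargs: Any) -> None:
        self.kwargs = kwargs


class CharacterEmbeddingTextEncoder(TextEncoder):
    """Stand-in for the character-embedding encoder."""


class TransformerTextEncoder(TextEncoder):
    """Stand-in for the transformer encoder."""


SUFFIX = "textencoder"


def _normalize(name: str) -> str:
    # lowercase and drop the common class-name suffix, e.g.
    # "TransformerTextEncoder" -> "transformer"
    key = name.lower()
    return key[: -len(SUFFIX)] if key.endswith(SUFFIX) else key


LOOKUP = {
    _normalize(cls.__name__): cls
    for cls in (CharacterEmbeddingTextEncoder, TransformerTextEncoder)
}


def make_encoder(
    encoder: Union[str, TextEncoder, type],
    encoder_kwargs: Optional[Mapping[str, Any]] = None,
) -> TextEncoder:
    """Hypothetical resolver: turn a hint plus kwargs into an encoder instance."""
    if isinstance(encoder, TextEncoder):
        return encoder  # already an instance, kwargs are not needed
    cls = LOOKUP[_normalize(encoder)] if isinstance(encoder, str) else encoder
    return cls(**dict(encoder_kwargs or {}))


by_name = make_encoder("transformer", {"max_length": 8})
by_class = make_encoder(CharacterEmbeddingTextEncoder)
```

The keyword max_length is used here purely as an example value; which keywords a real encoder accepts is defined by that encoder's own constructor.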
Methods Summary
from_dataset(dataset, **kwargs): Prepare a text representation with labels from a dataset.
from_triples_factory(triples_factory[, ...]): Prepare a text representation with labels from a triples factory.
Methods Documentation
- classmethod from_dataset(dataset: Dataset, **kwargs) -> TextRepresentation
Prepare a text representation with labels from a dataset.
- Parameters:
dataset (Dataset) – the dataset
kwargs – additional keyword-based parameters passed to TextRepresentation.from_triples_factory()
- Returns:
a text representation from the dataset
- Raises:
TypeError – if the dataset’s triples factory does not provide labels
- Return type:
TextRepresentation
- classmethod from_triples_factory(triples_factory: TriplesFactory, for_entities: bool = True, **kwargs) -> TextRepresentation
Prepare a text representation with labels from a triples factory.
- Parameters:
triples_factory (TriplesFactory) – the triples factory
for_entities (bool) – whether to create the representation for entities (True) or for relations (False)
kwargs – additional keyword-based arguments passed to TextRepresentation.__init__()
- Returns:
a text representation from the triples factory
- Return type:
TextRepresentation