TextRepresentation

class TextRepresentation(labels: Sequence[str | None], max_id: int | None = None, shape: int | Sequence[int] | None = None, encoder: str | TextEncoder | type[TextEncoder] | None = None, encoder_kwargs: Mapping[str, Any] | None = None, missing_action: Literal['blank', 'error'] = 'error', **kwargs: Any)

Bases: Representation

Textual representations using a text encoder on labels.

Example Usage:

Entity representations are obtained by encoding the labels with a Transformer model. The transformer model becomes part of the KGE model, and its parameters are trained jointly.

from pykeen.datasets import get_dataset
from pykeen.nn.representation import TextRepresentation
from pykeen.models import ERModel

dataset = get_dataset(dataset="nations")
entity_representations = TextRepresentation.from_dataset(
    dataset=dataset,
    encoder="transformer",
)
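model = ERModel(
    # ERModel needs the training triples factory
    triples_factory=dataset.training,
    interaction="ermlp",
    # ER-MLP has to know the dimension of the representations it combines
    interaction_kwargs=dict(embedding_dim=entity_representations.shape[0]),
    entity_representations=entity_representations,
    relation_representations_kwargs=dict(shape=entity_representations.shape),
)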

Initialize the representation.

Parameters:
  • labels (Sequence[str | None]) – an ordered, finite collection of labels

  • max_id (int | None) – the number of representations. If provided, it has to match the number of labels.

  • shape (OneOrSequence[int] | None) – The shape of an individual representation.

  • encoder (HintOrType[TextEncoder]) –

    the text encoder, or a hint thereof. This can be one of:

  • encoder_kwargs (OptionalKwargs) – keyword-based parameters used to instantiate the text encoder

  • missing_action (Literal['blank', 'error']) – the policy for handling None values in the given labels. If “error”, an error is raised as soon as any label is None. If “blank”, each None is replaced with an empty string.

  • kwargs (Any) – additional keyword-based parameters passed to Representation.__init__()

Raises:

ValueError – if max_id is provided and does not match the number of labels
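
The label checks described above can be sketched in plain Python. This is only an illustration of the documented behaviour, not PyKEEN's implementation: the helper name prepare_labels is hypothetical, and raising ValueError for a None label under missing_action="error" is an assumption, since the documentation only says that an error is raised.

```python
from typing import Optional, Sequence


def prepare_labels(
    labels: Sequence[Optional[str]],
    max_id: Optional[int] = None,
    missing_action: str = "error",
) -> list:
    """Hypothetical helper mirroring the documented checks (not PyKEEN code)."""
    # max_id is optional, but if given it has to equal the number of labels
    if max_id is not None and max_id != len(labels):
        raise ValueError(f"max_id={max_id} does not match {len(labels)} labels")
    if missing_action == "blank":
        # replace every missing label with an empty string
        return [label if label is not None else "" for label in labels]
    if missing_action == "error":
        # assumption: the exact exception type is not stated in the docs
        if any(label is None for label in labels):
            raise ValueError("labels contain None and missing_action='error'")
        return list(labels)
    raise ValueError(f"unknown missing_action: {missing_action!r}")


print(prepare_labels(["Brazil", None, "Cuba"], missing_action="blank"))
# ['Brazil', '', 'Cuba']
```

With the default missing_action="error", the same call would fail instead of silently encoding an empty string, which is usually the safer choice when labels are expected to be complete.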

Note

The parameter pair (encoder, encoder_kwargs) is passed to text_encoder_resolver.

An explanation of resolvers and how to use them is given in https://class-resolver.readthedocs.io/en/latest/.
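
To make the (encoder, encoder_kwargs) pair more concrete, here is a deliberately simplified, standard-library-only sketch of the resolver pattern. It is not the class-resolver implementation: the two encoder classes, the REGISTRY keys, the default, and make_encoder are all hypothetical stand-ins. It only illustrates the four kinds of hint that the encoder parameter accepts according to the signature: None, a string, a class, or an instance.

```python
from typing import Any, Mapping, Optional


class CharacterEncoder:
    """Hypothetical stand-in for a text encoder class."""

    def __init__(self, dim: int = 32) -> None:
        self.dim = dim


class TransformerEncoder:
    """Hypothetical stand-in for a second text encoder class."""

    def __init__(self, max_length: int = 512) -> None:
        self.max_length = max_length


# hypothetical registry: string hint -> class
REGISTRY = {"character": CharacterEncoder, "transformer": TransformerEncoder}
DEFAULT = CharacterEncoder  # assumption: which encoder is the default


def make_encoder(encoder: Any = None, encoder_kwargs: Optional[Mapping[str, Any]] = None):
    """Turn a hint plus keyword arguments into an encoder instance."""
    if encoder is None:
        cls = DEFAULT  # no hint: fall back to the default class
    elif isinstance(encoder, str):
        cls = REGISTRY[encoder.lower()]  # string hint: look the class up by name
    elif isinstance(encoder, type):
        cls = encoder  # class hint: use it directly
    else:
        return encoder  # already an instance: encoder_kwargs are not needed
    return cls(**(encoder_kwargs or {}))


enc = make_encoder("transformer", {"max_length": 64})
print(type(enc).__name__, enc.max_length)
# TransformerEncoder 64
```

The benefit of this pattern is that callers can pass a short string in configuration files while still being free to hand over a fully constructed encoder in code.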

Methods Summary

from_dataset(dataset, **kwargs)

Prepare a text representation with labels from a dataset.

from_triples_factory(triples_factory[, ...])

Prepare a text representation with labels from a triples factory.

Methods Documentation

classmethod from_dataset(dataset: Dataset, **kwargs) → TextRepresentation

Prepare a text representation with labels from a dataset.

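Parameters:
  • dataset (Dataset) – the dataset

  • kwargs – additional keyword-based parameters passed to TextRepresentation.from_triples_factory()

Returns: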

a text representation from the dataset

Raises:

TypeError – if the dataset’s triples factory does not provide labels

Return type:

TextRepresentation

classmethod from_triples_factory(triples_factory: TriplesFactory, for_entities: bool = True, **kwargs) → TextRepresentation

Prepare a text representation with labels from a triples factory.

Parameters:
  • triples_factory (TriplesFactory) – the triples factory

  • for_entities (bool) – whether to create the representation for entities (True) or for relations (False)

  • kwargs – additional keyword-based arguments passed to TextRepresentation.__init__()

Returns:

a text representation from the triples factory

Return type:

TextRepresentation
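
As a rough illustration of what the for_entities switch selects, the sketch below uses plain dictionaries in place of a triples factory's ID-to-label mappings. The function select_labels is hypothetical and is not how PyKEEN is implemented; it only shows the idea that labels are taken in ID order, so that position i in the label sequence corresponds to ID i.

```python
from typing import Mapping


def select_labels(
    entity_id_to_label: Mapping[int, str],
    relation_id_to_label: Mapping[int, str],
    for_entities: bool = True,
) -> list:
    """Hypothetical helper: pick entity or relation labels, ordered by ID."""
    id_to_label = entity_id_to_label if for_entities else relation_id_to_label
    # IDs are assumed to be the consecutive integers 0, ..., n - 1
    return [id_to_label[i] for i in range(len(id_to_label))]


# toy mappings standing in for those of a triples factory
entities = {1: "cuba", 0: "brazil", 2: "egypt"}
relations = {0: "embassy", 1: "exports"}

print(select_labels(entities, relations))
# ['brazil', 'cuba', 'egypt']
print(select_labels(entities, relations, for_entities=False))
# ['embassy', 'exports']
```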