BackfillRepresentation

class BackfillRepresentation(max_id, base_ids, base=None, base_kwargs=None, backfill=None, backfill_kwargs=None, **kwargs)

Bases: PartitionRepresentation

A variant of a partition representation that is easily applicable to a single base representation.

Similarly to the PartitionRepresentation example, we start by creating the representation for those entities for which we have labels:

>>> from pykeen.nn import Embedding, init
>>> num_entities = 5
>>> labels = {1: "a first description", 4: "a second description"}
>>> label_initializer = init.LabelBasedInitializer(labels=list(labels.values()))
>>> shape = label_initializer.tensor.shape[1:]
>>> label_repr = Embedding(max_id=len(labels), shape=shape, initializer=label_initializer, trainable=False)

Next, we directly create representations for the remaining entities using the backfill representation. To do this, we need to create an iterable (e.g., a set) of all entity IDs that are covered by the base representation. The assignments to the base representation and an auxiliary representation are then generated automatically for the base class:

>>> from pykeen.nn import BackfillRepresentation
>>> entity_repr = BackfillRepresentation(base_ids=set(labels), max_id=num_entities, base=label_repr)
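To illustrate what the automatically generated assignments could look like, here is a minimal pure-Python sketch. It is not PyKEEN's implementation: the helper name `build_assignment` is hypothetical, and the sketch assumes that the base IDs are sorted before local indices are handed out and that all remaining IDs are numbered in increasing order for the backfill representation.

```python
from typing import Iterable, List, Tuple


def build_assignment(max_id: int, base_ids: Iterable[int]) -> List[Tuple[int, int]]:
    """Map each global ID to (representation index, local index).

    Representation index 0 stands for the base representation and 1 for the
    backfill representation (an assumption made for this sketch).
    """
    base_sorted = sorted(set(base_ids))
    if base_sorted and (base_sorted[0] < 0 or base_sorted[-1] >= max_id):
        raise ValueError("base_ids must lie in the range [0, max_id)")

    # Local index of each base ID inside the base representation.
    base_local = {global_id: i for i, global_id in enumerate(base_sorted)}

    assignment: List[Tuple[int, int]] = []
    next_backfill = 0
    for global_id in range(max_id):
        if global_id in base_local:
            assignment.append((0, base_local[global_id]))
        else:
            # Every ID not covered by the base goes to the backfill.
            assignment.append((1, next_backfill))
            next_backfill += 1
    return assignment


# Same setting as the example above: 5 entities, labels for IDs 1 and 4.
assignment = build_assignment(max_id=5, base_ids={1, 4})
for global_id, (repr_index, local_index) in enumerate(assignment):
    print(global_id, repr_index, local_index)
```

Under these assumptions, entities 1 and 4 are served by the label-based representation, while entities 0, 2, and 3 receive the three rows of the backfill representation.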

For brevity, we use here randomly generated triples factories instead of the actual data:

>>> from pykeen.triples.generation import generate_triples_factory
>>> training = generate_triples_factory(num_entities=num_entities, num_relations=5, num_triples=31)
>>> testing = generate_triples_factory(num_entities=num_entities, num_relations=5, num_triples=17)

The combined representation can now be used as any other representation, e.g., to train a DistMult model:

>>> from pykeen.pipeline import pipeline
>>> from pykeen.models import ERModel
>>> pipeline(
...     model=ERModel,
...     interaction="distmult",
...     model_kwargs=dict(
...         entity_representation=entity_repr,
...         relation_representation_kwargs=dict(shape=shape),
...     ),
...     training=training,
...     testing=testing,
... )

Initialize the representation.

Parameters:
  • max_id (int) – The total number of entities that need to be embedded

  • base_ids (Iterable[int]) – An iterable of the integer entity indices that are provided by the base representation

  • base (Union[str, Representation, Type[Representation], None]) – the base representation, or a hint thereof.

  • base_kwargs (Optional[Mapping[str, Any]]) – keyword-based parameters to instantiate the base representation

  • backfill (Union[str, Representation, Type[Representation], None]) – the backfill representation, or a hint thereof.

  • backfill_kwargs (Optional[Mapping[str, Any]]) – keyword-based parameters to instantiate the backfill representation

  • kwargs – additional keyword-based parameters passed to Representation.__init__(). Must not contain max_id or shape, which are inferred from the base representations.
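The relationship between these parameters can be sketched in plain Python. This is an illustration only: the helper `infer_sizes` is hypothetical, and the sketch assumes that the base representation covers exactly the distinct IDs in `base_ids` while the backfill representation covers all remaining IDs.

```python
from typing import Any, Iterable, Mapping, Optional, Tuple


def infer_sizes(
    max_id: int,
    base_ids: Iterable[int],
    extra_kwargs: Optional[Mapping[str, Any]] = None,
) -> Tuple[int, int]:
    """Return (size of base representation, size of backfill representation)."""
    # max_id and shape are inferred, so they must not be passed on explicitly.
    forbidden = {"max_id", "shape"} & set(extra_kwargs or {})
    if forbidden:
        raise ValueError(f"must not be given explicitly: {sorted(forbidden)}")

    num_base = len(set(base_ids))
    if num_base > max_id:
        raise ValueError("more base IDs than entities to embed")

    # Everything not covered by the base is left to the backfill.
    return num_base, max_id - num_base


num_base, num_backfill = infer_sizes(max_id=5, base_ids={1, 4})
print(num_base, num_backfill)
```

With five entities and labels for two of them, the sketch yields a base representation of size 2 and a backfill representation of size 3.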