Embedding

class Embedding(max_id: int | None = None, num_embeddings: int | None = None, embedding_dim: int | None = None, shape: None | int | Sequence[int] = None, initializer: str | Callable[[Tensor], Tensor] | None = None, initializer_kwargs: Mapping[str, Any] | None = None, constrainer: str | Callable[[Tensor], Tensor] | None = None, constrainer_kwargs: Mapping[str, Any] | None = None, trainable: bool = True, dtype: dtype | None = None, **kwargs)

Bases: Representation

Trainable embeddings.

This class provides the same interface as torch.nn.Embedding and can be used throughout PyKEEN as a more complete drop-in replacement.

It extends torch.nn.Embedding with options to normalize, constrain, or apply dropout.

Note

A discussion about the differences between normalizers and constrainers can be found in Normalizer, Constrainer & Regularizer.

The optional dropout can serve as a regularization technique. It also enables uncertainty estimation with techniques such as Monte-Carlo dropout. The following example obtains different scores for a single triple from an untrained model; these scores can be viewed as samples from a distribution over the scores.

"""Monte-Carlo uncertainty estimation with embedding dropout."""

import torch

from pykeen.datasets import Nations
from pykeen.models import ERModel
from pykeen.typing import FloatTensor

dataset = Nations()
model: ERModel[FloatTensor, FloatTensor, FloatTensor] = ERModel(
    triples_factory=dataset.training,
    interaction="distmult",
    entity_representations_kwargs=dict(embedding_dim=3, dropout=0.1),
    relation_representations_kwargs=dict(embedding_dim=3, dropout=0.1),
)
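# Repeat the same (head, relation, tail) triple 10 times.
batch = torch.as_tensor(data=[[0, 1, 0]]).repeat(10, 1)
# A freshly constructed module is in training mode, so dropout is active
# and the 10 scores can differ from one another.
scores = model.score_hrt(batch)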
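To make the Monte-Carlo idea concrete without relying on PyKEEN internals, the following plain-PyTorch sketch applies dropout to embedding lookups and scores one triple repeatedly with a DistMult-style product. The module names, sizes, and seed are illustrative assumptions, and the code is not PyKEEN's implementation.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-ins for entity and relation representations.
entities = nn.Embedding(num_embeddings=5, embedding_dim=3)
relations = nn.Embedding(num_embeddings=4, embedding_dim=3)
dropout = nn.Dropout(p=0.1)


def score(batch: torch.Tensor) -> torch.Tensor:
    """DistMult-style score: sum over the elementwise product of h, r, t."""
    h = dropout(entities(batch[:, 0]))
    r = dropout(relations(batch[:, 1]))
    t = dropout(entities(batch[:, 2]))
    return (h * r * t).sum(dim=-1)


# The same (head, relation, tail) triple, repeated 10 times.
batch = torch.as_tensor([[0, 1, 0]]).repeat(10, 1)

# Training mode: a separate dropout mask is sampled for every row, so the
# ten scores are samples that can be summarized by mean and spread.
dropout.train()
mc_scores = score(batch)
mc_mean, mc_std = mc_scores.mean(), mc_scores.std()

# Evaluation mode: dropout is the identity, so all ten scores coincide.
dropout.eval()
eval_scores = score(batch)
```

The standard deviation of the training-mode scores gives a rough uncertainty estimate for the triple, while the evaluation-mode scores collapse to a single value.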

Instantiate an embedding with extended functionality.

Parameters:
  • max_id (int | None) – The number of embeddings; must be greater than 0, cf. Representation.

  • num_embeddings (int | None) –

    The number of embeddings; must be greater than 0.

    Note

    This argument is kept for backwards compatibility. New code should use max_id instead.

  • embedding_dim (int | None) – The embedding dimensionality; must be greater than 0.

  • shape (None | int | Sequence[int]) –

    The shape of an individual representation, cf. Representation.

    Note

    You must pass exactly one of embedding_dim and shape. shape is generally preferred because it is the more generic parameter, also used in Representation, but the term embedding_dim is so ubiquitous that it is available as well.

  • initializer (Hint[Initializer]) – An optional initializer, which takes an uninitialized (max_id, *shape) tensor as input and returns an initialized tensor of the same shape and dtype. The returned tensor may be the input tensor itself, i.e. the initialization may be in-place. Can be passed as a function or as a string, cf. resolver note.

  • initializer_kwargs (Mapping[str, Any] | None) – Additional keyword arguments passed to the initializer.

  • constrainer (str | Callable[[Tensor], Tensor] | None) – A function applied to the weights after each parameter update, without tracking gradients. It can be used to enforce model constraints outside gradient-based training. The function itself does not need to operate in-place, but the weight tensor is modified in-place. Can be passed as a function or as a string, cf. resolver note.

  • constrainer_kwargs (Mapping[str, Any] | None) – Additional keyword arguments passed to the constrainer.

  • trainable (bool) – Whether the wrapped embeddings should require gradients.

  • dtype (torch.dtype | None) – The data type. If None, the default returned by torch.get_default_dtype() is used.

  • kwargs – Additional keyword-based parameters passed to Representation.
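The initializer contract described above (take an uninitialized (max_id, *shape) tensor, return an initialized tensor of the same shape and dtype, possibly in-place) can be sketched in plain PyTorch. The helper name scaled_normal_ and the sizes are hypothetical, and the sketch does not show how PyKEEN invokes the initializer internally.

```python
import torch


def scaled_normal_(x: torch.Tensor, std: float = 0.1) -> torch.Tensor:
    """Hypothetical initializer: fill x in-place and return it."""
    return torch.nn.init.normal_(x, mean=0.0, std=std)


max_id, shape = 5, (2, 3)  # illustrative values

# Mirrors the documented dtype fallback to torch.get_default_dtype().
uninitialized = torch.empty(max_id, *shape, dtype=torch.get_default_dtype())

# The extra keyword argument plays the role of an initializer_kwargs entry.
initialized = scaled_normal_(uninitialized, std=0.05)
```

Because the initialization is in-place, initialized and uninitialized share the same storage; an initializer that returns a fresh tensor of the same shape and dtype would satisfy the contract equally well.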

Note

Two resolvers are used in this function.

An explanation of resolvers and how to use them is given in https://class-resolver.readthedocs.io/en/latest/.
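As a rough illustration of what resolving a string-or-callable hint means, the sketch below maps a name or a function, plus optional keyword arguments, to a one-argument callable. The lookup table and the function resolve_initializer are hypothetical and much simpler than class-resolver itself; the names "normal" and "uniform" are assumptions for this sketch, not necessarily names that PyKEEN registers.

```python
from functools import partial
from typing import Any, Callable, Mapping, Optional, Union

import torch

Initializer = Callable[[torch.Tensor], torch.Tensor]

# Hypothetical lookup table; class-resolver maintains its own registry.
_INITIALIZERS = {
    "normal": torch.nn.init.normal_,
    "uniform": torch.nn.init.uniform_,
}


def resolve_initializer(
    hint: Union[str, Callable[..., torch.Tensor]],
    kwargs: Optional[Mapping[str, Any]] = None,
) -> Initializer:
    """Turn a name or function plus keyword arguments into a one-argument callable."""
    func = _INITIALIZERS[hint] if isinstance(hint, str) else hint
    return partial(func, **dict(kwargs or {}))


# Resolve by name, binding keyword arguments.
by_name = resolve_initializer("uniform", {"a": -0.5, "b": 0.5})
# Resolve by passing the function directly.
by_function = resolve_initializer(torch.nn.init.zeros_)

x = by_name(torch.empty(4, 3))
z = by_function(torch.empty(4, 3))
```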

Methods Summary

from_pretrained(tensor, *[, trainable])

Construct an embedding from a pre-trained tensor.

post_parameter_update()

Apply constraints which should not be included in gradients.

reset_parameters()

Reset the module's parameters.

Methods Documentation

classmethod from_pretrained(tensor: Tensor | PretrainedInitializer, *, trainable: bool = False, **kwargs: Any) → Self

Construct an embedding from a pre-trained tensor.

Parameters:
  • tensor (Tensor | PretrainedInitializer) – The tensor of pretrained embeddings, or a pretrained initializer that wraps a tensor.

  • trainable (bool) – Whether the embedding should be trainable. Defaults to False, since this constructor is typically used to create a static embedding.

  • kwargs (Any) – Remaining keyword arguments to pass to the pykeen.nn.Embedding constructor.

Returns:

An embedding representation

Return type:

Self
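Since this class mirrors torch.nn.Embedding, the notion of a static pretrained embedding can be illustrated with the plain PyTorch counterpart. This is an analogy under the assumption that trainable=False plays roughly the role of PyTorch's freeze=True; the sketch does not call pykeen.nn.Embedding.from_pretrained itself, and the tensor values are made up.

```python
import torch
from torch import nn

# Illustrative "pretrained" vectors: 4 embeddings of dimension 3.
pretrained = torch.arange(12, dtype=torch.float32).reshape(4, 3)

# freeze=True keeps the weights fixed, comparable to trainable=False.
static = nn.Embedding.from_pretrained(pretrained, freeze=True)

# freeze=False permits fine-tuning, comparable to trainable=True.
tunable = nn.Embedding.from_pretrained(pretrained, freeze=False)

# Looking up index 2 returns the third pretrained row.
row = static(torch.as_tensor([2]))
```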

post_parameter_update()

Apply constraints which should not be included in gradients.
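A minimal sketch of the pattern this method supports, written in plain PyTorch: after each optimizer step, a constraint is applied outside of autograd and the result is written back into the existing weight tensor. The constrainer function, the dummy objective, and the explicit loop are illustrative assumptions; a PyKEEN training loop would be expected to call post_parameter_update() at the corresponding point.

```python
import torch
from torch import nn
from torch.nn import functional

torch.manual_seed(0)

embedding = nn.Embedding(num_embeddings=5, embedding_dim=3)
optimizer = torch.optim.SGD(embedding.parameters(), lr=0.1)


def constrainer(x: torch.Tensor) -> torch.Tensor:
    """Illustrative constrainer: rescale each row to unit L2 norm (not in-place)."""
    return functional.normalize(x, p=2, dim=-1)


for _ in range(3):
    optimizer.zero_grad()
    loss = embedding(torch.as_tensor([0, 1, 2])).sum()  # dummy objective
    loss.backward()
    optimizer.step()
    # Hypothetical post-update step: apply the constraint without tracking
    # gradients and copy the result into the existing weight tensor.
    with torch.no_grad():
        embedding.weight.copy_(constrainer(embedding.weight))

norms = embedding.weight.norm(dim=-1)
```

Keeping the constraint out of the autograd graph means the optimizer only sees gradients of the loss, while the weights are projected back onto the constraint set after every step.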

reset_parameters() → None

Reset the module’s parameters.

Return type:

None