Representation
Embedding modules.
class Embedding(num_embeddings, embedding_dim=None, shape=None, initializer=None, initializer_kwargs=None, normalizer=None, normalizer_kwargs=None, constrainer=None, constrainer_kwargs=None, regularizer=None, trainable=True, dtype=None)

Trainable embeddings. This class provides the same interface as torch.nn.Embedding and can be used throughout PyKEEN as a more fully featured drop-in replacement.

Instantiate an embedding with extended functionality.
- Parameters
  - num_embeddings (int) – > 0. The number of embeddings.
  - embedding_dim (Optional[int]) – > 0. The embedding dimensionality.
  - initializer (Union[None, str, Callable[[FloatTensor], FloatTensor]]) – An optional initializer, which takes an uninitialized (num_embeddings, embedding_dim) tensor as input and returns an initialized tensor of the same shape and dtype (which may be the same tensor, i.e. the initialization may be in-place).
  - initializer_kwargs (Optional[Mapping[str, Any]]) – Additional keyword arguments passed to the initializer.
  - normalizer (Union[None, str, Callable[[FloatTensor], FloatTensor]]) – A normalization function, which is applied in every forward pass.
  - normalizer_kwargs (Optional[Mapping[str, Any]]) – Additional keyword arguments passed to the normalizer.
  - constrainer (Union[None, str, Callable[[FloatTensor], FloatTensor]]) – A function which is applied to the weights after each parameter update, without tracking gradients. It may be used to enforce model constraints outside of gradient-based training. The function does not need to operate in-place, but the weight tensor is modified in-place.
  - constrainer_kwargs (Optional[Mapping[str, Any]]) – Additional keyword arguments passed to the constrainer.
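A minimal usage sketch (assuming PyKEEN 1.x, where this class is importable from pykeen.nn; the initializer and constrainer shown are plain PyTorch functions chosen for illustration, not defaults of this class):

import torch
from pykeen.nn import Embedding

# A 10 x 20 embedding: Xavier-initialized, and re-normalized to unit
# length after every parameter update via the constrainer.
emb = Embedding(
    num_embeddings=10,
    embedding_dim=20,
    initializer=torch.nn.init.xavier_uniform_,  # in-place initializer
    constrainer=torch.nn.functional.normalize,  # returns a new tensor; the weight is updated in-place
)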
forward(indices=None)

Get representations for indices.
- Parameters
  indices (Optional[LongTensor]) – shape: s. The indices, or None. If None, this is interpreted as torch.arange(self.max_id) (although implemented more efficiently).
- Return type
  FloatTensor
- Returns
  shape: (*s, *self.shape). The representations.
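Continuing the sketch above, the index shape becomes the prefix of the result shape:

x = emb(indices=torch.as_tensor([1, 3, 5]))   # shape: (3, 20)
x2 = emb(indices=torch.arange(6).view(2, 3))  # shape: (2, 3, 20)
x_all = emb(indices=None)                     # shape: (10, 20), i.e. (max_id, *shape)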
classmethod init_with_device(num_embeddings, embedding_dim, device, initializer=None, initializer_kwargs=None, normalizer=None, normalizer_kwargs=None, constrainer=None, constrainer_kwargs=None)

Create an embedding object on the given device by wrapping __init__(). This method is a hotfix for not being able to pass a device during the initialization of torch.nn.Embedding: the weight is always initialized on CPU and has to be moved to the GPU afterwards.

- Return type
  Embedding
- Returns
  The embedding.
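A sketch of how this might be called (assumes a CUDA device is available):

emb_gpu = Embedding.init_with_device(
    num_embeddings=10,
    embedding_dim=20,
    device=torch.device("cuda"),  # hypothetical target device
)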
class EmbeddingSpecification(embedding_dim=None, shape=None, initializer=None, initializer_kwargs=None, normalizer=None, normalizer_kwargs=None, constrainer=None, constrainer_kwargs=None, regularizer=None, dtype=None)

An embedding specification.
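A specification defers the number of embeddings, so the same specification can later be turned into concrete Embedding instances; the make() call below mirrors the doctest of get_in_more_canonical_shape() further down:

from pykeen.nn import EmbeddingSpecification

spec = EmbeddingSpecification(embedding_dim=20)
emb = spec.make(num_embeddings=10)  # an Embedding with weights of shape (10, 20)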
class RepresentationModule(max_id, shape)

A base class for obtaining representations for entities/relations. A representation module maps integer IDs to representations, which are tensors of floats.

max_id defines the upper bound of indices we are allowed to request (exclusive). For simple embeddings this is equivalent to num_embeddings, but it is a more appropriate term for general non-embedding representations, where the representations could come from somewhere else, e.g. a GNN encoder.

shape describes the shape of a single representation. In the case of a vector embedding, this is just a single dimension. For others, e.g. pykeen.models.RESCAL, we have 2-d representations, and in general it can be any fixed shape.

We can look at all representations as a tensor of shape (max_id, *shape); this is exactly the result of passing indices=None to the forward method. We can also pass multi-dimensional indices to the forward method, in which case the indices’ shape becomes the prefix of the result shape: (*indices.shape, *self.shape).

Initialize the representation module.
- Parameters
  - max_id (int) – The upper bound (exclusive) of indices that may be requested, i.e. the number of representations.
  - shape (Sequence[int]) – The shape of a single representation.
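To illustrate the contract, here is a hypothetical minimal subclass (the name ConstantRepresentation and its implementation are illustrative only; this assumes RepresentationModule is a torch.nn.Module that stores max_id and shape as attributes, as the docstrings above suggest):

import torch
from pykeen.nn import RepresentationModule

class ConstantRepresentation(RepresentationModule):
    """Hypothetical example: every ID maps to one shared learned representation."""

    def __init__(self, max_id: int, shape):
        super().__init__(max_id=max_id, shape=shape)
        self.value = torch.nn.Parameter(torch.randn(*shape))

    def forward(self, indices=None):
        # Honor the contract: (max_id, *shape) for indices=None,
        # and (*indices.shape, *shape) otherwise.
        prefix = (self.max_id,) if indices is None else tuple(indices.shape)
        return self.value.expand(*prefix, *self.shape)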
property embedding_dim

Return the “embedding dimension”. Kept for backward compatibility.

- Return type
  int
abstract forward(indices=None)

Get representations for indices.

- Parameters
  indices (Optional[LongTensor]) – shape: s. The indices, or None. If None, this is interpreted as torch.arange(self.max_id) (although implemented more efficiently).
- Return type
  FloatTensor
- Returns
  shape: (*s, *self.shape). The representations.
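With the hypothetical ConstantRepresentation sketched above, the contract plays out as:

rep = ConstantRepresentation(max_id=10, shape=(20,))
rep(indices=None).shape                        # (10, 20)
rep(indices=torch.arange(6).view(2, 3)).shape  # (2, 3, 20)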
get_in_canonical_shape(indices=None)

Get representations in canonical shape.

- Parameters
  indices (Optional[LongTensor]) – None, or of shape (b,) or (b, n). The indices. If None, return all representations.
- Return type
  FloatTensor
- Returns
  shape: (b?, n?, d). If indices is None, b=1 and n=max_id. If indices is 1-dimensional, b=indices.shape[0] and n=1. If indices is 2-dimensional, b, n = indices.shape.
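Concretely, with the Embedding sketch from above (max_id=10, shape=(20,)):

emb.get_in_canonical_shape(indices=None).shape                        # (1, 10, 20)
emb.get_in_canonical_shape(indices=torch.arange(5)).shape             # (5, 1, 20)
emb.get_in_canonical_shape(indices=torch.arange(6).view(2, 3)).shape  # (2, 3, 20)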
get_in_more_canonical_shape(dim, indices=None)

Get representations in canonical shape.

The canonical shape is given as (batch_size, d_1, d_2, d_3, *self.shape), fulfilling the following properties: let i = dim. If indices is None, the return shape is (1, d_1, d_2, d_3, *self.shape) with d_i = num_representations and d_j = 1 for j != i. If indices is not None, then batch_size = indices.shape[0], and d_i = 1 if indices.ndimension() == 1, else d_i = indices.shape[1]; the remaining d_j = 1.

Examples:

>>> emb = EmbeddingSpecification(shape=(20,)).make(num_embeddings=10)
>>> # Get head representations for given batch indices
>>> emb.get_in_more_canonical_shape(dim="h", indices=torch.arange(5)).shape
(5, 1, 1, 1, 20)
>>> # Get head representations for given 2D batch indices, as e.g. used by fast sLCWA scoring
>>> emb.get_in_more_canonical_shape(dim="h", indices=torch.arange(6).view(2, 3)).shape
(2, 3, 1, 1, 20)
>>> # Get head representations for 1:n scoring
>>> emb.get_in_more_canonical_shape(dim="h", indices=None).shape
(1, 10, 1, 1, 20)

- Parameters
  - dim (Union[int, str]) – The dimension along which the given indices are placed, either as an integer position or as one of the names "h", "r", "t".
  - indices (Optional[LongTensor]) – The indices, or None to return all representations.
- Return type
  FloatTensor
- Returns
  shape: (batch_size, d_1, d_2, d_3, *self.shape)