Representation
Embedding modules.
class Embedding(num_embeddings, embedding_dim=None, shape=None, initializer=None, initializer_kwargs=None, normalizer=None, normalizer_kwargs=None, constrainer=None, constrainer_kwargs=None, regularizer=None, trainable=True, dtype=None)

Trainable embeddings. This class provides the same interface as torch.nn.Embedding and can be used throughout PyKEEN as a more fully featured drop-in replacement.

Instantiate an embedding with extended functionality.
- Parameters
  - num_embeddings (int) – > 0. The number of embeddings.
  - embedding_dim (Optional[int]) – > 0. The embedding dimensionality.
  - initializer (Union[None, str, Callable[[FloatTensor], FloatTensor]]) – An optional initializer, which takes an uninitialized (num_embeddings, embedding_dim) tensor as input and returns an initialized tensor of the same shape and dtype (which may be the same tensor, i.e. the initialization may be in-place).
  - initializer_kwargs (Optional[Mapping[str, Any]]) – Additional keyword arguments passed to the initializer.
  - normalizer (Union[None, str, Callable[[FloatTensor], FloatTensor]]) – A normalization function, which is applied in every forward pass.
  - normalizer_kwargs (Optional[Mapping[str, Any]]) – Additional keyword arguments passed to the normalizer.
  - constrainer (Union[None, str, Callable[[FloatTensor], FloatTensor]]) – A function which is applied to the weights after each parameter update, without tracking gradients. It may be used to enforce model constraints outside of gradient-based training. The function does not need to operate in-place, but the weight tensor is modified in-place.
  - constrainer_kwargs (Optional[Mapping[str, Any]]) – Additional keyword arguments passed to the constrainer.
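A minimal usage sketch (assuming PyKEEN 1.x, where this class is importable from pykeen.nn; the initializer and constrainer shown are plain PyTorch functions chosen for illustration, not defaults of this class):

import torch
from pykeen.nn import Embedding

# A 10 x 20 embedding: Xavier-initialized, and re-normalized to unit
# length after every parameter update via the constrainer.
emb = Embedding(
    num_embeddings=10,
    embedding_dim=20,
    initializer=torch.nn.init.xavier_uniform_,  # in-place initializer
    constrainer=torch.nn.functional.normalize,  # returns a new tensor; the weight is updated in-place
)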
forward(indices=None)

Get representations for indices.
- Parameters
  indices (Optional[LongTensor]) – shape: s. The indices, or None. If None, this is interpreted as torch.arange(self.max_id) (although implemented more efficiently).
- Return type
  FloatTensor
- Returns
  shape: (*s, *self.shape). The representations.
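Continuing the sketch above, the index shape becomes the prefix of the result shape:

x = emb(indices=torch.as_tensor([1, 3, 5]))   # shape: (3, 20)
x2 = emb(indices=torch.arange(6).view(2, 3))  # shape: (2, 3, 20)
x_all = emb(indices=None)                     # shape: (10, 20), i.e. (max_id, *shape)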
classmethod init_with_device(num_embeddings, embedding_dim, device, initializer=None, initializer_kwargs=None, normalizer=None, normalizer_kwargs=None, constrainer=None, constrainer_kwargs=None)

Create an embedding object on the given device by wrapping __init__(). This method is a hotfix for not being able to pass a device during the initialization of torch.nn.Embedding: the weight is always initialized on CPU and has to be moved to the GPU afterwards.

- Return type
  Embedding
- Returns
  The embedding.
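A sketch of how this might be called (assumes a CUDA device is available):

emb_gpu = Embedding.init_with_device(
    num_embeddings=10,
    embedding_dim=20,
    device=torch.device("cuda"),  # hypothetical target device
)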
class EmbeddingSpecification(embedding_dim=None, shape=None, initializer=None, initializer_kwargs=None, normalizer=None, normalizer_kwargs=None, constrainer=None, constrainer_kwargs=None, regularizer=None, dtype=None)

An embedding specification.
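A specification defers the number of embeddings, so the same specification can later be turned into concrete Embedding instances; the make() call below mirrors the doctest of get_in_more_canonical_shape() further down:

from pykeen.nn import EmbeddingSpecification

spec = EmbeddingSpecification(embedding_dim=20)
emb = spec.make(num_embeddings=10)  # an Embedding with weights of shape (10, 20)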
class RepresentationModule(max_id, shape)

A base class for obtaining representations for entities/relations. A representation module maps integer IDs to representations, which are tensors of floats.

max_id defines the upper bound of indices we are allowed to request (exclusive). For simple embeddings this is equivalent to num_embeddings, but it is a more appropriate term for general non-embedding representations, where the representations could come from somewhere else, e.g. a GNN encoder.

shape describes the shape of a single representation. In the case of a vector embedding, this is just a single dimension. For others, e.g. pykeen.models.RESCAL, we have 2-d representations, and in general it can be any fixed shape.

We can look at all representations as a tensor of shape (max_id, *shape); this is exactly the result of passing indices=None to the forward method. We can also pass multi-dimensional indices to the forward method, in which case the indices’ shape becomes the prefix of the result shape: (*indices.shape, *self.shape).

Initialize the representation module.
- Parameters
  - max_id (int) – The upper bound (exclusive) of indices that may be requested, i.e. the number of representations.
  - shape (Sequence[int]) – The shape of a single representation.
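To illustrate the contract, here is a hypothetical minimal subclass (the name ConstantRepresentation and its implementation are illustrative only; this assumes RepresentationModule is a torch.nn.Module that stores max_id and shape as attributes, as the docstrings above suggest):

import torch
from pykeen.nn import RepresentationModule

class ConstantRepresentation(RepresentationModule):
    """Hypothetical example: every ID maps to one shared learned representation."""

    def __init__(self, max_id: int, shape):
        super().__init__(max_id=max_id, shape=shape)
        self.value = torch.nn.Parameter(torch.randn(*shape))

    def forward(self, indices=None):
        # Honor the contract: (max_id, *shape) for indices=None,
        # and (*indices.shape, *shape) otherwise.
        prefix = (self.max_id,) if indices is None else tuple(indices.shape)
        return self.value.expand(*prefix, *self.shape)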
property embedding_dim

Return the “embedding dimension”. Kept for backward compatibility.

- Return type
  int
abstract forward(indices=None)

Get representations for indices.

- Parameters
  indices (Optional[LongTensor]) – shape: s. The indices, or None. If None, this is interpreted as torch.arange(self.max_id) (although implemented more efficiently).
- Return type
  FloatTensor
- Returns
  shape: (*s, *self.shape). The representations.
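With the hypothetical ConstantRepresentation sketched above, the contract plays out as:

rep = ConstantRepresentation(max_id=10, shape=(20,))
rep(indices=None).shape                        # (10, 20)
rep(indices=torch.arange(6).view(2, 3)).shape  # (2, 3, 20)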
get_in_canonical_shape(indices=None)

Get representations in canonical shape.

- Parameters
  indices (Optional[LongTensor]) – None, or of shape (b,) or (b, n). The indices. If None, return all representations.
- Return type
  FloatTensor
- Returns
  shape: (b?, n?, d). If indices is None, b=1 and n=max_id. If indices is 1-dimensional, b=indices.shape[0] and n=1. If indices is 2-dimensional, b, n = indices.shape.
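Concretely, with the Embedding sketch from above (max_id=10, shape=(20,)):

emb.get_in_canonical_shape(indices=None).shape                        # (1, 10, 20)
emb.get_in_canonical_shape(indices=torch.arange(5)).shape             # (5, 1, 20)
emb.get_in_canonical_shape(indices=torch.arange(6).view(2, 3)).shape  # (2, 3, 20)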
get_in_more_canonical_shape(dim, indices=None)

Get representations in canonical shape.

The canonical shape is given as (batch_size, d_1, d_2, d_3, *self.shape), fulfilling the following properties: let i = dim. If indices is None, the return shape is (1, d_1, d_2, d_3, *self.shape) with d_i = num_representations and d_j = 1 for j != i. If indices is not None, then batch_size = indices.shape[0], and d_i = 1 if indices.ndimension() == 1, else d_i = indices.shape[1]; the remaining d_j = 1.

Examples:

>>> emb = EmbeddingSpecification(shape=(20,)).make(num_embeddings=10)
>>> # Get head representations for given batch indices
>>> emb.get_in_more_canonical_shape(dim="h", indices=torch.arange(5)).shape
(5, 1, 1, 1, 20)
>>> # Get head representations for given 2D batch indices, as e.g. used by fast sLCWA scoring
>>> emb.get_in_more_canonical_shape(dim="h", indices=torch.arange(6).view(2, 3)).shape
(2, 3, 1, 1, 20)
>>> # Get head representations for 1:n scoring
>>> emb.get_in_more_canonical_shape(dim="h", indices=None).shape
(1, 10, 1, 1, 20)

- Parameters
  - dim (Union[int, str]) – The dimension along which the given indices are placed, either as an integer position or as one of the names "h", "r", "t".
  - indices (Optional[LongTensor]) – The indices, or None to return all representations.
- Return type
  FloatTensor
- Returns
  shape: (batch_size, d_1, d_2, d_3, *self.shape)