Initialization

Embedding weight initialization routines.

class LabelBasedInitializer(labels, pretrained_model_name_or_path='bert-base-cased', batch_size=32, max_length=None)[source]

An initializer using pretrained models from the transformers library to encode labels.

Example usage:

Initialize entity representations as Transformer encodings of their labels. Afterwards, the parameters are detached from the labels and trained on the KGE task without any further connection to the Transformer model.

from pykeen.datasets import get_dataset
from pykeen.nn.init import LabelBasedInitializer
from pykeen.models import ERMLPE

dataset = get_dataset(dataset="nations")
model = ERMLPE(
    embedding_dim=768,  # for BERT base
    entity_initializer=LabelBasedInitializer.from_triples_factory(
        triples_factory=dataset.training,
    ),
)

Initialize the initializer.

Parameters
  • labels (Sequence[str]) – the labels

  • pretrained_model_name_or_path (str) – the name of the pretrained model, or a path, cf. transformers.AutoModel.from_pretrained()

  • batch_size (int) – the batch size to use while encoding; must be positive

  • max_length (Optional[int]) – the maximum number of tokens to pad/trim the labels to; must be positive if provided

Raises

ImportError – if the transformers library could not be imported
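
For illustration, a minimal sketch of constructing the initializer directly from a list of labels (the label strings and batch size here are hypothetical):

from pykeen.nn.init import LabelBasedInitializer

# hypothetical labels; in practice, these come from the knowledge graph
initializer = LabelBasedInitializer(
    labels=["united states", "brazil", "china"],
    pretrained_model_name_or_path="bert-base-cased",
    batch_size=16,
)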

classmethod from_triples_factory(triples_factory, for_entities=True, **kwargs)[source]

Prepare a label-based initializer with labels from a triples factory.

Parameters
  • triples_factory (TriplesFactory) – the triples factory

  • for_entities (bool) – whether to create the initializer for entities (or relations)

  • kwargs – additional keyword-based arguments passed to LabelBasedInitializer.__init__()

Return type

LabelBasedInitializer

Returns

A label-based initializer

Raises

ImportError – if the transformers library could not be imported
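
To initialize relation representations instead, a sketch passing for_entities=False (assuming the triples factory provides relation labels):

from pykeen.datasets import get_dataset
from pykeen.nn.init import LabelBasedInitializer

dataset = get_dataset(dataset="nations")
# build the initializer from relation labels instead of entity labels
relation_initializer = LabelBasedInitializer.from_triples_factory(
    triples_factory=dataset.training,
    for_entities=False,
)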

class PretrainedInitializer(tensor)[source]

Initialize tensor with pretrained weights.

Example usage:

import torch
from pykeen.pipeline import pipeline
from pykeen.nn.init import PretrainedInitializer

# this is usually loaded from somewhere else
# the shape must match, as well as the entity-to-id mapping
pretrained_embedding_tensor = torch.rand(14, 128)

result = pipeline(
    dataset="nations",
    model="transe",
    model_kwargs=dict(
        embedding_dim=pretrained_embedding_tensor.shape[-1],
        entity_initializer=PretrainedInitializer(tensor=pretrained_embedding_tensor),
    ),
)

Initialize the initializer.

Parameters

tensor (FloatTensor) – the tensor of pretrained embeddings.
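
In practice, the pretrained tensor is often stored on disk; a minimal sketch, assuming a hypothetical file pretrained_entity_embeddings.pt whose shape and entity-to-id mapping match the dataset:

import torch
from pykeen.nn.init import PretrainedInitializer

# hypothetical file; its row order must follow the entity-to-id mapping
tensor = torch.load("pretrained_entity_embeddings.pt")
initializer = PretrainedInitializer(tensor=tensor)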

init_phases(x)[source]

Generate random phases between 0 and \(2\pi\).

Return type

Tensor
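
A minimal usage sketch, assuming the returned tensor takes its shape from the input:

import torch
from pykeen.nn.init import init_phases

# fill a (num_entities, dim)-shaped tensor with phases drawn uniformly from [0, 2*pi)
phases = init_phases(torch.empty(14, 64))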

xavier_normal_(tensor, gain=1.0)[source]

Initialize weights of the tensor similarly to Glorot/Xavier initialization.

Proceed as if it were a linear layer with a fan_in of zero, to which Xavier normal initialization is applied. Fill the weights of the input embedding with values sampled from \(\mathcal{N}(0, a^2)\) where

\[a = \text{gain} \times \sqrt{\frac{2}{\text{embedding_dim}}}\]
Parameters
  • tensor (Tensor) – A tensor

  • gain (float) – An optional scaling factor, defaults to 1.0.

Return type

Tensor

Returns

Embedding with weights initialized by the Xavier normal initializer.
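
The documented formula corresponds to the following sketch (assuming embedding_dim is the trailing dimension of the tensor; the helper name is hypothetical):

import torch

def xavier_normal_sketch(tensor: torch.Tensor, gain: float = 1.0) -> torch.Tensor:
    # a = gain * sqrt(2 / embedding_dim); sample weights from N(0, a^2)
    std = gain * (2.0 / tensor.shape[-1]) ** 0.5
    return torch.nn.init.normal_(tensor, mean=0.0, std=std)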

xavier_uniform_(tensor, gain=1.0)[source]

Initialize weights of the tensor similarly to Glorot/Xavier initialization.

Proceed as if it were a linear layer with a fan_in of zero, to which Xavier uniform initialization is applied, i.e. fill the weights of the input embedding with values sampled from \(\mathcal{U}(-a, a)\) where

\[a = \text{gain} \times \sqrt{\frac{6}{\text{embedding_dim}}}\]
Parameters
  • tensor (Tensor) – A tensor

  • gain (float) – An optional scaling factor, defaults to 1.0.

Return type

Tensor

Returns

Embedding with weights initialized by the Xavier uniform initializer.
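
Analogously, a sketch of the uniform variant under the same assumption about embedding_dim (helper name again hypothetical):

import torch

def xavier_uniform_sketch(tensor: torch.Tensor, gain: float = 1.0) -> torch.Tensor:
    # a = gain * sqrt(6 / embedding_dim); sample weights from U(-a, a)
    bound = gain * (6.0 / tensor.shape[-1]) ** 0.5
    return torch.nn.init.uniform_(tensor, -bound, bound)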