NodePiece

class NodePiece(*, triples_factory, num_tokens=2, tokenizers=None, tokenizers_kwargs=None, embedding_dim=64, interaction=<class 'pykeen.nn.modules.DistMultInteraction'>, aggregation=None, entity_initializer=None, entity_normalizer=None, entity_constrainer=None, entity_regularizer=None, relation_initializer=None, relation_normalizer=None, relation_constrainer=None, relation_regularizer=None, **kwargs)

Bases: ERModel

A wrapper which combines an interaction function with NodePiece entity representations from [galkin2021].

This model uses pykeen.nn.NodePieceRepresentation instead of a typical pykeen.nn.representation.Embedding to store entity representations more efficiently.

Initialize the model.

Parameters:
  • triples_factory (CoreTriplesFactory) – the triples factory. Must have create_inverse_triples set to True.

  • num_tokens (Union[int, Sequence[int]]) – the number of relations to use to represent each entity, cf. pykeen.nn.NodePieceRepresentation.

  • tokenizers (Union[str, Tokenizer, Type[Tokenizer], None, Sequence[Union[str, Tokenizer, Type[Tokenizer], None]]]) – the tokenizer to use, cf. pykeen.nn.node_piece.tokenizer_resolver.

  • tokenizers_kwargs (Union[Mapping[str, Any], None, Sequence[Optional[Mapping[str, Any]]]]) – additional keyword-based parameters passed to the tokenizer upon construction.

  • embedding_dim (int) – the embedding dimension. Only used if embedding_specification is not given.

  • interaction (Union[str, Interaction, Type[Interaction], None]) – the interaction module, or a hint for it.

  • aggregation (Union[str, Callable[[Tensor, int], Tensor], None]) –

    aggregation of multiple token representations to a single entity representation. By default, this uses torch.mean(). If a string is provided, the module assumes that this refers to a top-level torch function, e.g. “mean” for torch.mean(), or “sum” for torch.sum(). An aggregation can also have trainable parameters, e.g., MLP(mean(MLP(tokens))) (cf. DeepSets from [zaheer2017]). In this case, the module has to be created outside of this component.

    Moreover, we support providing “mlp” as a shortcut to use the MLP aggregation version from [galkin2021].

    Aggregations may also produce differently shaped output, e.g. a concatenation of all token embeddings resulting in shape (num_tokens * d,). In this case, shape must be provided.

    The aggregation takes two arguments: the (batched) tensor of token representations, in shape (*, num_tokens, *dt), and the index along which to aggregate.

  • entity_initializer (Union[str, Callable[[FloatTensor], FloatTensor], None]) – a hint for initializing anchor embeddings

  • entity_normalizer (Union[str, Callable[[FloatTensor], FloatTensor], None]) – a hint for normalizing anchor embeddings

  • entity_constrainer (Union[str, Callable[[FloatTensor], FloatTensor], None]) – a hint for constraining anchor embeddings

  • entity_regularizer (Union[str, Regularizer, None]) – a hint for regularizing anchor embeddings

  • relation_initializer (Union[str, Callable[[FloatTensor], FloatTensor], None]) – a hint for initializing relation embeddings

  • relation_normalizer (Union[str, Callable[[FloatTensor], FloatTensor], None]) – a hint for normalizing relation embeddings

  • relation_constrainer (Union[str, Callable[[FloatTensor], FloatTensor], None]) – a hint for constraining relation embeddings

  • relation_regularizer (Union[str, Regularizer, None]) – a hint for regularizing relation embeddings

  • kwargs – additional keyword-based arguments passed to ERModel.__init__()

Raises:

ValueError – if the triples factory does not create inverse triples

Attributes Summary

hpo_default

The default strategy for optimizing the model's hyper-parameters

Attributes Documentation

hpo_default: ClassVar[Mapping[str, Any]] = {'embedding_dim': {'high': 256, 'low': 16, 'q': 16, 'type': <class 'int'>}}

The default strategy for optimizing the model’s hyper-parameters
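
To make the search space concrete, the following sketch enumerates the embedding_dim values it describes. It assumes that 'q' is the step between candidates and that both 'low' and 'high' are inclusive; this page does not spell out that interpretation, so treat it as an assumption. The dictionary is copied by hand from hpo_default rather than read from the class.

```python
# Search-space entry copied from hpo_default above.
embedding_dim_space = {"type": int, "low": 16, "high": 256, "q": 16}

# Assumption: "q" is the step size, and both bounds are inclusive.
candidates = list(
    range(
        embedding_dim_space["low"],
        embedding_dim_space["high"] + 1,
        embedding_dim_space["q"],
    )
)
print(len(candidates), candidates[0], candidates[-1])
```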