TransformerInteraction

class TransformerInteraction(input_dim: int = 512, num_layers: int = 2, num_heads: int = 8, dropout: float = 0.1, dim_feedforward: int = 2048, position_initializer: str | Callable[[Tensor], Tensor] | type[Callable[[Tensor], Tensor]] | None = torch.nn.init.xavier_normal_)[source]

Bases: Interaction[Tensor, Tensor, Tensor]

Transformer-based interaction, as described in [galkin2020].

This interaction function is primarily designed to handle additional qualifier pairs found in hyper-relational statements, but can also be used for vanilla link prediction.

It creates a \(2\)-element sequence from the head and relation representations, adds a learnable absolute position encoding, passes the sequence through a Transformer encoder, performs sum pooling along the sequence dimension, and applies a final linear projection; the score is the dot product of the result with the tail entity representation.

Its interaction function is given by

\[\textit{Linear}(\textit{SumPooling}(\textit{Transformer}( [\mathbf{h} + \mathbf{pe}[0]; \mathbf{r} + \mathbf{pe}[1]] )))^T \mathbf{t}\]

Since the computationally expensive Transformer encoding depends only on the head and relation representations, while each tail is scored by a cheap dot product against this encoding, the interaction function is particularly well suited for \(1:n\) evaluation of different tail entities for the same head-relation combination.
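The scoring pipeline above can be sketched in plain PyTorch. This is an illustrative re-implementation of the formula, not the library's own code, and the class name `TransformerInteractionSketch` is hypothetical:

```python
import torch
from torch import nn


class TransformerInteractionSketch(nn.Module):
    """Illustrative sketch of the Transformer interaction scoring function."""

    def __init__(self, input_dim: int = 512, num_layers: int = 2, num_heads: int = 8,
                 dropout: float = 0.1, dim_feedforward: int = 2048):
        super().__init__()
        self.transformer = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(
                d_model=input_dim, nhead=num_heads,
                dim_feedforward=dim_feedforward, dropout=dropout,
            ),
            num_layers=num_layers,
        )
        # learnable absolute position embeddings for the 2-element [h; r] sequence
        self.position_embeddings = nn.Parameter(torch.empty(2, input_dim))
        nn.init.xavier_normal_(self.position_embeddings)
        self.final = nn.Linear(input_dim, input_dim)

    def forward(self, h: torch.Tensor, r: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # build the 2-element sequence with position encodings; shape: (2, batch, dim)
        x = torch.stack(
            [h + self.position_embeddings[0], r + self.position_embeddings[1]], dim=0
        )
        x = self.transformer(x)     # Transformer encoder over the sequence
        x = x.sum(dim=0)            # sum pooling along the sequence dimension
        x = self.final(x)           # final linear projection
        return (x * t).sum(dim=-1)  # dot product with the tail representation


h, r, t = (torch.rand(3, 16) for _ in range(3))
scores = TransformerInteractionSketch(input_dim=16, num_heads=4, dim_feedforward=32)(h, r, t)
print(scores.shape)  # torch.Size([3])
```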

Initialize the module.

Parameters:
  • input_dim (int) – The input dimension; must be positive.

  • num_layers (int) – The number of Transformer layers; must be positive, cf. torch.nn.TransformerEncoder.

  • num_heads (int) – The number of self-attention heads inside each Transformer encoder layer; must be positive and evenly divide input_dim, cf. torch.nn.TransformerEncoderLayer.

  • dropout (float) – The dropout rate on each Transformer encoder layer, cf. torch.nn.TransformerEncoderLayer.

  • dim_feedforward (int) – The hidden dimension of the feed-forward layers of the Transformer encoder layer, cf. torch.nn.TransformerEncoderLayer.

  • position_initializer (HintOrType[Initializer]) – The initializer to use for the positional embeddings; defaults to torch.nn.init.xavier_normal_.
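As an illustration of how these parameters map onto torch's Transformer building blocks (the variable names are for exposition only, and this mirrors rather than reproduces the library's internals):

```python
import torch
from torch import nn

input_dim, num_layers, num_heads, dropout, dim_feedforward = 512, 2, 8, 0.1, 2048

# num_heads must evenly divide input_dim (a torch.nn.MultiheadAttention requirement)
layer = nn.TransformerEncoderLayer(
    d_model=input_dim,                # input_dim
    nhead=num_heads,                  # num_heads
    dim_feedforward=dim_feedforward,  # dim_feedforward
    dropout=dropout,                  # dropout
)
encoder = nn.TransformerEncoder(layer, num_layers=num_layers)  # num_layers

# position_initializer initializes the 2 learnable position embeddings,
# e.g. torch.nn.init.xavier_normal_ (the default):
positions = nn.init.xavier_normal_(torch.empty(2, input_dim))
```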

Methods Summary

forward(h, r, t)

Evaluate the interaction function.

Methods Documentation

forward(h: Tensor, r: Tensor, t: Tensor) → Tensor[source]

Evaluate the interaction function.

See also

Interaction.forward for a detailed description of the generic batched form of the interaction function.

Parameters:
  • h (Tensor) – shape: (*batch_dims, d) The head representations.

  • r (Tensor) – shape: (*batch_dims, d) The relation representations.

  • t (Tensor) – shape: (*batch_dims, d) The tail representations.

Returns:

shape: batch_dims The scores.

Return type:

Tensor
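The \(1:n\) evaluation mentioned above can be illustrated with a minimal sketch: the encoding of the head-relation pair (here a stand-in MLP for the Transformer pipeline, purely for illustration) is computed once and then scored against many candidate tails with a single matrix-vector product:

```python
import torch
from torch import nn

dim, num_tails = 16, 100
h, r = torch.rand(dim), torch.rand(dim)
tails = torch.rand(num_tails, dim)  # candidate tail representations

# stand-in for Linear(SumPooling(Transformer(...))): any encoding of (h, r)
encoder = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
x = encoder(h + r)  # expensive part, computed once per head-relation pair

# scoring all tails is then a single cheap matrix-vector product
scores = tails @ x
print(scores.shape)  # torch.Size([100])
```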