TransformerInteraction

class TransformerInteraction(input_dim=512, num_layers=2, num_heads=8, dropout=0.1, dim_feedforward=2048, position_initializer=xavier_normal_)

Bases: FunctionalInteraction[FloatTensor, FloatTensor, FloatTensor]

Transformer-based interaction, as described in [galkin2020].

Initialize the module.

Parameters
  • input_dim (int) – the input dimension; must be > 0

  • num_layers (int) – the number of Transformer layers; must be > 0, cf. nn.TransformerEncoder.

  • num_heads (int) – the number of self-attention heads inside each transformer encoder layer; must be > 0, cf. nn.TransformerEncoderLayer

  • dropout (float) – the dropout rate on each transformer encoder layer, cf. nn.TransformerEncoderLayer

  • dim_feedforward (int) – the hidden dimension of the feed-forward layers of the transformer encoder layer, cf. nn.TransformerEncoderLayer

  • position_initializer (Union[str, Callable[[FloatTensor], FloatTensor], Type[Callable[[FloatTensor], FloatTensor]], None]) – the initializer to use for positional embeddings

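The interaction can also be used on its own, outside of a full model. The following is a minimal usage sketch (not taken from the library documentation itself); it assumes the class is importable from pykeen.nn.modules and that the instantiated interaction can be called directly on head, relation, and tail representation tensors of shape (*batch_dims, dim):

import torch
from pykeen.nn.modules import TransformerInteraction

# instantiate with the default hyper-parameters from the signature above
interaction = TransformerInteraction(
    input_dim=512,
    num_layers=2,
    num_heads=8,
    dropout=0.1,
    dim_feedforward=2048,
)

# head, relation, and tail representations, shape: (*batch_dims, dim)
batch_size, dim = 4, 512
h, r, t = (torch.rand(batch_size, dim) for _ in range(3))

# scores, shape: (*batch_dims,)
scores = interaction(h, r, t)
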
Methods Summary

func(h, r, t, transformer, position_embeddings, ...)

Evaluate the Transformer interaction function, as described in [galkin2020].

Methods Documentation

func(h, r, t, transformer, position_embeddings, final)

Evaluate the Transformer interaction function, as described in [galkin2020].

\[\textit{score}(h, r, t) = \textit{Linear}(\textit{SumPooling}(\textit{Transformer}([h + pe[0]; r + pe[1]])))^T t\]
Parameters
  • h (FloatTensor) – shape: (*batch_dims, dim) The head representations.

  • r (FloatTensor) – shape: (*batch_dims, dim) The relation representations.

  • t (FloatTensor) – shape: (*batch_dims, dim) The tail representations.

  • transformer (TransformerEncoder) – the transformer encoder

  • position_embeddings (FloatTensor) – shape: (2, dim) the positional embeddings, one for head and one for relation

  • final (Module) – the final (linear) transformation

Return type

FloatTensor

Returns

The scores.
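
To make the formula above concrete, the following is an illustrative re-implementation of the scoring step in plain PyTorch. It is a sketch rather than the library's own func; the argument names (transformer, position_embeddings, final) simply mirror the parameters documented here:

import torch
from torch import nn


def transformer_score(
    h: torch.FloatTensor,  # shape: (*batch_dims, dim), head representations
    r: torch.FloatTensor,  # shape: (*batch_dims, dim), relation representations
    t: torch.FloatTensor,  # shape: (*batch_dims, dim), tail representations
    transformer: nn.TransformerEncoder,
    position_embeddings: torch.FloatTensor,  # shape: (2, dim)
    final: nn.Module,  # the final (linear) transformation
) -> torch.FloatTensor:
    # build the length-2 sequence [h + pe[0]; r + pe[1]]
    x = torch.stack([h + position_embeddings[0], r + position_embeddings[1]], dim=0)
    # encode with the transformer; by default it expects (seq_len, batch, dim)
    x = transformer(x.reshape(2, -1, x.shape[-1]))
    # sum-pool over the sequence dimension and apply the final linear layer
    x = final(x.sum(dim=0)).reshape(h.shape)
    # score by the dot product with the tail representation
    return (x * t).sum(dim=-1)


# example wiring with the default hyper-parameters from the class signature
dim = 512
encoder_layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, dim_feedforward=2048, dropout=0.1)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)
position_embeddings = nn.init.xavier_normal_(torch.empty(2, dim))
final = nn.Linear(dim, dim)

h, r, t = (torch.rand(4, dim) for _ in range(3))
scores = transformer_score(h, r, t, encoder, position_embeddings, final)  # shape: (4,)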