TransformerInteraction
- class TransformerInteraction(input_dim=512, num_layers=2, num_heads=8, dropout=0.1, dim_feedforward=2048, position_initializer=<function xavier_normal_>)
Bases: FunctionalInteraction[FloatTensor, FloatTensor, FloatTensor]
Transformer-based interaction, as described in [galkin2020].
Initialize the module.
- Parameters:
input_dim (int) – the input dimension, must be > 0
num_layers (int) – the number of Transformer layers, must be > 0, cf. nn.TransformerEncoder
num_heads (int) – the number of self-attention heads inside each transformer encoder layer, must be > 0, cf. nn.TransformerEncoderLayer
dropout (float) – the dropout rate on each transformer encoder layer, cf. nn.TransformerEncoderLayer
dim_feedforward (int) – the hidden dimension of the feed-forward layers of the transformer encoder layer, cf. nn.TransformerEncoderLayer
position_initializer (Union[str, Callable[[FloatTensor], FloatTensor], Type[Callable[[FloatTensor], FloatTensor]], None]) – the initializer to use for positional embeddings
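The constructor arguments map onto standard PyTorch building blocks. The following is a minimal sketch of that mapping, not the class's actual implementation: the variable names and the shape of the final linear layer (input_dim to input_dim) are assumptions made for illustration, and only the nn.TransformerEncoderLayer, nn.TransformerEncoder, and nn.init.xavier_normal_ calls are taken from PyTorch itself.

```python
import torch
from torch import nn

# Default values from the class signature above.
input_dim, num_layers, num_heads = 512, 2, 8
dropout, dim_feedforward = 0.1, 2048

# One encoder layer; num_heads must divide input_dim evenly.
layer = nn.TransformerEncoderLayer(
    d_model=input_dim,
    nhead=num_heads,
    dim_feedforward=dim_feedforward,
    dropout=dropout,
)
# Stack num_layers copies of the layer.
transformer = nn.TransformerEncoder(encoder_layer=layer, num_layers=num_layers)

# Two positional embeddings (head slot and relation slot),
# initialized with the default position_initializer.
position_embeddings = nn.init.xavier_normal_(torch.empty(2, input_dim))

# Assumed shape for the final linear transformation.
final = nn.Linear(input_dim, input_dim)
```

PyTorch requires num_heads to divide input_dim, so a non-default input_dim may need a matching num_heads.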
Methods Summary
func(h, r, t, transformer, position_embeddings, ...) – Evaluate the Transformer interaction function, as described in [galkin2020].
Methods Documentation
- func(h, r, t, transformer, position_embeddings, final)
Evaluate the Transformer interaction function, as described in [galkin2020].
\[\textit{score}(h, r, t) = \textit{Linear}(\textit{SumPooling}(\textit{Transformer}([h + pe[0]; r + pe[1]])))^T t\]
- Parameters:
h (FloatTensor) – shape: (*batch_dims, dim) The head representations.
r (FloatTensor) – shape: (*batch_dims, dim) The relation representations.
t (FloatTensor) – shape: (*batch_dims, dim) The tail representations.
transformer (TransformerEncoder) – the transformer encoder
position_embeddings (FloatTensor) – shape: (2, dim) the positional embeddings, one for head and one for relation
final (Module) – the final (linear) transformation
- Return type:
FloatTensor
- Returns:
The scores.
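The scoring formula can be read as four steps: add positional embeddings to head and relation, run the length-2 sequence through the encoder, sum-pool over the sequence axis, then apply the final transformation and take the dot product with the tail. Below is a minimal sketch of these steps in plain PyTorch. It is not the library's implementation: the helper name transformer_score is hypothetical, and it assumes a single batch dimension, i.e. inputs of shape (batch, dim) rather than general (*batch_dims, dim).

```python
import torch
from torch import nn


def transformer_score(h, r, t, transformer, position_embeddings, final):
    """Hypothetical helper: score (h, r, t) triples of shape (batch, dim)."""
    # Build a length-2 sequence [h + pe[0]; r + pe[1]] of shape (2, batch, dim),
    # the sequence-first layout nn.TransformerEncoder expects by default.
    x = torch.stack([h, r], dim=0) + position_embeddings.unsqueeze(1)
    x = transformer(x)  # (2, batch, dim)
    x = x.sum(dim=0)    # sum pooling over the sequence axis -> (batch, dim)
    x = final(x)        # final linear transformation -> (batch, dim)
    # Dot product with the tail representation -> (batch,)
    return (x * t).sum(dim=-1)


torch.manual_seed(0)
dim, batch = 16, 4
layer = nn.TransformerEncoderLayer(d_model=dim, nhead=2, dim_feedforward=32, dropout=0.0)
transformer = nn.TransformerEncoder(layer, num_layers=1)
position_embeddings = nn.init.xavier_normal_(torch.empty(2, dim))
final = nn.Linear(dim, dim)

h, r, t = (torch.randn(batch, dim) for _ in range(3))
scores = transformer_score(h, r, t, transformer, position_embeddings, final)
print(scores.shape)  # torch.Size([4])
```

The small dimensions are chosen only to keep the example fast; the result is one score per triple in the batch.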