ConvEInteraction

class ConvEInteraction(input_channels: int | None = None, output_channels: int = 32, embedding_height: int | None = None, embedding_width: int | None = None, kernel_width: int = 3, kernel_height: int | None = None, input_dropout: float = 0.2, feature_map_dropout: float = 0.2, output_dropout: float = 0.3, embedding_dim: int = 200, apply_batch_normalization: bool = True)

Bases: Interaction[Tensor, Tensor, tuple[Tensor, Tensor]]

The stateful ConvE interaction function.

ConvE is a CNN-based approach. For input representations \(\mathbf{h}, \mathbf{r}, \mathbf{t} \in \mathbb{R}^d\), it first combines \(\mathbf{h}\) and \(\mathbf{r}\) into a matrix \(\mathbf{A} \in \mathbb{R}^{2 \times d}\), where the first row of \(\mathbf{A}\) represents \(\mathbf{h}\) and the second row represents \(\mathbf{r}\). \(\mathbf{A}\) is reshaped to a matrix \(\mathbf{B} \in \mathbb{R}^{m \times n}\), where the first \(m/2\) rows represent \(\mathbf{h}\) and the remaining \(m/2\) rows represent \(\mathbf{r}\). In the convolution layer, a set of 2-dimensional convolutional filters \(\Omega = \{\omega_i \mid \omega_i \in \mathbb{R}^{r \times c}\}\) is applied to \(\mathbf{B}\) to capture interactions between \(\mathbf{h}\) and \(\mathbf{r}\). The resulting feature maps are reshaped and concatenated to create a feature vector \(\mathbf{v} \in \mathbb{R}^{|\Omega|rc}\). In the next step, \(\mathbf{v}\) is mapped into the entity space using a linear transformation \(\mathbf{W} \in \mathbb{R}^{|\Omega|rc \times d}\), that is, \(\mathbf{e}_{h,r} = \mathbf{v}^{T} \mathbf{W}\). The score is then obtained by:

\[f(\mathbf{h}, \mathbf{r}, \mathbf{t}) = \mathbf{e}_{h,r} \mathbf{t}\]
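The stacking and reshaping step described above can be illustrated with a small sketch in plain Python. The helper `stack_as_image` is hypothetical and not part of the library. It only shows how \(\mathbf{h}\) and \(\mathbf{r}\) end up in the upper and lower halves of \(\mathbf{B}\), assuming each vector is reshaped row by row.

```python
def stack_as_image(h, r, height, width):
    """Reshape h and r to (height, width) each and stack them vertically.

    Hypothetical helper for illustration only; the result has
    m = 2 * height rows and n = width columns.
    """
    if len(h) != height * width or len(r) != height * width:
        raise ValueError("each vector must have height * width entries")

    def to_rows(vector):
        # Cut the flat vector into `height` consecutive rows of length `width`.
        return [vector[i * width:(i + 1) * width] for i in range(height)]

    # The first m/2 rows come from h, the remaining m/2 rows from r.
    return to_rows(h) + to_rows(r)


h = [1, 2, 3, 4, 5, 6]
r = [10, 20, 30, 40, 50, 60]
B = stack_as_image(h, r, height=2, width=3)
```

Here `B` has four rows of three entries: the first two rows hold the entries of `h`, the last two those of `r`.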

Since the interaction model can be decomposed into \(f(\mathbf{h}, \mathbf{r}, \mathbf{t}) = \left\langle f'(\mathbf{h}, \mathbf{r}), \mathbf{t} \right\rangle\), the model is particularly well suited to 1-N scoring, i.e., the efficient computation of scores for \((h,r,t)\) for fixed \(h,r\) and many different \(t\).
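The benefit of this decomposition can be sketched in plain Python: once \(f'(\mathbf{h}, \mathbf{r})\) has been computed, each additional tail costs only one inner product. The function `score_one_to_n` and the example vectors are hypothetical. The value of `e_hr` merely stands in for the output of the convolutional part, which is not reproduced here.

```python
def score_one_to_n(e_hr, tails):
    """Score one fixed (h, r) pair against many tails via inner products."""
    return [sum(x * y for x, y in zip(e_hr, t)) for t in tails]


# Assumed output of f'(h, r); in ConvE this is computed once per (h, r) pair.
e_hr = [1.0, 0.5, -2.0]

# Candidate tail representations (made-up values).
tails = [
    [1.0, 0.0, 0.0],
    [0.0, 2.0, 1.0],
    [1.0, 1.0, 1.0],
]

scores = score_one_to_n(e_hr, tails)
```

The expensive part, \(f'\), is evaluated once, while the list of scores grows linearly with the number of candidate tails.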

Batch normalization is applied by default. It normalizes the output of the activation functions to keep the weights of the neural network from becoming imbalanced and to speed up training. However, batch normalization is not the only way to achieve more robust and effective training [santurkar2018]. The flag apply_batch_normalization therefore allows batch normalization to be turned off.

Initialize the interaction module.

Parameters:
  • input_channels (int | None) – the number of input channels for the convolution operation. Can be inferred from other parameters, cf. _calculate_missing_shape_information().

  • output_channels (int) – the number of output channels for the convolution operation

  • embedding_height (int | None) – the height of the “image” after reshaping the concatenated head and relation embedding. Can be inferred from other parameters, cf. _calculate_missing_shape_information().

  • embedding_width (int | None) – the width of the “image” after reshaping the concatenated head and relation embedding. Can be inferred from other parameters, cf. _calculate_missing_shape_information().

  • kernel_width (int) – the width of the convolution kernel

  • kernel_height (int | None) – the height of the convolution kernel. Defaults to kernel_width

  • input_dropout (float) – the dropout applied before the convolution

  • feature_map_dropout (float) – the dropout applied after the convolution

  • output_dropout (float) – the dropout applied after the linear projection

  • embedding_dim (int) – the embedding dimension of entities and relations

  • apply_batch_normalization (bool) – whether to apply batch normalization
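How the shape parameters relate to each other can be illustrated with a small sketch. Both helpers below are hypothetical and are not the library's _calculate_missing_shape_information(). They assume that input_channels * embedding_height * embedding_width must equal embedding_dim, and that the convolution uses stride 1 and no padding on the stacked head and relation image.

```python
def infer_missing_shape(embedding_dim, input_channels=None, height=None, width=None):
    """Fill in at most one missing value so that channels * height * width == embedding_dim.

    Hypothetical helper; the constraint itself is an assumption of this sketch.
    """
    dims = {"input_channels": input_channels, "height": height, "width": width}
    missing = [name for name, value in dims.items() if value is None]
    if len(missing) > 1:
        raise ValueError("at most one of input_channels, height, width may be missing")

    known_product = 1
    for value in dims.values():
        if value is not None:
            known_product *= value

    if missing:
        if embedding_dim % known_product != 0:
            raise ValueError("embedding_dim is not divisible by the given dimensions")
        dims[missing[0]] = embedding_dim // known_product
    elif known_product != embedding_dim:
        raise ValueError("input_channels * height * width must equal embedding_dim")

    return dims["input_channels"], dims["height"], dims["width"]


def flattened_conv_features(output_channels, height, width, kernel_height, kernel_width):
    """Number of features after the convolution, assuming stride 1 and no padding.

    The stacked head/relation image has 2 * height rows and width columns.
    """
    return output_channels * (2 * height - kernel_height + 1) * (width - kernel_width + 1)


channels, height, width = infer_missing_shape(200, input_channels=1, width=20)
features = flattened_conv_features(32, height, width, kernel_height=3, kernel_width=3)
```

With embedding_dim=200, one input channel, and a width of 20, the sketch infers a height of 10. Under the stated assumptions, the default 3×3 kernel and 32 output channels then give 32 · 18 · 18 = 10368 flattened features.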

Attributes Summary

entity_shape

The symbolic shapes for entity representations

Methods Summary

forward(h, r, t)

Evaluate the interaction function.

Attributes Documentation

entity_shape: Sequence[str] = ('d', '')

The symbolic shapes for entity representations

Methods Documentation

forward(h: Tensor, r: Tensor, t: tuple[Tensor, Tensor]) → Tensor

Evaluate the interaction function.

See also

Interaction.forward for a detailed description about the generic batched form of the interaction function.

Parameters:
  • h (Tensor) – the head representations, shape: (*batch_dims, d)

  • r (Tensor) – the relation representations, shape: (*batch_dims, d)

  • t (tuple[Tensor, Tensor]) – the tail representations, comprising the tail entity embedding, shape: (*batch_dims, d), and the bias, shape: batch_dims

Returns:

The scores, shape: batch_dims

Return type:

Tensor