TuckER

class TuckER(*, embedding_dim=200, relation_dim=None, dropout_0=0.3, dropout_1=0.4, dropout_2=0.5, apply_batch_normalization=True, entity_initializer=<function xavier_normal_>, relation_initializer=<function xavier_normal_>, core_tensor_initializer=None, core_tensor_initializer_kwargs=None, **kwargs)[source]

Bases: ERModel

An implementation of TuckER from [balazevic2019].

TuckER is a linear model that is based on the tensor factorization method Tucker in which a three-mode tensor \(\mathfrak{X} \in \mathbb{R}^{I \times J \times K}\) is decomposed into a set of factor matrices \(\textbf{A} \in \mathbb{R}^{I \times P}\), \(\textbf{B} \in \mathbb{R}^{J \times Q}\), and \(\textbf{C} \in \mathbb{R}^{K \times R}\) and a core tensor \(\mathfrak{Z} \in \mathbb{R}^{P \times Q \times R}\) (of lower rank):

\[\mathfrak{X} \approx \mathfrak{Z} \times_1 \textbf{A} \times_2 \textbf{B} \times_3 \textbf{C}\]

where \(\times_n\) denotes the tensor product along mode \(n\). In TuckER, a knowledge graph is represented as a binary tensor that is factorized using the Tucker factorization, where \(\textbf{E} = \textbf{A} = \textbf{C} \in \mathbb{R}^{n_{e} \times d_e}\) denotes the entity embedding matrix, \(\textbf{R} = \textbf{B} \in \mathbb{R}^{n_{r} \times d_r}\) represents the relation embedding matrix, and \(\mathfrak{W} = \mathfrak{Z} \in \mathbb{R}^{d_e \times d_r \times d_e}\) is the core tensor that indicates the extent of interaction between the different factors. The interaction model is defined as:

\[f(h,r,t) = \mathfrak{W} \times_1 \textbf{h} \times_2 \textbf{r} \times_3 \textbf{t}\]

where \(\textbf{h},\textbf{t}\) correspond to rows of \(\textbf{E}\) and \(\textbf{r}\) to a row of \(\textbf{R}\).
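
As a concrete illustration, this interaction can be evaluated for a single triple by contracting the core tensor with the three embedding vectors. The sketch below uses torch.einsum with illustrative dimensions; it is not PyKEEN's internal implementation.

    # Minimal sketch of f(h, r, t) = W ×_1 h ×_2 r ×_3 t for one triple.
    import torch

    d_e, d_r = 200, 30                 # illustrative embedding dimensions
    W = torch.randn(d_e, d_r, d_e)     # core tensor
    h = torch.randn(d_e)               # head embedding (row of E)
    r = torch.randn(d_r)               # relation embedding (row of R)
    t = torch.randn(d_e)               # tail embedding (row of E)

    # contract each mode of W with the matching vector
    score = torch.einsum("ijk,i,j,k->", W, h, r, t)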

The dropout values correspond to the following dropouts in the model’s score function:

\[\text{Dropout}_2(BN(\text{Dropout}_0(BN(h)) \times_1 \text{Dropout}_1(W \times_2 r))) \times_3 t\]

where \(h\), \(r\), and \(t\) are the head, relation, and tail embeddings, \(W\) is the core tensor, \(\times_i\) denotes the tensor product along the \(i\)-th mode, \(BN\) denotes batch normalization, and \(\text{Dropout}\) denotes dropout.
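
The sketch below shows how this scoring pipeline can be assembled from standard torch.nn modules for a batch of triples. It follows the formula above, but the module structure and names are illustrative assumptions rather than PyKEEN's actual code.

    import torch
    from torch import nn

    class TuckERScoreSketch(nn.Module):
        # Illustrative scoring pipeline following the formula above.
        def __init__(self, d_e=200, d_r=30, p0=0.3, p1=0.4, p2=0.5):
            super().__init__()
            self.W = nn.Parameter(torch.randn(d_e, d_r, d_e) * 0.1)  # core tensor
            self.bn_h = nn.BatchNorm1d(d_e)  # BN applied to the head embedding
            self.bn_x = nn.BatchNorm1d(d_e)  # BN applied after the mode-1 product
            self.do0 = nn.Dropout(p0)
            self.do1 = nn.Dropout(p1)
            self.do2 = nn.Dropout(p2)

        def forward(self, h, r, t):
            # Dropout_0(BN(h))
            x = self.do0(self.bn_h(h))                                   # (B, d_e)
            # Dropout_1(W ×_2 r): contract the relation mode of the core tensor
            wr = self.do1(torch.einsum("idj,bd->bij", self.W, r))        # (B, d_e, d_e)
            # Dropout_2(BN(... ×_1 h))
            x = self.do2(self.bn_x(torch.einsum("bi,bij->bj", x, wr)))   # (B, d_e)
            # ... ×_3 t
            return (x * t).sum(dim=-1)                                   # (B,)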

Initialize the model.

Parameters:
  • embedding_dim (int) – the (entity) embedding dimension

  • relation_dim (Optional[int]) – the relation embedding dimension. Defaults to embedding_dim.

  • dropout_0 (float) – the first dropout, cf. formula

  • dropout_1 (float) – the second dropout, cf. formula

  • dropout_2 (float) – the third dropout, cf. formula

  • apply_batch_normalization (bool) – whether to apply batch normalization

  • entity_initializer (Union[str, Callable[[FloatTensor], FloatTensor], None]) – the entity representation initializer

  • relation_initializer (Union[str, Callable[[FloatTensor], FloatTensor], None]) – the relation representation initializer

  • core_tensor_initializer (Union[str, Callable[[FloatTensor], FloatTensor], None]) – the core tensor initializer

  • core_tensor_initializer_kwargs (Optional[Mapping[str, Any]]) – keyword-based parameters passed to the core tensor initializer

  • kwargs – additional keyword-based parameters passed to ERModel.__init__()
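
A possible usage sketch via PyKEEN's pipeline() helper is shown below; the dataset choice and hyper-parameter values are illustrative only.

    from pykeen.pipeline import pipeline

    result = pipeline(
        model="TuckER",
        dataset="WN18RR",          # illustrative dataset choice
        model_kwargs=dict(
            embedding_dim=200,     # entity embedding dimension
            relation_dim=30,       # relation embedding dimension
            dropout_0=0.3,
            dropout_1=0.4,
            dropout_2=0.5,
        ),
    )
    result.save_to_directory("tucker_wn18rr")  # persist trained model and metrics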

Attributes Summary

hpo_default

The default strategy for optimizing the model's hyper-parameters

loss_default_kwargs

The default parameters for the default loss function class

Attributes Documentation

hpo_default: ClassVar[Mapping[str, Any]] = {'dropout_0': {'high': 0.5, 'low': 0.0, 'q': 0.1, 'type': <class 'float'>}, 'dropout_1': {'high': 0.5, 'low': 0.0, 'q': 0.1, 'type': <class 'float'>}, 'dropout_2': {'high': 0.5, 'low': 0.0, 'q': 0.1, 'type': <class 'float'>}, 'embedding_dim': {'high': 256, 'low': 16, 'q': 16, 'type': <class 'int'>}, 'relation_dim': {'high': 256, 'low': 16, 'q': 16, 'type': <class 'int'>}}

The default strategy for optimizing the model’s hyper-parameters
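
This default search space can be used directly with PyKEEN's HPO helper; the sketch below is illustrative, with the dataset and trial budget chosen arbitrarily.

    from pykeen.hpo import hpo_pipeline

    # Searches over the hyper-parameters listed in hpo_default
    # (dropouts and embedding dimensions) unless overridden explicitly.
    hpo_result = hpo_pipeline(
        model="TuckER",
        dataset="Nations",   # illustrative small dataset
        n_trials=10,         # illustrative trial budget
    )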

loss_default_kwargs: ClassVar[Mapping[str, Any]] = {}

The default parameters for the default loss function class