NTN

class NTN(triples_factory, embedding_dim=100, num_slices=4, loss=None, preferred_device=None, random_seed=None, non_linearity=None, regularizer=None)

Bases: pykeen.models.base.EntityEmbeddingModel

An implementation of NTN from [socher2013].

NTN uses a bilinear tensor layer instead of a standard linear neural network layer:

\[f(h,r,t) = \textbf{u}_{r}^{T} \cdot \tanh(\textbf{h} \mathfrak{W}_{r} \textbf{t} + \textbf{V}_r [\textbf{h};\textbf{t}] + \textbf{b}_r)\]

where \(\mathfrak{W}_r \in \mathbb{R}^{d \times d \times k}\) is the relation-specific tensor, and the weight matrix \(\textbf{V}_r \in \mathbb{R}^{k \times 2d}\), the bias vector \(\textbf{b}_r \in \mathbb{R}^k\), and the weight vector \(\textbf{u}_r \in \mathbb{R}^k\) are the standard parameters of a neural network layer, which are also relation-specific. The tensor product \(\textbf{h} \mathfrak{W}_{r} \textbf{t}\) yields a vector \(\textbf{x} \in \mathbb{R}^k\) whose entry \(x_i\) is computed from slice \(i\) of the tensor \(\mathfrak{W}_{r}\): \(x_i = \textbf{h}\mathfrak{W}_{r}^{i} \textbf{t}\). As the interaction model shows, NTN defines a separate neural network for each relation, which makes the model very expressive but also computationally expensive.
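The interaction function above can be sketched for a single triple in NumPy. This is an illustrative sketch, not PyKEEN's implementation: the function name `ntn_score` and the random parameters are hypothetical, and the tensor is stored with the slice index first, i.e. with shape (k, d, d) rather than (d, d, k).

```python
import numpy as np

def ntn_score(h, t, W, V, b, u):
    """Score one (h, r, t) triple given the parameters of relation r.

    h, t: (d,) entity embeddings
    W:    (k, d, d) relation-specific tensor, one d x d slice per hidden unit
    V:    (k, 2d) weight matrix; b: (k,) bias; u: (k,) output weights
    """
    # Bilinear term: x_i = h W^i t for each slice i, giving a vector in R^k.
    bilinear = np.einsum("a,kab,b->k", h, W, t)
    # Standard linear term applied to the concatenation [h; t].
    linear = V @ np.concatenate([h, t])
    return float(u @ np.tanh(bilinear + linear + b))

# Hypothetical toy parameters with d = 5 and k = 4.
rng = np.random.default_rng(0)
d, k = 5, 4
h, t = rng.normal(size=d), rng.normal(size=d)
W = rng.normal(size=(k, d, d))
V = rng.normal(size=(k, 2 * d))
b = rng.normal(size=k)
u = rng.normal(size=k)
score = ntn_score(h, t, W, V, b, u)
```

Because each relation carries its own \(\mathfrak{W}_r\), \(\textbf{V}_r\), \(\textbf{b}_r\), and \(\textbf{u}_r\), the parameter count per relation is \(k d^2 + 2kd + 2k\), which is where the computational cost mentioned above comes from.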

See also

Original Implementation (Matlab): https://github.com/khurram18/NeuralTensorNetworks
TensorFlow: https://github.com/dddoss/tensorflow-socher-ntn
Keras: https://github.com/dapurv5/keras-neural-tensor-layer

Initialize NTN.

Parameters

embedding_dim (int) – The entity embedding dimension \(d\), usually \(d \in [50, 350]\).
num_slices (int) – The number of slices \(k\) of the relation-specific tensor \(\mathfrak{W}_r\).
non_linearity (Optional[Module]) – A non-linear activation function. Defaults to the hyperbolic tangent, torch.nn.Tanh.

Attributes Summary

`hpo_default`	The default strategy for optimizing the model’s hyper-parameters.

Methods Summary

`score_h`(rt_batch[, slice_size])	Forward pass using left side (head) prediction.
`score_hrt`(hrt_batch)	Forward pass.
`score_t`(hr_batch[, slice_size])	Forward pass using right side (tail) prediction.

Attributes Documentation

hpo_default: ClassVar[Mapping[str, Any]] = {'embedding_dim': {'high': 256, 'low': 16, 'q': 16, 'type': <class 'int'>}, 'num_slices': {'high': 4, 'low': 2, 'type': <class 'int'>}}

The default strategy for optimizing the model’s hyper-parameters.
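To make the search space concrete, the following sketch enumerates the candidate values implied by these ranges. It assumes that `q` is a quantization step and that `low` and `high` are both inclusive; the helper `candidate_values` is hypothetical and is not part of PyKEEN.

```python
# A plain-Python copy of the ranges listed in hpo_default.
hpo_default = {
    "embedding_dim": {"type": int, "low": 16, "high": 256, "q": 16},
    "num_slices": {"type": int, "low": 2, "high": 4},
}

def candidate_values(spec):
    """List the integers in [low, high], stepping by q (assumed default: 1)."""
    step = spec.get("q", 1)
    return list(range(spec["low"], spec["high"] + 1, step))

embedding_dims = candidate_values(hpo_default["embedding_dim"])  # 16, 32, ..., 256
num_slices = candidate_values(hpo_default["num_slices"])         # 2, 3, 4
```

Under these assumptions the default grid contains 16 embedding dimensions and 3 slice counts.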

Methods Documentation

score_h(rt_batch, slice_size=None)

Forward pass using left side (head) prediction.

This method calculates the score for all possible heads for each (relation, tail) pair.

Parameters: rt_batch (LongTensor) – shape: (batch_size, 2), dtype: long. The indices of (relation, tail) pairs.
Return type: FloatTensor
Returns: shape: (batch_size, num_entities), dtype: float. For each r-t pair, the scores for all possible heads.

score_hrt(hrt_batch)

Forward pass.

This method takes head, relation and tail of each triple and calculates the corresponding score.

Parameters: hrt_batch (LongTensor) – shape: (batch_size, 3), dtype: long. The indices of (head, relation, tail) triples.
Raises: NotImplementedError – If the method was not implemented for this class.
Return type: FloatTensor
Returns: shape: (batch_size, 1), dtype: float. The score for each triple.

score_t(hr_batch, slice_size=None)

Forward pass using right side (tail) prediction.

This method calculates the score for all possible tails for each (head, relation) pair.

Parameters: hr_batch (LongTensor) – shape: (batch_size, 2), dtype: long. The indices of (head, relation) pairs.
Return type: FloatTensor
Returns: shape: (batch_size, num_entities), dtype: float. For each h-r pair, the scores for all possible tails.
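The shape contract of tail prediction, from (batch_size, 2) index pairs to (batch_size, num_entities) scores, can be illustrated with a NumPy sketch. It is not PyKEEN's code: the function `score_all_tails`, the parameter layout (relation index first, then slice index), and the toy sizes are assumptions made for illustration, and the `slice_size` argument is not modelled.

```python
import numpy as np

def score_all_tails(hr_batch, E, W, V, b, u):
    """Score every entity as a tail for each (head, relation) pair.

    hr_batch: (batch_size, 2) integer array of (head, relation) indices
    E: (num_entities, d) entity embeddings
    W: (num_relations, k, d, d); V: (num_relations, k, 2d)
    b, u: (num_relations, k)
    """
    d = E.shape[1]
    h = E[hr_batch[:, 0]]                      # (B, d)
    r = hr_batch[:, 1]
    Wr, Vr, br, ur = W[r], V[r], b[r], u[r]    # gather per-relation parameters
    # Bilinear term h W^i t for every candidate tail n: (B, N, k).
    bilinear = np.einsum("ba,bkac,nc->bnk", h, Wr, E)
    # V_r [h; t] splits into a head part and a tail part.
    head_part = np.einsum("bka,ba->bk", Vr[:, :, :d], h)    # (B, k)
    tail_part = np.einsum("bkc,nc->bnk", Vr[:, :, d:], E)   # (B, N, k)
    hidden = np.tanh(bilinear + head_part[:, None, :] + tail_part + br[:, None, :])
    return np.einsum("bnk,bk->bn", hidden, ur)               # (B, N)

# Hypothetical toy setup.
rng = np.random.default_rng(0)
num_entities, num_relations, d, k = 7, 3, 5, 4
E = rng.normal(size=(num_entities, d))
W = rng.normal(size=(num_relations, k, d, d))
V = rng.normal(size=(num_relations, k, 2 * d))
b = rng.normal(size=(num_relations, k))
u = rng.normal(size=(num_relations, k))
hr_batch = np.array([[0, 1], [3, 2]])
scores = score_all_tails(hr_batch, E, W, V, b, u)  # shape (2, 7)
```

Head prediction is symmetric: the candidate entities take the head position and the batch supplies the (relation, tail) pairs.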