NTN¶
- class NTN(triples_factory, embedding_dim=100, automatic_memory_optimization=None, num_slices=4, loss=None, preferred_device=None, random_seed=None, non_linearity=None, regularizer=None)[source]¶
Bases:
pykeen.models.base.EntityEmbeddingModel
An implementation of NTN from [socher2013].
NTN uses a bilinear tensor layer instead of a standard linear neural network layer:
\[f(h,r,t) = \textbf{u}_{r}^{T} \cdot \tanh(\textbf{h} \mathfrak{W}_{r} \textbf{t} + \textbf{V}_r [\textbf{h};\textbf{t}] + \textbf{b}_r)\]
where \(\mathfrak{W}_r \in \mathbb{R}^{d \times d \times k}\) is the relation-specific tensor, and the weight matrix \(\textbf{V}_r \in \mathbb{R}^{k \times 2d}\), the bias vector \(\textbf{b}_r \in \mathbb{R}^k\), and the weight vector \(\textbf{u}_r \in \mathbb{R}^k\) are the standard parameters of a neural network, which are also relation-specific. The tensor product \(\textbf{h} \mathfrak{W}_{r} \textbf{t}\) yields a vector \(\textbf{x} \in \mathbb{R}^k\) in which each entry \(x_i\) is computed from slice \(i\) of the tensor: \(x_i = \textbf{h} \mathfrak{W}_{r}^{i} \textbf{t}\). As the interaction model indicates, NTN defines a separate neural network for each relation, which makes the model very expressive but at the same time computationally expensive.
See also
Original Implementation (Matlab): https://github.com/khurram18/NeuralTensorNetworks
TensorFlow: https://github.com/dddoss/tensorflow-socher-ntn
Keras: https://github.com/dapurv5/keras-neural-tensor-layer
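As a concrete illustration, the interaction function above can be sketched in NumPy. This is a hypothetical, simplified re-implementation for a single triple, not PyKEEN's actual code; the variable names follow the equation:

```python
import numpy as np

def ntn_interaction(h, t, W, V, b, u):
    """Sketch of the NTN score f(h, r, t) for one triple.

    h, t : (d,)       entity embeddings
    W    : (k, d, d)  relation-specific tensor, one d x d slice per output entry
    V    : (k, 2d)    relation-specific weight matrix
    b    : (k,)       relation-specific bias vector
    u    : (k,)       relation-specific weight vector
    """
    # Bilinear tensor product: x_i = h @ W[i] @ t for each slice i.
    x = np.einsum('i,kij,j->k', h, W, t)
    # Hidden layer: tanh of the tensor product plus the standard linear layer.
    hidden = np.tanh(x + V @ np.concatenate([h, t]) + b)
    # Final score: weighted sum over the k slices.
    return float(u @ hidden)
```

Swapping `np.tanh` for another activation corresponds to the `non_linearity` argument of the constructor.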
Initialize NTN.
- Parameters
  - embedding_dim (int) – The entity embedding dimension \(d\). Is usually \(d \in [50, 350]\).
  - num_slices (int) – The number of slices \(k\) of the relation-specific tensor \(\mathfrak{W}_r\).
  - non_linearity (Optional[Module]) – A non-linear activation function. Defaults to the hyperbolic tangent torch.nn.Tanh.
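Because every relation carries its own \(\mathfrak{W}_r\), \(\textbf{V}_r\), \(\textbf{b}_r\), and \(\textbf{u}_r\), the parameter budget grows quickly with embedding_dim and num_slices. A back-of-the-envelope helper (hypothetical, not part of PyKEEN) makes the cost visible:

```python
def ntn_params_per_relation(d, k):
    """Trainable parameters NTN learns per relation:
    W (k*d*d), V (k*2*d), b (k), and u (k)."""
    return k * d * d + k * 2 * d + k + k

# With the defaults embedding_dim=100 and num_slices=4,
# each relation adds 40,808 trainable parameters.
```

This is why the docstring above calls the model expressive but computationally expensive: the per-relation cost is dominated by the \(k \cdot d^2\) entries of the tensor.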
Attributes Summary
The default strategy for optimizing the model’s hyper-parameters
Methods Summary
- score_h(rt_batch[, slice_size]) – Forward pass using left side (head) prediction.
- score_hrt(hrt_batch) – Forward pass.
- score_t(hr_batch[, slice_size]) – Forward pass using right side (tail) prediction.
Attributes Documentation
- hpo_default: ClassVar[Mapping[str, Any]] = {'embedding_dim': {'high': 350, 'low': 50, 'q': 25, 'type': <class 'int'>}, 'num_slices': {'high': 4, 'low': 2, 'type': <class 'int'>}}¶
The default strategy for optimizing the model’s hyper-parameters
Methods Documentation
- score_h(rt_batch, slice_size=None)[source]¶
Forward pass using left side (head) prediction.
This method calculates the score for all possible heads for each (relation, tail) pair.
- Parameters
  rt_batch (LongTensor) – shape: (batch_size, 2), dtype: long. The indices of (relation, tail) pairs.
- Return type
  FloatTensor
- Returns
  shape: (batch_size, num_entities), dtype: float. For each r-t pair, the scores for all possible heads.
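To make the shape contract concrete, a toy NumPy analogue of head prediction (a hypothetical stand-in, not the PyKEEN implementation) scores every entity as a candidate head for one (relation, tail) pair:

```python
import numpy as np

def score_all_heads(E, W_r, V_r, b_r, u_r, t_idx):
    """Score every entity in E as a head against tail t_idx for one relation.

    E: (num_entities, d) entity embeddings; W_r (k, d, d), V_r (k, 2d),
    b_r (k,), u_r (k,) are the parameters of a single relation.
    Returns one score per entity, shape (num_entities,).
    """
    t = E[t_idx]
    # Bilinear tensor product for all candidate heads at once: (n, k).
    x = np.einsum('ni,kij,j->nk', E, W_r, t)
    # Concatenate each candidate head with the fixed tail: (n, 2d).
    ht = np.concatenate([E, np.broadcast_to(t, E.shape)], axis=1)
    hidden = np.tanh(x + ht @ V_r.T + b_r)   # (n, k)
    return hidden @ u_r                       # (n,)
```

PyKEEN's score_h does the analogous computation for a whole batch of (relation, tail) pairs, yielding the (batch_size, num_entities) result documented above.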
- score_hrt(hrt_batch)[source]¶
Forward pass.
This method takes head, relation and tail of each triple and calculates the corresponding score.
- Parameters
  hrt_batch (LongTensor) – shape: (batch_size, 3), dtype: long. The indices of (head, relation, tail) triples.
- Raises
  NotImplementedError – If the method was not implemented for this class.
- Return type
  FloatTensor
- Returns
  shape: (batch_size, 1), dtype: float. The score for each triple.
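The index-in, score-out contract of score_hrt can be illustrated with a batched NumPy stand-in (again hypothetical, not PyKEEN's internals), mapping a (batch_size, 3) array of indices to (batch_size, 1) scores:

```python
import numpy as np

def score_hrt_toy(hrt_batch, E, W, V, b, u):
    """Toy batched NTN scoring: index triples (batch_size, 3) -> (batch_size, 1).

    E: (num_entities, d); W: (num_relations, k, d, d);
    V: (num_relations, k, 2*d); b, u: (num_relations, k).
    """
    scores = []
    for h_i, r_i, t_i in hrt_batch:
        h, t = E[h_i], E[t_i]
        # Look up the relation-specific parameters by index r_i.
        x = np.einsum('i,kij,j->k', h, W[r_i], t)
        hidden = np.tanh(x + V[r_i] @ np.concatenate([h, t]) + b[r_i])
        scores.append([float(u[r_i] @ hidden)])
    return np.array(scores)
```

The real method vectorizes this lookup-and-score over the whole batch on the GPU rather than looping in Python.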
- score_t(hr_batch, slice_size=None)[source]¶
Forward pass using right side (tail) prediction.
This method calculates the score for all possible tails for each (head, relation) pair.
- Parameters
  hr_batch (LongTensor) – shape: (batch_size, 2), dtype: long. The indices of (head, relation) pairs.
- Return type
  FloatTensor
- Returns
  shape: (batch_size, num_entities), dtype: float. For each h-r pair, the scores for all possible tails.