GatedCombination

class GatedCombination(entity_dim=32, literal_dim=None, input_dropout=0.0, gate_activation=torch.nn.Sigmoid, gate_activation_kwargs=None, hidden_activation=torch.nn.Tanh, hidden_activation_kwargs=None)[source]

Bases: Combination

A module that implements a gated linear transformation for combining entity and literal representations.

Compared to the other Combinations, this combination makes use of a gating mechanism commonly found in RNNs. The main goal of this gating mechanism is to learn which parts of the additional literal information are useful, and to act accordingly by either incorporating them into the new combined embedding or discarding them.

For given entity representation \(\mathbf{x}_e \in \mathbb{R}^{d_e}\) and literal representation \(\mathbf{x}_l \in \mathbb{R}^{d_l}\), the module calculates

\[\begin{aligned}
z &= f_{gate}(\mathbf{W}_e \mathbf{x}_e + \mathbf{W}_l \mathbf{x}_l + \mathbf{b}) \\
h &= f_{hidden}(\mathbf{W} [\mathbf{x}_e; \mathbf{x}_l]) \\
y &= \textit{Dropout}(z \odot h + (1 - z) \odot \mathbf{x}_e)
\end{aligned}\]

where \(\mathbf{W}_e \in \mathbb{R}^{d_e \times d_e}\), \(\mathbf{W}_l \in \mathbb{R}^{d_l \times d_e}\), \(\mathbf{W} \in \mathbb{R}^{(d_e + d_l) \times d_e}\), and \(\mathbf{b} \in \mathbb{R}^{d_e}\) are trainable parameters, \(f_{gate}\) and \(f_{hidden}\) are activation functions (defaulting to sigmoid and tanh, respectively), \(\odot\) denotes element-wise multiplication, and \([\mathbf{x}_e; \mathbf{x}_l]\) denotes the concatenation operation.
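The computation above can be sketched in NumPy. This is a minimal illustration of the documented math only, not the library's torch implementation; the parameter values are random stand-ins for what would be trainable weights, and dropout is omitted as in inference mode:

```python
import numpy as np

rng = np.random.default_rng(0)
d_e, d_l = 4, 3  # entity and literal dimensions (arbitrary for illustration)

# hypothetical parameter values (random here; trainable in the real module)
W_e = rng.normal(size=(d_e, d_e))
W_l = rng.normal(size=(d_l, d_e))
W = rng.normal(size=(d_e + d_l, d_e))
b = rng.normal(size=d_e)

x_e = rng.normal(size=d_e)  # entity representation
x_l = rng.normal(size=d_l)  # literal representation

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

z = sigmoid(x_e @ W_e + x_l @ W_l + b)       # gate, element-wise in (0, 1)
h = np.tanh(np.concatenate([x_e, x_l]) @ W)  # candidate combined embedding
y = z * h + (1.0 - z) * x_e                  # dropout omitted (inference)

print(y.shape)  # -> (4,)
```

Where the gate \(z\) is close to 1, the output takes the literal-informed candidate \(h\); where it is close to 0, the original entity embedding \(\mathbf{x}_e\) passes through unchanged.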

Note

We can alternatively express the gate

\[z = f_{gate}(\mathbf{W}_e x_e + \mathbf{W}_l x_l + \mathbf{b})\]

as

\[z = f_{gate}(\mathbf{W}_{el} [x_e; x_l] + \mathbf{b})\]

with \(\mathbf{W}_{el} \in \mathbb{R}^{(d_e + d_l) \times d_e}\).
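The equivalence of the two gate formulations follows from block matrix multiplication and can be checked numerically. A short NumPy sketch with random stand-in weights:

```python
import numpy as np

rng = np.random.default_rng(1)
d_e, d_l = 4, 3
W_e = rng.normal(size=(d_e, d_e))
W_l = rng.normal(size=(d_l, d_e))
b = rng.normal(size=d_e)
x_e = rng.normal(size=d_e)
x_l = rng.normal(size=d_l)

# stack the two weight matrices into one acting on the concatenated input
W_el = np.vstack([W_e, W_l])  # shape: (d_e + d_l, d_e)

pre_sum = x_e @ W_e + x_l @ W_l + b            # separate-weights form
pre_cat = np.concatenate([x_e, x_l]) @ W_el + b  # concatenated form

print(np.allclose(pre_sum, pre_cat))  # -> True
```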

Implementation based on the Gate class from https://github.com/SmartDataAnalytics/LiteralE/blob/master/model.py.

Instantiate the module.

Parameters
  • entity_dim (int) – the dimension of the entity representations

  • literal_dim (Optional[int]) – the dimension of the literals; defaults to entity_dim

  • input_dropout (float) – the dropout to use

  • gate_activation (Union[str, Module, Type[Module], None]) – the activation to use on the gate, or a hint thereof

  • gate_activation_kwargs (Optional[Mapping[str, Any]]) – keyword arguments used to instantiate the gate activation if a class or name is given instead of a pre-instantiated activation module

  • hidden_activation (Union[str, Module, Type[Module], None]) – the activation to use in the hidden layer, or a hint thereof

  • hidden_activation_kwargs (Optional[Mapping[str, Any]]) – keyword arguments used to instantiate the hidden activation if a class or name is given instead of a pre-instantiated activation module

Methods Summary

forward(xs)

Combine a sequence of individual representations.

Methods Documentation

forward(xs)[source]

Combine a sequence of individual representations.

Parameters

xs (Sequence[FloatTensor]) – shape: (*batch_dims, *input_dims_i) the individual representations

Return type

FloatTensor

Returns

shape: (*batch_dims, *output_dims) a combined representation
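The batched semantics of forward(xs) can be sketched in NumPy. The combine function below is a hypothetical stand-in for the module's forward method (the real one is a torch nn.Module); it takes a two-element sequence [entity batch, literal batch] and broadcasts the gating math over the leading batch dimension:

```python
import numpy as np

rng = np.random.default_rng(2)
batch, d_e, d_l = 5, 4, 3

# hypothetical parameter values (random here; trainable in the real module)
W_e = rng.normal(size=(d_e, d_e))
W_l = rng.normal(size=(d_l, d_e))
W = rng.normal(size=(d_e + d_l, d_e))
b = rng.normal(size=d_e)

def combine(xs):
    """Stand-in for forward(xs), where xs = [entity batch, literal batch]."""
    x_e, x_l = xs
    z = 1.0 / (1.0 + np.exp(-(x_e @ W_e + x_l @ W_l + b)))   # gate
    h = np.tanh(np.concatenate([x_e, x_l], axis=-1) @ W)     # candidate
    return z * h + (1.0 - z) * x_e                           # dropout omitted

y = combine([rng.normal(size=(batch, d_e)), rng.normal(size=(batch, d_l))])
print(y.shape)  # -> (5, 4)
```

The output keeps the batch dimensions of the inputs while the trailing dimension is the entity dimension, matching the documented (*batch_dims, *output_dims) shape.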