GatedCombination

class GatedCombination(entity_dim=32, literal_dim=None, input_dropout=0.0, gate_activation=<class 'torch.nn.modules.activation.Sigmoid'>, gate_activation_kwargs=None, hidden_activation=<class 'torch.nn.modules.activation.Tanh'>, hidden_activation_kwargs=None)[source]

Bases: Combination

A module that implements a gated linear transformation for the combination of entities and literals.

Compared to the other Combinations, this combination makes use of a gating mechanism commonly found in RNNs. The main goal of this gating mechanism is to learn which parts of the additional literal information are useful and to act accordingly, either incorporating them into the new combined embedding or discarding them.

For given entity representation \(\mathbf{x}_e \in \mathbb{R}^{d_e}\) and literal representation \(\mathbf{x}_l \in \mathbb{R}^{d_l}\), the module calculates

\[\begin{aligned}
z &= f_{gate}(\mathbf{W}_e x_e + \mathbf{W}_l x_l + \mathbf{b}) \\
h &= f_{hidden}(\mathbf{W} [x_e; x_l]) \\
y &= Dropout(z \odot h + (1 - z) \odot x_e)
\end{aligned}\]

where \(\mathbf{W}_e \in \mathbb{R}^{d_e \times d_e}\), \(\mathbf{W}_l \in \mathbb{R}^{d_l \times d_e}\), \(\mathbf{W} \in \mathbb{R}^{(d_e + d_l) \times d_e}\), and \(\mathbf{b} \in \mathbb{R}^{d_e}\) are trainable parameters, \(f_{gate}\) and \(f_{hidden}\) are activation functions, defaulting to sigmoid and tanh, respectively, \(\odot\) denotes element-wise multiplication, and \([x_e; x_l]\) the concatenation operation.
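The computation above can be sketched in plain PyTorch. This is only an illustration of the formula, not PyKEEN's own code: the class name GatedCombinationSketch and its attribute names are hypothetical, the activations are fixed to the documented defaults, and the residual term is assumed to be the entity representation \(x_e\).

```python
from typing import Optional

import torch
from torch import nn


class GatedCombinationSketch(nn.Module):
    """Illustrative re-statement of the formula above; not PyKEEN's own code."""

    def __init__(self, entity_dim: int = 32, literal_dim: Optional[int] = None, input_dropout: float = 0.0):
        super().__init__()
        literal_dim = entity_dim if literal_dim is None else literal_dim
        # W_e and W_l of the gate; the shared bias b is a separate parameter.
        self.gate_entity = nn.Linear(entity_dim, entity_dim, bias=False)
        self.gate_literal = nn.Linear(literal_dim, entity_dim, bias=False)
        self.bias = nn.Parameter(torch.zeros(entity_dim))
        # W of the hidden layer, applied to the concatenation [x_e; x_l].
        self.hidden = nn.Linear(entity_dim + literal_dim, entity_dim, bias=False)
        self.gate_activation = nn.Sigmoid()
        self.hidden_activation = nn.Tanh()
        self.dropout = nn.Dropout(input_dropout)

    def forward(self, x_e: torch.Tensor, x_l: torch.Tensor) -> torch.Tensor:
        z = self.gate_activation(self.gate_entity(x_e) + self.gate_literal(x_l) + self.bias)
        h = self.hidden_activation(self.hidden(torch.cat([x_e, x_l], dim=-1)))
        # Element-wise mix of the hidden vector and the entity vector.
        return self.dropout(z * h + (1 - z) * x_e)


# Hypothetical sizes: a batch of 4 entities with 32-dim embeddings and 8-dim literals.
sketch = GatedCombinationSketch(entity_dim=32, literal_dim=8)
x_e = torch.rand(4, 32)
x_l = torch.rand(4, 8)
y = sketch(x_e, x_l)
print(y.shape)
```

Where the gate \(z\) is close to 1, the output follows the hidden vector \(h\), which mixes entity and literal information; where it is close to 0, the entity representation passes through largely unchanged.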

Note

We can alternatively express the gate

\[z = f_{gate}(\mathbf{W}_e x_e + \mathbf{W}_l x_l + \mathbf{b})\]

as

\[z = f_{gate}(\mathbf{W}_{el} [x_e; x_l] + \mathbf{b})\]

with \(\mathbf{W}_{el} \in \mathbb{R}^{(d_e + d_l) \times d_e}\).
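The equivalence in this note can be checked numerically. The sketch below assumes the (input dimension × output dimension) shape convention used in the text, so a batch of row vectors is transformed as x @ W; all sizes and variable names are chosen for illustration only.

```python
import torch

torch.manual_seed(0)
d_e, d_l, batch = 5, 3, 4

# Separate gate weights and the shared bias.
W_e = torch.randn(d_e, d_e)
W_l = torch.randn(d_l, d_e)
b = torch.randn(d_e)

x_e = torch.randn(batch, d_e)
x_l = torch.randn(batch, d_l)

# Gate pre-activation with two separate projections.
separate = x_e @ W_e + x_l @ W_l + b

# Stacking W_e on top of W_l gives W_el with shape (d_e + d_l, d_e),
# which is applied to the concatenation [x_e; x_l].
W_el = torch.cat([W_e, W_l], dim=0)
joint = torch.cat([x_e, x_l], dim=-1) @ W_el + b

same = torch.allclose(separate, joint, atol=1e-5)
print(same)
```

Both forms have the same number of parameters; the stacked form simply replaces two matrix products with one.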

The implementation is based on the Gate class in https://github.com/SmartDataAnalytics/LiteralE/blob/master/model.py.

Instantiate the module.

Parameters:

entity_dim (int) – the dimension of the entity representations.
literal_dim (Optional[int]) – the dimension of the literals; defaults to entity_dim
input_dropout (float) – the dropout to use
gate_activation (Union[str, Module, Type[Module], None]) – the activation to use on the gate, or a hint thereof
gate_activation_kwargs (Optional[Mapping[str, Any]]) – the keyword arguments to be used to instantiate the gate_activation if a class or name is given instead of a pre-instantiated activation module
hidden_activation (Union[str, Module, Type[Module], None]) – the activation to use in the hidden layer, or a hint thereof
hidden_activation_kwargs (Optional[Mapping[str, Any]]) – the keyword arguments to be used to instantiate the hidden activation if a class or name is given instead of a pre-instantiated activation module

Methods Summary

forward(xs)

Combine a sequence of individual representations.

Methods Documentation

forward(xs)[source]

Combine a sequence of individual representations.

Parameters: xs (Sequence[FloatTensor]) – shape: (*batch_dims, *input_dims_i) the individual representations
Return type: FloatTensor
Returns: shape: (*batch_dims, *output_dims) a combined representation