# DoubleMarginLoss

class DoubleMarginLoss(*, positive_margin=1.0, negative_margin=0.0, offset=None, positive_negative_balance=0.5, margin_activation='relu', reduction='mean')

A limit-based scoring loss, with separate margins for positive and negative elements from [sun2018].

Despite its similarity to the margin-based loss, this loss differs from it in that it applies absolute margins to the positive and negative scores separately, rather than comparing their difference. Hence, it has a natural decision boundary (somewhere between the positive and negative margin), while still yielding sparse losses with no gradients for sufficiently correct examples.

$L(k, \bar{k}) = g(\bar{\lambda} + \bar{k}) + h(\lambda - k)$

Here, $$k$$ denotes the positive scores, $$\bar{k}$$ the negative scores, $$\lambda$$ the positive margin, and $$\bar{\lambda}$$ the negative margin; $$g$$ and $$h$$ are activation functions, such as ReLU or softplus.
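As a rough illustration of the formula above, the following pure-Python sketch evaluates the loss for a single positive and a single negative score. The names relu, softplus and double_margin_loss_sketch are hypothetical and not part of the library. The sketch assumes that the same activation is applied to both terms, as the single margin_activation parameter suggests, and it ignores positive_negative_balance and the reduction.

```python
import math


def relu(x: float) -> float:
    """Hard margin: zero once the argument is non-positive."""
    return max(0.0, x)


def softplus(x: float) -> float:
    """Smooth variant of ReLU: log(1 + exp(x)), in a numerically stable form."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def double_margin_loss_sketch(
    positive_score: float,
    negative_score: float,
    positive_margin: float = 1.0,
    negative_margin: float = 0.0,
    activation=relu,
) -> float:
    """Evaluate activation(neg_margin + neg_score) + activation(pos_margin - pos_score).

    Assumption: both terms share one activation; no balancing or reduction is applied.
    """
    negative_term = activation(negative_margin + negative_score)
    positive_term = activation(positive_margin - positive_score)
    return negative_term + positive_term


# Sufficiently correct pair: the positive score exceeds the positive margin and the
# negative score lies below minus the negative margin, so both ReLU terms vanish.
print(double_margin_loss_sketch(positive_score=2.0, negative_score=-1.0))  # 0.0

# Violating pair: both terms contribute (0.25 + 0.5).
print(double_margin_loss_sketch(positive_score=0.5, negative_score=0.25))  # 0.75
```

With the ReLU activation, the first call shows the sparsity mentioned above: once both margins are satisfied, the loss is exactly zero. Passing activation=softplus instead gives a small but non-zero value for the same scores.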

Initialize the double margin loss.

Note

There are multiple ways to specify the pair of margins. Full documentation is provided in DoubleMarginLoss.resolve_margin().

Parameters
Raises

ValueError – If positive_negative_balance is outside its valid range.

Attributes Summary

 hpo_default The default strategy for optimizing the loss's hyper-parameters

Methods Summary

• forward(predictions, labels) – Compute the double margin loss.

• process_lcwa_scores(predictions, labels[, ...]) – Process scores from LCWA training loop.

• process_slcwa_scores(positive_scores, ...[, ...]) – Process scores from sLCWA training loop.

• resolve_margin(positive_margin, ...) – Resolve the margins from the different ways of specifying them.

Attributes Documentation

hpo_default: ClassVar[Mapping[str, Any]] = {'margin_activation': {'choices': {'relu', 'softplus'}, 'type': 'categorical'}, 'margin_positive': {'high': 1, 'low': -1, 'type': <class 'float'>}, 'offset': {'high': 1, 'low': 0, 'type': <class 'float'>}, 'positive_negative_balance': {'high': 0.999, 'low': 0.001, 'type': <class 'float'>}}

The default strategy for optimizing the loss’s hyper-parameters

Methods Documentation

forward(predictions, labels)

Compute the double margin loss.

The predictions and labels must have broadcastable shapes.

Parameters
• predictions (FloatTensor) – The predicted scores.

• labels (FloatTensor) – The labels.

Return type

FloatTensor

Returns

A scalar loss term.

process_lcwa_scores(predictions, labels, label_smoothing=None, num_entities=None)

Process scores from LCWA training loop.

Parameters
Return type

FloatTensor

Returns

A scalar loss value.

process_slcwa_scores(positive_scores, negative_scores, label_smoothing=None, batch_filter=None, num_entities=None)

Process scores from sLCWA training loop.

Parameters
• positive_scores (FloatTensor) – shape: (batch_size, 1) The scores for positive triples.

• negative_scores (FloatTensor) – shape: (batch_size, num_neg_per_pos) or (num_unfiltered_negatives,) The scores for the negative triples, either in dense 2D shape, or in case they are already filtered, in sparse shape. If they are given in sparse shape, batch_filter needs to be provided, too.

• label_smoothing (Optional[float]) – An optional label smoothing parameter.

• batch_filter (Optional[BoolTensor]) – shape: (batch_size, num_neg_per_pos) An optional filter of negative scores which were kept. Given if and only if negative_scores have been pre-filtered.

• num_entities (Optional[int]) – The number of entities. Only required if label smoothing is enabled.

Return type

FloatTensor

Returns

A scalar loss term.
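The relationship between the dense and the sparse layout of negative_scores can be illustrated with plain Python lists. This is only a sketch of the shapes described above. It assumes that the sparse form lists the kept scores row by row, and the helper name to_sparse is hypothetical and not part of the library.

```python
from typing import List


def to_sparse(dense_scores: List[List[float]], batch_filter: List[List[bool]]) -> List[float]:
    """Keep only the scores whose filter entry is True, flattened row by row."""
    return [
        score
        for score_row, keep_row in zip(dense_scores, batch_filter)
        for score, keep in zip(score_row, keep_row)
        if keep
    ]


# Dense layout: batch_size = 2, num_neg_per_pos = 3.
dense_scores = [[0.1, -0.4, 0.3], [0.7, 0.2, -0.9]]
# True marks the negative scores that were kept.
batch_filter = [[True, False, True], [True, True, False]]

# Sparse layout: one entry per kept negative score.
sparse_scores = to_sparse(dense_scores, batch_filter)
num_unfiltered_negatives = sum(keep for row in batch_filter for keep in row)

print(sparse_scores)             # [0.1, 0.3, 0.7, 0.2]
print(num_unfiltered_negatives)  # 4
```

The sparse list alone no longer records which positive triple each negative belongs to, which is why batch_filter has to accompany pre-filtered scores.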

static resolve_margin(positive_margin, negative_margin, offset)

Resolve the margins from the different ways of specifying them.

The method supports three combinations:

• positive_margin & negative_margin.

This returns the values as-is.

• negative_margin & offset

This sets positive_margin = negative_margin + offset

• positive_margin & offset

This sets negative_margin = positive_margin - offset

Note

Note that this method does not apply any precedence among the three combinations; instead, it requires the remaining parameter to be None. This makes ambiguous input fail fast, rather than delaying the failure to a later point where its cause might be harder to find.

Parameters
Return type
Returns

A pair of the positive and negative margin. Guaranteed to fulfil positive_margin >= negative_margin.

Raises

ValueError – In case of an invalid combination.
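The three combinations listed above can be sketched in plain Python as follows. This is not the library's implementation: the function name resolve_margin_sketch is hypothetical, and raising a ValueError when the resolved pair violates positive_margin >= negative_margin is an assumption. How the real method treats other inputs (for example, all three parameters being None) is not covered here.

```python
from typing import Optional, Tuple


def resolve_margin_sketch(
    positive_margin: Optional[float],
    negative_margin: Optional[float],
    offset: Optional[float],
) -> Tuple[float, float]:
    """Return (positive_margin, negative_margin) from exactly two given values."""
    if positive_margin is not None and negative_margin is not None and offset is None:
        # Combination 1: both margins are given and returned as-is.
        resolved = (positive_margin, negative_margin)
    elif positive_margin is None and negative_margin is not None and offset is not None:
        # Combination 2: positive_margin = negative_margin + offset
        resolved = (negative_margin + offset, negative_margin)
    elif positive_margin is not None and negative_margin is None and offset is not None:
        # Combination 3: negative_margin = positive_margin - offset
        resolved = (positive_margin, positive_margin - offset)
    else:
        # No precedence rule: any other input is rejected immediately.
        raise ValueError("Specify exactly two of positive_margin, negative_margin, offset.")
    if resolved[0] < resolved[1]:
        # Assumption: an ordering violation is reported as an error as well.
        raise ValueError("positive_margin must not be smaller than negative_margin.")
    return resolved


print(resolve_margin_sketch(1.0, 0.0, None))  # (1.0, 0.0)
print(resolve_margin_sketch(None, 0.5, 0.5))  # (1.0, 0.5)
print(resolve_margin_sketch(2.0, None, 0.5))  # (2.0, 1.5)
```

Passing all three values, e.g. resolve_margin_sketch(1.0, 0.0, 1.0), raises a ValueError in this sketch even though the values happen to agree, which mirrors the fail-fast behaviour described in the note.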