DoubleMarginLoss

class DoubleMarginLoss(*, positive_margin: float | None = None, negative_margin: float | None = None, offset: float | None = None, positive_negative_balance: float = 0.5, margin_activation: str | Module | None = 'relu', reduction: Literal['mean', 'sum'] = 'mean')

Bases: PointwiseLoss

A limit-based scoring loss from [sun2018], with separate margins for positive and negative elements.

Despite its similarity to the margin-based loss, this loss differs from it in that it uses absolute margins for the positive and negative scores rather than comparing their difference. Hence, it has a natural decision boundary (somewhere between the positive and negative margin), while still resulting in sparse losses with no gradients for sufficiently correct examples.

\[L(k, \bar{k}) = g(\bar{\lambda} + \bar{k}) + h(\lambda - k)\]

where \(k\) denotes the positive scores, \(\bar{k}\) the negative scores, \(\lambda\) the positive margin, \(\bar{\lambda}\) the negative margin, and \(g\) and \(h\) are activation functions, such as ReLU or softplus (see margin_activation).
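As a minimal sketch of the formula above, the following plain-Python snippet evaluates the loss for a single pair of positive and negative scores, assuming that both activations are ReLU. The function names and the example margins are illustrative only and are not part of the library.

```python
def relu(x: float) -> float:
    """Hard-margin activation: max(0, x)."""
    return max(0.0, x)


def double_margin_loss(
    positive_score: float,
    negative_score: float,
    positive_margin: float = 1.0,  # lambda; illustrative value
    negative_margin: float = 0.0,  # bar-lambda; illustrative value
) -> float:
    """Evaluate g(bar-lambda + bar-k) + h(lambda - k) with g = h = ReLU."""
    negative_term = relu(negative_margin + negative_score)
    positive_term = relu(positive_margin - positive_score)
    return negative_term + positive_term


# A positive scored above the positive margin and a negative scored below
# minus the negative margin contribute nothing: the loss is exactly zero.
well_separated = double_margin_loss(positive_score=2.0, negative_score=-1.5)

# A positive scored too low and a negative scored too high are both penalized.
violating = double_margin_loss(positive_score=0.25, negative_score=0.5)
```

The first call shows the sparsity mentioned above: once both scores are on the correct side of their margins, neither term is active.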

Initialize the double margin loss.

Note

There are multiple ways to specify the pair of margins. Full documentation is provided in DoubleMarginLoss.resolve_margin().

Parameters:

positive_margin (float | None) – The (absolute) margin for the positive scores. Should be larger than the negative one.
negative_margin (float | None) – The (absolute) margin for the negative scores. Should be smaller than the positive one.
offset (float | None) – The offset between positive and negative margin. Must be non-negative.
positive_negative_balance (float) – The balance between positive and negative term. Must be in (0, 1).
margin_activation (Hint[nn.Module]) – A margin activation. Defaults to 'relu', i.e. \(h(x) = \max(0, x)\), which gives the default (hard) “margin loss”. Using 'softplus' leads to a “soft-margin” formulation as discussed in https://arxiv.org/abs/1703.07737.
reduction (Literal['mean', 'sum']) – The name of the reduction operation to aggregate the individual loss values from a batch to a scalar loss value. From {‘mean’, ‘sum’}.
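The difference between the 'relu' and 'softplus' margin activations can be illustrated with a short, library-independent sketch. The helper functions below are written for illustration, assume the standard definitions of ReLU and softplus, and are not the library's own implementations.

```python
import math


def relu(x: float) -> float:
    """Hard margin: exactly zero once the margin is satisfied (x <= 0)."""
    return max(0.0, x)


def softplus(x: float) -> float:
    """Soft margin: log(1 + exp(x)), a smooth upper bound on ReLU."""
    # Numerically stable rewriting of log(1 + exp(x)).
    return max(0.0, x) + math.log1p(math.exp(-abs(x)))


# Negative inputs correspond to satisfied margins, positive inputs to violations.
inputs = [-4.0, -1.0, 0.0, 1.0, 4.0]
hard = [relu(x) for x in inputs]
soft = [softplus(x) for x in inputs]
```

With ReLU, examples that satisfy their margin yield a loss of exactly zero, whereas softplus stays strictly positive everywhere and only approaches zero asymptotically.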

Raises:

ValueError – If the positive/negative balance is not within the open interval (0, 1).

Attributes Summary

hpo_default

The default strategy for optimizing the loss's hyper-parameters

Methods Summary

`forward`(x, target[, weight])	Calculate the point-wise loss.
`process_lcwa_scores`(predictions, labels[, ...])	Process scores from LCWA training loop.
`process_slcwa_scores`(positive_scores, ...[, ...])	Process scores from sLCWA training loop.
`resolve_margin`(positive_margin, ...)	Resolve the margins from the supported ways of specifying them.

Attributes Documentation

hpo_default: ClassVar[Mapping[str, Any]] = {'margin_activation': {'choices': {'hard', 'relu', 'soft', 'softplus'}, 'type': 'categorical'}, 'offset': {'high': 1, 'low': 0, 'type': <class 'float'>}, 'positive_margin': {'high': 1, 'low': -1, 'type': <class 'float'>}, 'positive_negative_balance': {'high': 0.999, 'low': 0.001, 'type': <class 'float'>}}: The default strategy for optimizing the loss’s hyper-parameters

Methods Documentation

forward(x: Tensor, target: Tensor, weight: Tensor | None = None) → Tensor

Calculate the point-wise loss.

Parameters:

x (Tensor) – The predictions.
target (Tensor) – The target values (between 0 and 1).
weight (Tensor | None) – The sample weights.

Returns:

The scalar loss value.

Return type:

Tensor

process_lcwa_scores(predictions: Tensor, labels: Tensor, label_smoothing: float | None = None, num_entities: int | None = None, weights: Tensor | None = None) → Tensor

Process scores from LCWA training loop.

Parameters:

predictions (Tensor) – shape: (*shape) The scores.
labels (Tensor) – shape: (*shape) The labels.
label_smoothing (float | None) – An optional label smoothing parameter.
num_entities (int | None) – The number of entities (required for label-smoothing).
weights (Tensor | None) – shape: (*shape) Sample weights.

Returns:

A scalar loss value.

Return type:

Tensor

process_slcwa_scores(positive_scores: Tensor, negative_scores: Tensor, label_smoothing: float | None = None, batch_filter: Tensor | None = None, num_entities: int | None = None, pos_weights: Tensor | None = None, neg_weights: Tensor | None = None) → Tensor

Process scores from sLCWA training loop.

Parameters:

positive_scores (Tensor) – shape: (batch_size, 1) The scores for positive triples.
negative_scores (Tensor) – shape: (batch_size, num_neg_per_pos) or (num_unfiltered_negatives,) The scores for the negative triples, either in dense 2D shape, or in case they are already filtered, in sparse shape. If they are given in sparse shape, batch_filter needs to be provided, too.
label_smoothing (float | None) – An optional label smoothing parameter.
batch_filter (Tensor | None) – shape: (batch_size, num_neg_per_pos) An optional filter of negative scores which were kept. Given if and only if negative_scores have been pre-filtered.
num_entities (int | None) – The number of entities. Only required if label smoothing is enabled.
pos_weights (Tensor | None) – shape: (batch_size, 1) Positive sample weights.
neg_weights (Tensor | None) – shape: (batch_size, num_neg_per_pos) Negative sample weights.

Returns:

A scalar loss term.

Return type:

Tensor
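The relationship between the dense and sparse layouts of negative_scores can be illustrated with plain Python lists. This is only a sketch of the shapes described above, under the assumption that batch_filter is a boolean mask marking the negatives that were kept and that the sparse scores follow row-major order. All variable values are made up for illustration and no library code is used.

```python
# Dense layout: one row per positive, num_neg_per_pos = 3 negatives each.
dense_negative_scores = [
    [0.1, -0.4, 0.7],
    [-1.2, 0.3, 0.0],
]

# Assumed mask of shape (batch_size, num_neg_per_pos): True = negative was kept.
batch_filter = [
    [True, False, True],
    [True, True, False],
]

# Sparse layout: only the kept scores, flattened in (assumed) row-major order,
# giving a sequence of shape (num_unfiltered_negatives,).
sparse_negative_scores = [
    score
    for score_row, keep_row in zip(dense_negative_scores, batch_filter)
    for score, keep in zip(score_row, keep_row)
    if keep
]

# The number of kept negatives equals the number of True entries in the mask.
num_unfiltered_negatives = sum(keep for row in batch_filter for keep in row)
```

The mask is what allows each entry of the sparse layout to be attributed back to its positive, which is why batch_filter has to accompany pre-filtered scores.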

static resolve_margin(positive_margin: float | None, negative_margin: float | None, offset: float | None) → tuple[float, float]

Resolve the margins from the supported ways of specifying them.

The method supports three combinations:

positive_margin & negative_margin
This returns the values as-is.
negative_margin & offset
This sets positive_margin = negative_margin + offset.
positive_margin & offset
This sets negative_margin = positive_margin - offset.

Note

Note that this method does not apply any precedence among the three combinations; instead, it requires the remaining parameter to be None. This makes ambiguous input fail fast, rather than delaying the failure to a later point where its cause might be harder to find.
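The three combinations and the fail-fast behavior can be sketched in plain Python as follows. The function name resolve_margin_sketch is hypothetical, and this is an illustrative re-implementation based only on the description above, not the library's code. In particular, the range checks and the handling of the case where all three arguments are None are assumptions, and the library may behave differently there.

```python
from typing import Optional, Tuple


def resolve_margin_sketch(
    positive_margin: Optional[float],
    negative_margin: Optional[float],
    offset: Optional[float],
) -> Tuple[float, float]:
    """Return (positive_margin, negative_margin) for one of three combinations."""
    # 1. positive_margin & negative_margin: return the values as-is.
    if positive_margin is not None and negative_margin is not None and offset is None:
        if positive_margin < negative_margin:  # assumed check for the guarantee
            raise ValueError("positive_margin must be >= negative_margin")
        return positive_margin, negative_margin

    # 2. negative_margin & offset: derive the positive margin.
    if negative_margin is not None and offset is not None and positive_margin is None:
        if offset < 0:  # assumed check: the offset must be non-negative
            raise ValueError("offset must be non-negative")
        return negative_margin + offset, negative_margin

    # 3. positive_margin & offset: derive the negative margin.
    if positive_margin is not None and offset is not None and negative_margin is None:
        if offset < 0:  # assumed check: the offset must be non-negative
            raise ValueError("offset must be non-negative")
        return positive_margin, positive_margin - offset

    # No precedence is applied: any other combination is rejected immediately.
    # Treating "all None" as invalid is an assumption of this sketch.
    raise ValueError("invalid combination of positive_margin, negative_margin, offset")


# Example: a negative margin of 0.5 plus an offset of 1.0.
margins = resolve_margin_sketch(None, 0.5, 1.0)
```

Passing all three values, for example resolve_margin_sketch(1.0, 0.0, 0.5), raises a ValueError in this sketch instead of silently preferring one combination.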

Parameters:

positive_margin (float | None) – The (absolute) margin for the positive scores. Should be larger than the negative one.
negative_margin (float | None) – The (absolute) margin for the negative scores. Should be smaller than the positive one.
offset (float | None) – The offset between positive and negative margin. Must be non-negative.

Returns:

A pair of the positive and negative margin. Guaranteed to fulfil positive_margin >= negative_margin.

Raises:

ValueError – In case of an invalid combination.

Return type:

tuple[float, float]