DoubleMarginLoss
- class DoubleMarginLoss(*, positive_margin=None, negative_margin=None, offset=None, positive_negative_balance=0.5, margin_activation='relu', reduction='mean')
Bases: PointwiseLoss
A limit-based scoring loss, with separate margins for positive and negative elements from [sun2018].
Despite its similarity to the margin-based loss, this loss differs from it in that it uses absolute margins for the positive and negative scores rather than comparing their difference. Hence, it has a natural decision boundary (somewhere between the positive and negative margin), while still resulting in sparse losses with no gradients for sufficiently correct examples.
\[L(k, \bar{k}) = g(\bar{\lambda} + \bar{k}) + h(\lambda - k)\]
where \(k\) denotes the positive scores, \(\bar{k}\) the negative scores, \(\lambda\) the positive margin, \(\bar{\lambda}\) the negative margin, and \(g\) and \(h\) are activation functions, such as ReLU or softplus.
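As a minimal sketch, the formula can be read literally in plain Python for a single pair of scores. The function names below are hypothetical and not part of the library, and both \(g\) and \(h\) are assumed to be the same activation.

```python
import math


def relu(x: float) -> float:
    """Hard margin activation: max(0, x)."""
    return max(0.0, x)


def softplus(x: float) -> float:
    """Soft margin activation: log(1 + exp(x)), computed in a numerically stable way."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def double_margin_terms(k_pos, k_neg, positive_margin, negative_margin, activation=relu):
    """Return the two terms of L(k, k_bar) for one positive and one negative score.

    Assumption: g and h are the same activation function.
    """
    negative_term = activation(negative_margin + k_neg)  # g(lambda_bar + k_bar)
    positive_term = activation(positive_margin - k_pos)  # h(lambda - k)
    return negative_term, positive_term


# Hypothetical scores and margins, chosen only for illustration.
neg_term, pos_term = double_margin_terms(
    k_pos=2.0, k_neg=-3.0, positive_margin=1.0, negative_margin=0.0
)
print(neg_term + pos_term)  # both terms vanish for this pair
```

With the ReLU activation, both terms are exactly zero once the positive score exceeds the positive margin and the negative score lies below the negated negative margin, which is the sparsity property described above. With softplus, the terms only approach zero.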
Initialize the double margin loss.
Note
There are multiple ways to specify the pair of margins. Full documentation is provided in DoubleMarginLoss.resolve_margin().
- Parameters:
positive_margin (Optional[float]) – The (absolute) margin for the positive scores. Should be larger than the negative one.
negative_margin (Optional[float]) – The (absolute) margin for the negative scores. Should be smaller than the positive one.
offset (Optional[float]) – The offset between positive and negative margin. Must be non-negative.
positive_negative_balance (float) – The balance between the positive and the negative term. Must be in (0, 1).
margin_activation (Union[str, Module, None]) – A margin activation. Defaults to 'relu', i.e. \(h(\Delta) = \max(0, \Delta)\), where the margin is already part of the argument \(\Delta\); this is the default “margin loss”. Using 'softplus' leads to a “soft-margin” formulation as discussed in https://arxiv.org/abs/1703.07737.
reduction (str) – The name of the reduction operation used to aggregate the individual loss values of a batch into a scalar loss value. One of {'mean', 'sum'}.
- Raises:
ValueError – If the positive/negative balance is not in the open interval (0, 1).
Attributes Summary
hpo_default – The default strategy for optimizing the loss's hyper-parameters
Methods Summary
forward(predictions, labels) – Compute the double margin loss.
process_lcwa_scores(predictions, labels[, ...]) – Process scores from LCWA training loop.
process_slcwa_scores(positive_scores, ...[, ...]) – Process scores from sLCWA training loop.
resolve_margin(positive_margin, ...) – Resolve the margins from the different ways of specifying them.
Attributes Documentation
- hpo_default: ClassVar[Mapping[str, Any]] = {'margin_activation': {'choices': {'hard', 'relu', 'soft', 'softplus'}, 'type': 'categorical'}, 'offset': {'high': 1, 'low': 0, 'type': <class 'float'>}, 'positive_margin': {'high': 1, 'low': -1, 'type': <class 'float'>}, 'positive_negative_balance': {'high': 0.999, 'low': 0.001, 'type': <class 'float'>}}
The default strategy for optimizing the loss’s hyper-parameters
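To illustrate what this search space describes, the following sketch draws one configuration from it using only the standard library. The `sample_config` helper is hypothetical, uniform sampling of the float ranges is an assumption, and how an actual hyper-parameter optimization backend consumes the mapping is not specified here.

```python
import random

# The search space as documented above; float is the Python type shown as <class 'float'>.
hpo_default = {
    "margin_activation": {"type": "categorical", "choices": {"hard", "relu", "soft", "softplus"}},
    "offset": {"type": float, "low": 0, "high": 1},
    "positive_margin": {"type": float, "low": -1, "high": 1},
    "positive_negative_balance": {"type": float, "low": 0.001, "high": 0.999},
}


def sample_config(space, rng):
    """Draw one value per hyper-parameter (hypothetical helper, uniform sampling assumed)."""
    config = {}
    for name, spec in space.items():
        if spec["type"] == "categorical":
            # Sort the choices so that sampling is reproducible for a fixed seed.
            config[name] = rng.choice(sorted(spec["choices"]))
        else:
            config[name] = rng.uniform(spec["low"], spec["high"])
    return config


config = sample_config(hpo_default, random.Random(0))
print(config)
```

Note that the space covers positive_margin and offset but not negative_margin, which corresponds to the "positive_margin & offset" combination described for resolve_margin.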
Methods Documentation
- forward(predictions, labels)
Compute the double margin loss.
The scores have to be in broadcastable shape.
- Parameters:
predictions (FloatTensor) – The predicted scores.
labels (FloatTensor) – The labels.
- Return type:
FloatTensor
- Returns:
A scalar loss term.
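One plausible way to turn predictions and labels into a scalar is sketched below in plain Python. It assumes that binary labels (1 for positive, 0 for negative) select which margin term applies to each prediction, that positive_negative_balance weights the positive term and its complement weights the negative term, and that each term is reduced separately. The function name and these details are assumptions for illustration; the library operates on tensors and its actual implementation may differ.

```python
def double_margin_forward(
    predictions, labels, positive_margin, negative_margin, balance=0.5, reduction="mean"
):
    """Sketch of a label-weighted double margin loss over flat lists of floats."""
    if not 0.0 < balance < 1.0:
        raise ValueError(f"balance must be in (0, 1), got {balance}")
    if reduction not in {"mean", "sum"}:
        raise ValueError(f"unknown reduction: {reduction}")

    def relu(x):
        return max(0.0, x)

    # Assumption: labels act as weights, so each prediction contributes to one term only.
    pos_terms = [y * relu(positive_margin - s) for s, y in zip(predictions, labels)]
    neg_terms = [(1.0 - y) * relu(negative_margin + s) for s, y in zip(predictions, labels)]

    def aggregate(values):
        total = sum(values)
        return total / len(values) if reduction == "mean" else total

    return balance * aggregate(pos_terms) + (1.0 - balance) * aggregate(neg_terms)


# Hypothetical batch: two positives followed by two negatives.
loss = double_margin_forward(
    predictions=[2.0, 0.5, -3.0, 0.5],
    labels=[1.0, 1.0, 0.0, 0.0],
    positive_margin=1.0,
    negative_margin=0.0,
)
print(loss)  # 0.125
```

In this batch only the second and fourth predictions violate their margin, each by 0.5, so the first and third contribute nothing to the loss.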
- process_lcwa_scores(predictions, labels, label_smoothing=None, num_entities=None)
Process scores from LCWA training loop.
- Parameters:
predictions (FloatTensor) – shape: (batch_size, num_entities) The scores.
labels (FloatTensor) – shape: (batch_size, num_entities) The labels.
label_smoothing (Optional[float]) – An optional label smoothing parameter.
num_entities (Optional[int]) – The number of entities (required for label-smoothing).
- Return type:
FloatTensor
- Returns:
A scalar loss value.
- process_slcwa_scores(positive_scores, negative_scores, label_smoothing=None, batch_filter=None, num_entities=None)
Process scores from sLCWA training loop.
- Parameters:
positive_scores (FloatTensor) – shape: (batch_size, 1) The scores for positive triples.
negative_scores (FloatTensor) – shape: (batch_size, num_neg_per_pos) or (num_unfiltered_negatives,) The scores for the negative triples, either in dense 2D shape, or in case they are already filtered, in sparse shape. If they are given in sparse shape, batch_filter needs to be provided, too.
label_smoothing (Optional[float]) – An optional label smoothing parameter.
batch_filter (Optional[BoolTensor]) – shape: (batch_size, num_neg_per_pos) An optional filter of negative scores which were kept. Given if and only if negative_scores have been pre-filtered.
num_entities (Optional[int]) – The number of entities. Only required if label smoothing is enabled.
- Return type:
FloatTensor
- Returns:
A scalar loss term.
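The relation between the dense and the sparse form of negative_scores can be illustrated with nested lists. This is a sketch only: the helper name is hypothetical, and it assumes that the sparse form lists the kept scores in row-major order, as boolean-mask indexing of a tensor would.

```python
def filter_negatives(dense_negative_scores, batch_filter):
    """Flatten dense (batch_size, num_neg_per_pos) scores, keeping entries where the filter is True."""
    return [
        score
        for score_row, filter_row in zip(dense_negative_scores, batch_filter)
        for score, keep in zip(score_row, filter_row)
        if keep
    ]


# Hypothetical batch with batch_size=2 and num_neg_per_pos=3.
dense = [[-1.0, 0.2, -0.5], [0.3, -2.0, 0.1]]
batch_filter = [[True, False, True], [True, True, False]]

sparse = filter_negatives(dense, batch_filter)
print(sparse)  # [-1.0, -0.5, 0.3, -2.0]
```

The length of the sparse list equals the number of True entries in the filter, which is what the shape (num_unfiltered_negatives,) refers to.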
- static resolve_margin(positive_margin, negative_margin, offset)
Resolve the margins from the different ways of specifying them.
The method supports three combinations:
- positive_margin & negative_margin
This returns the values as-is.
- negative_margin & offset
This sets positive_margin = negative_margin + offset.
- positive_margin & offset
This sets negative_margin = positive_margin - offset.
Note
Note that this method does not apply a precedence among the three combinations, but requires the remaining parameter to be None. This is done to fail fast on ambiguous input rather than delay a failure to a later point in time where it might be harder to find its cause.
- Parameters:
positive_margin (Optional[float]) – The (absolute) margin for the positive scores. Should be larger than the negative one.
negative_margin (Optional[float]) – The (absolute) margin for the negative scores. Should be smaller than the positive one.
offset (Optional[float]) – The offset between positive and negative margin. Must be non-negative.
- Return type:
Tuple[float, float]
- Returns:
A pair of the positive and negative margin. Guaranteed to fulfil positive_margin >= negative_margin.
- Raises:
ValueError – In case of an invalid combination.
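The three combinations and the fail-fast behaviour described above can be sketched as follows. This is an illustrative re-implementation under stated assumptions, not the library's code: the function name is hypothetical, an inverted pair of explicit margins is rejected in order to honour the documented guarantee, and every other combination (including all three arguments being None) is treated as invalid, which the library may handle differently.

```python
from typing import Optional, Tuple


def resolve_margin_sketch(
    positive_margin: Optional[float],
    negative_margin: Optional[float],
    offset: Optional[float],
) -> Tuple[float, float]:
    """Return (positive_margin, negative_margin) from exactly two given values."""
    given = [value is not None for value in (positive_margin, negative_margin, offset)]
    if sum(given) != 2:
        # No precedence is applied: the remaining parameter must be None.
        raise ValueError("Specify exactly two of positive_margin, negative_margin, offset.")

    if offset is None:
        # positive_margin & negative_margin: returned as-is.
        # Assumption: an inverted pair is rejected to keep positive >= negative.
        if positive_margin < negative_margin:
            raise ValueError("positive_margin must not be smaller than negative_margin.")
        return positive_margin, negative_margin

    if offset < 0:
        raise ValueError("offset must be non-negative.")

    if positive_margin is None:
        # negative_margin & offset
        return negative_margin + offset, negative_margin

    # positive_margin & offset
    return positive_margin, positive_margin - offset


print(resolve_margin_sketch(1.0, 0.0, None))   # (1.0, 0.0)
print(resolve_margin_sketch(None, 0.5, 0.25))  # (0.75, 0.5)
print(resolve_margin_sketch(1.0, None, 0.25))  # (1.0, 0.75)
```

Because the offset is non-negative, the second and third combinations satisfy positive_margin >= negative_margin by construction; only the first one needs an explicit check.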