InfoNCELoss
- class InfoNCELoss(margin=0.02, log_adversarial_temperature=-2.995732273553991, reduction='mean')[source]
Bases: CrossEntropyLoss
The InfoNCE loss with additive margin proposed by [wang2022].
This loss is equivalent to CrossEntropyLoss, where the scores have been transformed:
- positive scores are reduced by the margin gamma and then divided by the temperature tau
\[f'(k) = \frac{f(k) - \gamma}{\tau}\]
- negative scores are only divided by the temperature tau
\[f'(k^-) = \frac{f(k^-)}{\tau}\]
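The transformation above can be sketched in plain Python. This is an illustrative re-implementation, not PyKEEN's code (the function name `info_nce_with_margin` is hypothetical): the positive score is placed at index 0 of the logits and a standard cross-entropy follows.

```python
import math

def info_nce_with_margin(pos_score, neg_scores, margin=0.02,
                         log_adversarial_temperature=math.log(0.05)):
    """Illustrative sketch of the score transformation, not PyKEEN's API."""
    # exponentiating guarantees tau > 0 for any hyperparameter value
    tau = math.exp(log_adversarial_temperature)
    # f'(k) = (f(k) - gamma) / tau for the positive,
    # f'(k^-) = f(k^-) / tau for each negative
    logits = [(pos_score - margin) / tau] + [s / tau for s in neg_scores]
    # cross-entropy with the positive as the target class,
    # using a numerically stable log-sum-exp for the normalizer
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[0]
```

As expected for a contrastive loss, the value shrinks as the positive score moves above the negatives, and the margin makes the positive work harder before the loss vanishes.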
Initialize the loss.
- Parameters:
margin (float) – The loss's margin (also written as \(\gamma\) in the reference paper).
Note
In the official implementation, the margin parameter only seems to be used during training. https://github.com/intfloat/SimKGC/blob/4388ebc0c0011fe333bc5a98d0613ab0d1825ddc/models.py#L92-L94
log_adversarial_temperature (float) – The logarithm of the negative sampling temperature (also written as \(\tau\) in the reference paper). We follow the suggested parametrization, which ensures positive temperatures for all hyperparameter values.
Note
The adversarial temperature is the inverse of the softmax temperature used when computing the weights! Its name is only kept for consistency with the nomenclature of [wang2022].
Note
In the official implementation, the temperature is a trainable parameter, cf. https://github.com/intfloat/SimKGC/blob/4388ebc0c0011fe333bc5a98d0613ab0d1825ddc/models.py#L31
reduction (str) – The name of the reduction operation used to aggregate the individual loss values of a batch into a scalar loss value. One of {'mean', 'sum'}.
- Raises:
ValueError – If the margin is negative.
Attributes Summary
The default strategy for optimizing the loss's hyper-parameters
Methods Summary
process_lcwa_scores(predictions, labels[, ...])
Process scores from the LCWA training loop.
process_slcwa_scores(positive_scores, ...[, ...])
Process scores from the sLCWA training loop.
Attributes Documentation
- hpo_default: ClassVar[Mapping[str, Any]] = {'log_adversarial_temperature': {'high': 3.0, 'low': -3.0, 'type': <class 'float'>}, 'margin': {'high': 0.1, 'low': 0.01, 'type': <class 'float'>}}
The default strategy for optimizing the loss’s hyper-parameters
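As a hedged illustration of how these ranges could be used, the sketch below draws one uniform sample per hyperparameter (PyKEEN's actual HPO pipeline is more involved than this); it also shows why the log-parametrization keeps the temperature positive for every sampled value.

```python
import math
import random

# The hpo_default ranges from above, restated as plain dicts.
hpo_default = {
    "log_adversarial_temperature": {"low": -3.0, "high": 3.0},
    "margin": {"low": 0.01, "high": 0.1},
}

def sample_hparams(spec, rng=random):
    """Draw one uniform float per hyperparameter from its [low, high] range.

    Hypothetical helper for illustration only; not part of PyKEEN's API.
    """
    return {name: rng.uniform(cfg["low"], cfg["high"]) for name, cfg in spec.items()}

params = sample_hparams(hpo_default)
# exp() of any sampled log-temperature is strictly positive
tau = math.exp(params["log_adversarial_temperature"])
```

With `low=-3.0` and `high=3.0`, the resulting temperature always lies in roughly [0.05, 20.1], never zero or negative.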
Methods Documentation
- process_lcwa_scores(predictions, labels, label_smoothing=None, num_entities=None)[source]
Process scores from the LCWA training loop.
- Parameters:
predictions (FloatTensor) – shape: (batch_size, num_entities). The scores.
labels (FloatTensor) – shape: (batch_size, num_entities). The labels.
label_smoothing (Optional[float]) – An optional label smoothing parameter.
num_entities (Optional[int]) – The number of entities (required for label smoothing).
- Return type:
FloatTensor
- Returns:
A scalar loss value.
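A minimal plain-Python sketch of what this method computes under the cross-entropy base loss, assuming the binary labels are normalized to a distribution before the cross-entropy is taken. This is a simplification: the real implementation operates on (batch_size, num_entities) tensors and additionally supports label smoothing and other reductions.

```python
import math

def lcwa_cross_entropy(predictions, labels):
    """Sketch only, not PyKEEN's code: per-row softmax cross-entropy over
    all entities, aggregated with a 'mean' reduction to a scalar."""
    losses = []
    for scores, row_labels in zip(predictions, labels):
        # numerically stable log-softmax over the row's entity scores
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        # cross-entropy against the labels, normalized to sum to one
        total = sum(row_labels)
        losses.append(-sum((l / total) * (s - log_z)
                           for s, l in zip(scores, row_labels)))
    # 'mean' reduction: aggregate the per-row losses to a scalar
    return sum(losses) / len(losses)
```

For a single row with scores (2.0, 0.0) and a one-hot label on the first entity, this yields the usual softmax cross-entropy \(\log(1 + e^{-2}) \approx 0.127\).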
- process_slcwa_scores(positive_scores, negative_scores, label_smoothing=None, batch_filter=None, num_entities=None)[source]
Process scores from the sLCWA training loop.
- Parameters:
positive_scores (FloatTensor) – shape: (batch_size, 1). The scores for positive triples.
negative_scores (FloatTensor) – shape: (batch_size, num_neg_per_pos) or (num_unfiltered_negatives,). The scores for the negative triples, either in dense 2D shape, or, if they have already been filtered, in sparse shape. If they are given in sparse shape, batch_filter must be provided, too.
label_smoothing (Optional[float]) – An optional label smoothing parameter.
batch_filter (Optional[BoolTensor]) – shape: (batch_size, num_neg_per_pos). An optional filter marking which negative scores were kept. Given if and only if negative_scores have been pre-filtered.
num_entities (Optional[int]) – The number of entities. Only required if label smoothing is enabled.
- Return type:
FloatTensor
- Returns:
A scalar loss term.
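The relation between the dense negative-score shape and the sparse, pre-filtered shape with its accompanying batch_filter can be illustrated with plain Python lists (the values below are arbitrary, for illustration only; the real inputs are tensors).

```python
# Dense form: one row of negative scores per positive triple,
# shape (batch_size=2, num_neg_per_pos=3).
dense_negatives = [
    [0.3, -0.1, 0.7],   # negatives for positive 0
    [0.0, 0.2, -0.5],   # negatives for positive 1
]
# batch_filter marks which negatives survived filtering,
# same shape as the dense scores.
batch_filter = [
    [True, False, True],   # keep negatives 0 and 2 for positive 0
    [False, True, True],   # keep negatives 1 and 2 for positive 1
]
# The sparse form, shape (num_unfiltered_negatives,), flattens exactly
# the kept entries, row by row.
sparse_negatives = [
    s
    for row, keep in zip(dense_negatives, batch_filter)
    for s, k in zip(row, keep) if k
]
```

This is why the sparse shape is only meaningful together with batch_filter: without it, the kept entries cannot be mapped back to their positive triples.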