InfoNCELoss

class InfoNCELoss(margin: float = 0.02, log_adversarial_temperature: float = -2.995732273553991, reduction: Literal['mean', 'sum'] = 'mean')

Bases: CrossEntropyLoss

The InfoNCE loss with additive margin proposed by [wang2022].

This loss is equivalent to CrossEntropyLoss applied to transformed scores:

the margin \(\gamma\) is subtracted from the positive scores, which are then divided by the temperature \(\tau\)

\[f'(k) = \frac{f(k) - \gamma}{\tau}\]

the negative scores are only divided by the temperature \(\tau\)

\[f'(k^-) = \frac{f(k^-)}{\tau}\]
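The transformation above can be sketched for a single positive score and its negatives. This is a minimal pure-Python illustration under stated assumptions, not the library's implementation: the function name `info_nce_single` is hypothetical, the positive is treated as the target class of the cross entropy, and the temperature is passed directly as \(\tau\) rather than as its logarithm.

```python
import math


def info_nce_single(positive_score, negative_scores, margin=0.02, temperature=0.05):
    """Hypothetical helper: cross entropy over margin-shifted, temperature-scaled scores."""
    # f'(k) = (f(k) - gamma) / tau for the positive score
    pos = (positive_score - margin) / temperature
    # f'(k^-) = f(k^-) / tau for each negative score
    negs = [s / temperature for s in negative_scores]
    logits = [pos] + negs
    # numerically stable log-sum-exp over all transformed scores
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    # cross entropy with the positive as the target class
    return log_z - pos


loss = info_nce_single(0.9, [0.1, 0.3, -0.2])
```

A larger margin lowers the transformed positive score and therefore increases the loss for the same raw scores.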

Initialize the loss.

Parameters:

margin (float) –
The loss’s margin (also written as \(\gamma\) in the reference paper)

Note

The official implementation appears to apply the margin only during training, cf. https://github.com/intfloat/SimKGC/blob/4388ebc0c0011fe333bc5a98d0613ab0d1825ddc/models.py#L92-L94
log_adversarial_temperature (float) –
The logarithm of the negative sampling temperature (also written as \(\tau\) in the reference paper). We follow the suggested parametrization, which guarantees a positive temperature for every real-valued hyperparameter. The default value, \(\log 0.05\), corresponds to \(\tau = 0.05\).

Note

The adversarial temperature is the inverse of the softmax temperature used when computing the weights! Its name is only kept for consistency with the nomenclature of [wang2022].

Note

In the official implementation, the temperature is a trainable parameter, cf. https://github.com/intfloat/SimKGC/blob/4388ebc0c0011fe333bc5a98d0613ab0d1825ddc/models.py#L31
reduction (Literal['mean', 'sum']) – The name of the reduction operation used to aggregate the individual loss values of a batch into a scalar loss value. One of {‘mean’, ‘sum’}.

Raises:

ValueError – if the margin is negative
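The log-parametrization of the temperature and the margin check can be illustrated as follows. This is a sketch under assumptions: `resolve_hyperparameters` is a hypothetical helper, not part of the library, and it only mirrors the documented behaviour (a negative margin raises `ValueError`; the temperature is the exponential of its logarithm).

```python
import math

# Default taken from the class signature above.
DEFAULT_LOG_ADVERSARIAL_TEMPERATURE = -2.995732273553991


def resolve_hyperparameters(margin=0.02, log_adversarial_temperature=DEFAULT_LOG_ADVERSARIAL_TEMPERATURE):
    """Hypothetical helper: validate the margin and recover the temperature."""
    if margin < 0:
        raise ValueError(f"margin must be non-negative, but is {margin}")
    # exp(.) is positive for every real input, so any value yields a valid temperature
    temperature = math.exp(log_adversarial_temperature)
    return margin, temperature


margin, temperature = resolve_hyperparameters()
```

With the default arguments, `temperature` evaluates to approximately 0.05.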

Attributes Summary

`DEFAULT_LOG_ADVERSARIAL_TEMPERATURE`
`hpo_default`	The default strategy for optimizing the loss's hyper-parameters

Methods Summary

`process_lcwa_scores`(predictions, labels[, ...])	Process scores from LCWA training loop.
`process_slcwa_scores`(positive_scores, ...[, ...])	Process scores from sLCWA training loop.

Attributes Documentation

DEFAULT_LOG_ADVERSARIAL_TEMPERATURE: ClassVar[float] = -2.995732273553991

hpo_default: ClassVar[Mapping[str, Any]] = {'log_adversarial_temperature': {'high': 3.0, 'low': -3.0, 'type': <class 'float'>}, 'margin': {'high': 0.1, 'low': 0.01, 'type': <class 'float'>}}

The default strategy for optimizing the loss’s hyper-parameters
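To make the ranges concrete, the sketch below draws one configuration from the default search space. It assumes uniform sampling between `low` and `high`, which is an illustration only; the sampler actually used during hyper-parameter optimization may differ. The names `HPO_DEFAULT` and `sample_hyperparameters` are hypothetical.

```python
import math
import random

# Copy of the documented default search space.
HPO_DEFAULT = {
    "margin": {"type": float, "low": 0.01, "high": 0.1},
    "log_adversarial_temperature": {"type": float, "low": -3.0, "high": 3.0},
}


def sample_hyperparameters(search_space, rng):
    """Hypothetical helper: draw each value uniformly from [low, high] (an assumption)."""
    return {name: rng.uniform(spec["low"], spec["high"]) for name, spec in search_space.items()}


rng = random.Random(0)
params = sample_hyperparameters(HPO_DEFAULT, rng)
# The searched quantity is the log-temperature; the temperature itself is its exponential.
temperature = math.exp(params["log_adversarial_temperature"])
```

Because the search runs over the log-temperature, the range [-3, 3] corresponds to temperatures between roughly 0.05 and 20.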

Methods Documentation

process_lcwa_scores(predictions: Tensor, labels: Tensor, label_smoothing: float | None = None, num_entities: int | None = None, weights: Tensor | None = None) → Tensor

Process scores from LCWA training loop.

Parameters:

predictions (Tensor) – shape: (*shape) The scores.
labels (Tensor) – shape: (*shape) The labels.
label_smoothing (float | None) – An optional label smoothing parameter.
num_entities (int | None) – The number of entities (required for label-smoothing).
weights (Tensor | None) – shape: (*shape) Sample weights.

Returns:

A scalar loss value.

Return type:

Tensor
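For the LCWA case, the score transformation can be sketched for a single row of scores and labels. This is a pure-Python illustration under assumptions, not the library's code: positives are identified by `label > 0.5`, the labels are normalized to a probability distribution before the cross entropy, and the hypothetical function `lcwa_info_nce_row` ignores label smoothing and sample weights.

```python
import math


def lcwa_info_nce_row(scores, labels, margin=0.02, temperature=0.05):
    """Hypothetical helper: InfoNCE-style loss for one row of LCWA scores."""
    # subtract the margin only where the label marks a positive, then scale everything
    logits = [
        (s - (margin if y > 0.5 else 0.0)) / temperature
        for s, y in zip(scores, labels)
    ]
    # numerically stable log-softmax
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    log_probs = [x - log_z for x in logits]
    # assumption: labels are normalized to sum to one (requires at least one positive)
    total = sum(labels)
    target = [y / total for y in labels]
    return -sum(t * lp for t, lp in zip(target, log_probs))


row_loss = lcwa_info_nce_row([0.9, 0.1, 0.3], [1.0, 0.0, 0.0])
```

With exactly one positive per row, this reduces to the cross entropy of the transformed scores with the positive as target class.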

process_slcwa_scores(positive_scores: Tensor, negative_scores: Tensor, label_smoothing: float | None = None, batch_filter: Tensor | None = None, num_entities: int | None = None, pos_weights: Tensor | None = None, neg_weights: Tensor | None = None) → Tensor

Process scores from sLCWA training loop.

Parameters:

positive_scores (Tensor) – shape: (batch_size, 1) The scores for positive triples.
negative_scores (Tensor) – shape: (batch_size, num_neg_per_pos) or (num_unfiltered_negatives,) The scores for the negative triples, either in dense 2D shape, or in case they are already filtered, in sparse shape. If they are given in sparse shape, batch_filter needs to be provided, too.
label_smoothing (float | None) – An optional label smoothing parameter.
batch_filter (Tensor | None) – shape: (batch_size, num_neg_per_pos) An optional filter of negative scores which were kept. Given if and only if negative_scores have been pre-filtered.
num_entities (int | None) – The number of entities. Only required if label smoothing is enabled.
pos_weights (Tensor | None) – shape: (batch_size, 1) Positive sample weights.
neg_weights (Tensor | None) – shape: (batch_size, num_neg_per_pos) Negative sample weights.

Returns:

A scalar loss value.

Return type:

Tensor
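The relationship between the dense and sparse shapes of `negative_scores` and the effect of `reduction` can be sketched in pure Python. The following is an illustration under assumptions rather than the library's implementation: sparse scores are assumed to be stored in row-major order of the kept entries of `batch_filter`, filtered-out negatives are filled with negative infinity so that they contribute nothing to the softmax, and positive scores are given as a flat list instead of shape (batch_size, 1). The helper names are hypothetical.

```python
import math


def densify_negatives(negative_scores, batch_filter, fill_value=float("-inf")):
    """Hypothetical helper: scatter sparse scores into the dense (batch_size, num_neg_per_pos) layout."""
    flat = iter(negative_scores)
    return [[next(flat) if keep else fill_value for keep in row] for row in batch_filter]


def info_nce_batch(positive_scores, negative_scores, margin=0.02, temperature=0.05, reduction="mean"):
    """Hypothetical helper: batch loss over dense negatives with 'mean' or 'sum' reduction."""
    if reduction not in {"mean", "sum"}:
        raise ValueError(f"unknown reduction: {reduction}")
    losses = []
    for pos, negs in zip(positive_scores, negative_scores):
        # the margin is subtracted from the positive only; all scores are divided by the temperature
        logits = [(pos - margin) / temperature] + [s / temperature for s in negs]
        m = max(logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        losses.append(log_z - logits[0])
    total = sum(losses)
    return total / len(losses) if reduction == "mean" else total


# three kept negatives for a batch of two positives with two negatives each
dense = densify_negatives([0.1, 0.3, -0.2], [[True, False], [True, True]])
mean_loss = info_nce_batch([0.9, 0.8], dense)
sum_loss = info_nce_batch([0.9, 0.8], dense, reduction="sum")
```

Filling with negative infinity works here because every row still contains the finite positive score, so the softmax normalizer stays finite.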