RankBasedEvaluator

class RankBasedEvaluator(filtered=True, metrics=None, metrics_kwargs=None, add_defaults=True, clear_on_finalize=True, **kwargs)[source]

Bases: Evaluator

A rank-based evaluator for KGE models.

Initialize rank-based evaluator.

Parameters
  • filtered (bool) – Whether to use the filtered evaluation protocol. If enabled, ranking another true triple higher than the currently considered one will not decrease the score.

  • metrics (Optional[Sequence[Union[str, RankBasedMetric, Type[RankBasedMetric], None]]]) – the rank-based metrics to compute

  • metrics_kwargs (Optional[Mapping[str, Any]]) – additional keyword-based parameters used to instantiate the metrics

  • add_defaults (bool) – whether to add all default metrics in addition to the ones specified by metrics / metrics_kwargs.

  • clear_on_finalize (bool) –

    whether to clear buffers on finalize call

    Warning

    disabling this option may lead to memory leaks and incorrect results when used from the pipeline

  • kwargs – Additional keyword arguments that are passed to the base class.
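The effect of the filtered protocol can be illustrated with a minimal, library-independent sketch. The helper `rank_of_true_entity` and the example scores are hypothetical and are not part of the evaluator's API; the sketch only assumes that a higher score means a better candidate.

```python
from typing import Sequence


def rank_of_true_entity(
    scores: Sequence[float],
    true_index: int,
    other_true: Sequence[int] = (),
) -> int:
    """Return the 1-based rank of scores[true_index] (higher score = better).

    Indices in ``other_true`` mark other known true entities; they are
    skipped, which mimics the filtered evaluation protocol.
    """
    skip = set(other_true)
    true_score = scores[true_index]
    better = sum(
        1
        for i, s in enumerate(scores)
        if i != true_index and i not in skip and s > true_score
    )
    return 1 + better


# Hypothetical scores for four candidate entities.
scores = [0.9, 0.2, 0.7, 0.4]

# Raw protocol: entity 0 scores higher than entity 2, so the rank is 2.
raw_rank = rank_of_true_entity(scores, true_index=2)

# Filtered protocol: entity 0 is itself a true answer, so it is ignored.
filtered_rank = rank_of_true_entity(scores, true_index=2, other_true=[0])

print(raw_rank, filtered_rank)
```

With filtering, the other true entity no longer pushes the considered one down, so the rank improves from 2 to 1.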

Methods Summary

finalize()

Compute the final results, and clear buffers.

finalize_multi([n_boot, seed])

Bootstrap from finalize().

finalize_with_confidence([estimator, ci, ...])

Finalize result with confidence estimation via bootstrapping.

process_scores_(hrt_batch, target, scores[, ...])

Process a batch of triples with their computed scores for all entities.

Methods Documentation

finalize()[source]

Compute the final results, and clear buffers.

Return type

RankBasedMetricResults

finalize_multi(n_boot=1000, seed=42)[source]

Bootstrap from finalize().

Parameters
  • n_boot (int) – the number of resampling steps

  • seed (int) – the random seed.

Return type

Mapping[str, Sequence[float]]

Returns

a flat dictionary from metric names to lists of values
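As a rough illustration of the idea, the following sketch resamples a list of ranks with replacement and recomputes metrics on each resample. It is a simplified stand-in, not the library's implementation: the function `bootstrap_metrics`, the metric keys `"mrr"` and `"hits_at_3"`, and the example ranks are all assumptions made for this example.

```python
import random
from typing import Dict, List, Sequence


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1 / rank over all ranks."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks: Sequence[int], k: int) -> float:
    """Fraction of ranks that are at most k."""
    return sum(r <= k for r in ranks) / len(ranks)


def bootstrap_metrics(
    ranks: Sequence[int], n_boot: int = 1000, seed: int = 42
) -> Dict[str, List[float]]:
    """Return a flat mapping from metric name to one value per resample."""
    rng = random.Random(seed)  # a fixed seed makes the resampling reproducible
    results: Dict[str, List[float]] = {"mrr": [], "hits_at_3": []}
    for _ in range(n_boot):
        # Draw len(ranks) ranks with replacement.
        sample = rng.choices(ranks, k=len(ranks))
        results["mrr"].append(mean_reciprocal_rank(sample))
        results["hits_at_3"].append(hits_at_k(sample, k=3))
    return results


# Hypothetical ranks collected during one evaluation run.
ranks = [1, 2, 2, 5, 10, 1, 3, 7]
boot = bootstrap_metrics(ranks, n_boot=100)
print(len(boot["mrr"]), len(boot["hits_at_3"]))
```

Each metric ends up with `n_boot` values, which matches the shape of the documented return type, a mapping from metric names to sequences of floats.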

finalize_with_confidence(estimator=<function median>, ci=90, n_boot=1000, seed=42)[source]

Finalize result with confidence estimation via bootstrapping.

Start by training a model (here, for only one epoch)

>>> from pykeen.pipeline import pipeline
>>> result = pipeline(dataset="nations", model="rotate", training_kwargs=dict(num_epochs=1))

Create an evaluator with clear_on_finalize set to False, e.g., via

>>> from pykeen.evaluation import evaluator_resolver
>>> evaluator = evaluator_resolver.make("rankbased", clear_on_finalize=False)

Evaluate once, this time ignoring the result

>>> evaluator.evaluate(model=result.model, mapped_triples=result.training.mapped_triples)

Now, call finalize_with_confidence to obtain estimates for metrics together with confidence intervals

>>> evaluator.finalize_with_confidence(n_boot=10)

Return type

Mapping[str, Tuple[float, float]]

Returns

a dictionary from metric names to (central tendency, confidence) pairs
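One plausible way to turn bootstrapped values into such pairs is sketched below. The source does not spell out how the confidence value is defined, so treating it as the width of a central percentile interval is an assumption of this sketch, and the helpers `percentile` and `summarize` are hypothetical names.

```python
import statistics
from typing import Callable, Dict, Mapping, Sequence, Tuple


def percentile(values: Sequence[float], q: float) -> float:
    """Percentile with linear interpolation, for q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def summarize(
    boot: Mapping[str, Sequence[float]],
    estimator: Callable[[Sequence[float]], float] = statistics.median,
    ci: float = 90,
) -> Dict[str, Tuple[float, float]]:
    """Map each metric to (central tendency, width of the central ci% interval)."""
    lower_q = 50 - ci / 2
    upper_q = 50 + ci / 2
    return {
        name: (
            estimator(values),
            percentile(values, upper_q) - percentile(values, lower_q),
        )
        for name, values in boot.items()
    }


# Hypothetical bootstrapped values for a single metric.
boot = {"mrr": [0.40, 0.42, 0.45, 0.47, 0.50]}
summary = summarize(boot, ci=90)
print(summary)
```

Swapping `estimator` for `statistics.mean` changes only the first element of each pair; the interval width is computed from the percentiles either way.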

process_scores_(hrt_batch, target, scores, true_scores=None, dense_positive_mask=None)[source]

Process a batch of triples with their computed scores for all entities.

Parameters
  • hrt_batch (LongTensor) – shape: (batch_size, 3)

  • target (Literal['head', 'relation', 'tail']) – the prediction target

  • scores (FloatTensor) – shape: (batch_size, num_entities)

  • true_scores (Optional[FloatTensor]) – shape: (batch_size, 1)

  • dense_positive_mask (Optional[FloatTensor]) – shape: (batch_size, num_entities). An optional binary (0/1) tensor indicating other true entities.

Return type

None
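To make the role of scores and true_scores more concrete, here is a small sketch that derives ranks from a batch of scores. It uses nested Python lists instead of tensors, and the function `batch_ranks` is hypothetical; the actual method works on tensors and returns nothing, so this only illustrates the kind of computation involved, including one common way of handling ties.

```python
from typing import List, Sequence, Tuple


def batch_ranks(
    scores: Sequence[Sequence[float]],
    true_scores: Sequence[float],
) -> List[Tuple[int, int, float]]:
    """Return (optimistic, pessimistic, realistic) ranks for each row.

    ``scores`` has one row per triple and one column per candidate entity
    (including the true one); ``true_scores`` holds the score of the true
    entity of each row. Higher scores are better.
    """
    ranks = []
    for row, true in zip(scores, true_scores):
        # Optimistic: the true entity wins all ties.
        optimistic = 1 + sum(s > true for s in row)
        # Pessimistic: the true entity loses all ties
        # (the count includes the true entity's own score).
        pessimistic = sum(s >= true for s in row)
        # Realistic: the average of both.
        realistic = (optimistic + pessimistic) / 2
        ranks.append((optimistic, pessimistic, realistic))
    return ranks


# Hypothetical batch with two triples and four candidate entities each.
scores = [
    [0.5, 0.9, 0.5, 0.1],  # the true entity ties with one other candidate
    [0.3, 0.2, 0.8, 0.1],  # the true entity has the single best score
]
true_scores = [0.5, 0.8]
ranks = batch_ranks(scores, true_scores)
print(ranks)
```

In an evaluator, ranks like these would be appended to internal buffers for each batch and turned into metrics only when finalize() is called.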