RankBasedEvaluator

class RankBasedEvaluator(filtered: bool = True, metrics: str | X | type[X] | None | Sequence[str | X | type[X] | None] = None, metrics_kwargs: Mapping[str, Any] | None | Sequence[Mapping[str, Any] | None] = None, add_defaults: bool = True, clear_on_finalize: bool = True, **kwargs)[source]

Bases: Evaluator[RankBasedMetricKey]

A rank-based evaluator for KGE models.

Initialize rank-based evaluator.

Parameters:
  • filtered (bool) – Whether to use the filtered evaluation protocol. If enabled, ranking another true triple higher than the currently considered one will not decrease the score.

  • metrics (OneOrManyHintOrType) – the rank-based metrics to compute

  • metrics_kwargs (OneOrManyOptionalKwargs) – additional keyword-based parameters passed to the metrics

  • add_defaults (bool) – whether to add all default metrics besides the ones specified by metrics / metrics_kwargs.

  • clear_on_finalize (bool) –

    whether to clear buffers when finalize() is called

    Warning

    Disabling this option may lead to memory leaks and incorrect results when used from the pipeline.

  • kwargs – Additional keyword arguments that are passed to the base class.
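
As an illustration, a minimal construction sketch (the resolver name "hitsatk" and its k keyword are assumptions about how metric hints are normalized, not guaranteed API):

>>> from pykeen.evaluation import RankBasedEvaluator
>>> evaluator = RankBasedEvaluator(
...     filtered=True,              # filtered ranking protocol
...     metrics=["hitsatk"],        # resolver name is an assumption
...     metrics_kwargs=dict(k=10),  # the k kwarg is an assumption
...     add_defaults=False,         # compute only the listed metric
... )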

Methods Summary

clear()

Clear buffers and intermediate results.

finalize()

Compute the final results, and clear buffers.

finalize_multi([n_boot, seed])

Bootstrap metric estimates from finalize() by resampling the stored ranks.

finalize_with_confidence([estimator, ci, ...])

Finalize result with confidence estimation via bootstrapping.

process_scores_(hrt_batch, target, scores[, ...])

Process a batch of triples with their computed scores for all entities.

Methods Documentation

clear() → None[source]

Clear buffers and intermediate results.

Return type:

None
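
For instance, when an evaluator is kept alive across evaluations (e.g., with clear_on_finalize=False), the buffers can be reset manually:

>>> evaluator.clear()  # discard all buffered ranks and intermediate results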

finalize() → RankBasedMetricResults[source]

Compute the final results, and clear buffers.

Return type:

RankBasedMetricResults
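
For example, once scores have been buffered (e.g., by evaluate() with clear_on_finalize=False, as in the finalize_with_confidence example below), reading a metric from the returned results might look like this (the "hits@10" key passed to get_metric is an assumption):

>>> results = evaluator.finalize()
>>> results.get_metric("hits@10")  # key name is an assumption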

finalize_multi(n_boot: int = 1000, seed: int = 42) → Mapping[str, Sequence[float]][source]

Bootstrap metric estimates from finalize() by resampling the stored ranks.

Parameters:
  • n_boot (int) – the number of resampling steps

  • seed (int) – the random seed.

Returns:

a flat dictionary from metric names to lists of bootstrapped values

Return type:

Mapping[str, Sequence[float]]
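
Continuing the workflow shown under finalize_with_confidence() below (an evaluator created with clear_on_finalize=False on which evaluate() has already been called), a sketch:

>>> values = evaluator.finalize_multi(n_boot=100, seed=0)
>>> {name: len(vs) for name, vs in values.items()}  # every metric maps to 100 bootstrapped values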

finalize_with_confidence(estimator: str | Callable[[Sequence[float]], float] = <function median>, ci: int | str | Callable[[Sequence[float]], float] = 90, n_boot: int = 1000, seed: int = 42) → Mapping[str, tuple[float, float]][source]

Finalize result with confidence estimation via bootstrapping.

Start by training a model (here, only for one epoch)

>>> from pykeen.pipeline import pipeline
>>> result = pipeline(dataset="nations", model="rotate", training_kwargs=dict(num_epochs=1))

Create an evaluator with clear_on_finalize set to False, e.g., via

>>> from pykeen.evaluation import evaluator_resolver
>>> evaluator = evaluator_resolver.make("rankbased", clear_on_finalize=False)

Evaluate once, this time ignoring the result

>>> evaluator.evaluate(model=result.model, mapped_triples=result.training.mapped_triples)

Now, call finalize_with_confidence to obtain estimates for metrics together with confidence intervals

>>> evaluator.finalize_with_confidence(n_boot=10)

Parameters:
  • estimator (str | Callable[[Sequence[float]], float]) – the estimator of central tendency

  • ci (int | str | Callable[[Sequence[float]], float]) – the confidence interval

  • n_boot (int) – the number of resampling steps

  • seed (int) – the random seed.

Returns:

a dictionary from metric names to (central tendency, confidence) pairs

Return type:

Mapping[str, tuple[float, float]]

process_scores_(hrt_batch: Tensor, target: Literal['head', 'relation', 'tail'], scores: Tensor, true_scores: Tensor | None = None, dense_positive_mask: Tensor | None = None) → None[source]

Process a batch of triples with their computed scores for all entities.

Parameters:
  • hrt_batch (Tensor) – shape: (batch_size, 3) – the batch of triples

  • target (Literal['head', 'relation', 'tail']) – the prediction target

  • scores (Tensor) – shape: (batch_size, num_entities) – the scores for all candidate entities

  • true_scores (Tensor | None) – shape: (batch_size, 1) – the scores of the true choice

  • dense_positive_mask (Tensor | None) – shape: (batch_size, num_entities) – an optional binary (0/1) tensor indicating other true entities

Return type:

None
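
As a self-contained sketch of feeding scores manually (the random scores and the gathered true_scores are illustrative stand-ins; in practice both come from a model's forward pass):

>>> import torch
>>> from pykeen.evaluation import RankBasedEvaluator
>>> evaluator = RankBasedEvaluator(filtered=False)     # the raw scores below are not filtered
>>> hrt_batch = torch.tensor([[0, 0, 1], [2, 0, 4]])   # two (h, r, t) index triples
>>> scores = torch.rand(2, 5)                          # tail scores for all 5 entities
>>> true_scores = scores.gather(1, hrt_batch[:, 2:3])  # scores of the true tails, shape (2, 1)
>>> evaluator.process_scores_(hrt_batch, "tail", scores, true_scores=true_scores)
>>> results = evaluator.finalize()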