RankBasedEvaluator

class RankBasedEvaluator(filtered: bool = True, metrics: str | X | type[X] | None | Sequence[str | X | type[X] | None] = None, metrics_kwargs: Mapping[str, Any] | None | Sequence[Mapping[str, Any] | None] = None, add_defaults: bool = True, clear_on_finalize: bool = True, **kwargs)[source]

Bases: Evaluator[RankBasedMetricKey]

A rank-based evaluator for KGE models.

Initialize rank-based evaluator.

Parameters:
  • filtered (bool) – Whether to use the filtered evaluation protocol. If enabled, ranking another true triple higher than the currently considered one will not decrease the score.

  • metrics (OneOrManyHintOrType) – the rank-based metrics to compute

  • metrics_kwargs (OneOrManyOptionalKwargs) – additional keyword-based parameters passed to the metrics

  • add_defaults (bool) – whether to add all default metrics besides the ones specified by metrics / metrics_kwargs.

  • clear_on_finalize (bool) –

    whether to clear buffers when finalize() is called

    Warning

    Disabling this option may lead to memory leaks and incorrect results when used from the pipeline.

  • kwargs – Additional keyword arguments that are passed to the base class.
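
As an illustration, a minimal construction sketch (the resolver name "hitsatk" and its k keyword are assumptions about how metric hints are normalized, not guaranteed API):

>>> from pykeen.evaluation import RankBasedEvaluator
>>> evaluator = RankBasedEvaluator(
...     filtered=True,              # filtered ranking protocol
...     metrics=["hitsatk"],        # resolver name is an assumption
...     metrics_kwargs=dict(k=10),  # the k kwarg is an assumption
...     add_defaults=False,         # compute only the listed metric
... )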

Methods Summary

clear()

Clear buffers and intermediate results.

finalize()

Compute the final results, and clear buffers.

finalize_multi([n_boot, seed])

Bootstrap metric estimates from finalize() by resampling the stored ranks.

finalize_with_confidence([estimator, ci, ...])

Finalize result with confidence estimation via bootstrapping.

process_scores_(hrt_batch, target, scores[, ...])

Process a batch of triples with their computed scores for all entities.

Methods Documentation

clear() → None[source]

Clear buffers and intermediate results.

Return type:

None
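
For instance, when an evaluator is kept alive across evaluations (e.g., with clear_on_finalize=False), the buffers can be reset manually:

>>> evaluator.clear()  # discard all buffered ranks and intermediate results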

finalize() → RankBasedMetricResults[source]

Compute the final results, and clear buffers.

Return type:

RankBasedMetricResults
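
For example, once scores have been buffered (e.g., by evaluate() with clear_on_finalize=False, as in the finalize_with_confidence example below), reading a metric from the returned results might look like this (the "hits@10" key passed to get_metric is an assumption):

>>> results = evaluator.finalize()
>>> results.get_metric("hits@10")  # key name is an assumption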

finalize_multi(n_boot: int = 1000, seed: int = 42) → Mapping[str, Sequence[float]][source]

Bootstrap metric estimates from finalize() by resampling the stored ranks.

Parameters:
  • n_boot (int) – the number of resampling steps

  • seed (int) – the random seed.

Returns:

a flat dictionary from metric names to lists of bootstrapped values

Return type:

Mapping[str, Sequence[float]]
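
Continuing the workflow shown under finalize_with_confidence() below (an evaluator created with clear_on_finalize=False on which evaluate() has already been called), a sketch:

>>> values = evaluator.finalize_multi(n_boot=100, seed=0)
>>> {name: len(vs) for name, vs in values.items()}  # every metric maps to 100 bootstrapped values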

finalize_with_confidence(estimator: str | Callable[[Sequence[float]], float] = <function median>, ci: int | str | Callable[[Sequence[float]], float] = 90, n_boot: int = 1000, seed: int = 42) → Mapping[str, tuple[float, float]][source]

Finalize result with confidence estimation via bootstrapping.

Start by training a model (here, only for one epoch)

>>> from pykeen.pipeline import pipeline
>>> result = pipeline(dataset="nations", model="rotate", training_kwargs=dict(num_epochs=1))

Create an evaluator with clear_on_finalize set to False, e.g., via

>>> from pykeen.evaluation import evaluator_resolver
>>> evaluator = evaluator_resolver.make("rankbased", clear_on_finalize=False)

Evaluate once, this time ignoring the result

>>> evaluator.evaluate(model=result.model, mapped_triples=result.training.mapped_triples)

Now, call finalize_with_confidence to obtain estimates for metrics together with confidence intervals

>>> evaluator.finalize_with_confidence(n_boot=10)

Parameters:
  • estimator (str | Callable[[Sequence[float]], float]) – the estimator of central tendency

  • ci (int | str | Callable[[Sequence[float]], float]) – the confidence interval

  • n_boot (int) – the number of resampling steps

  • seed (int) – the random seed.

Returns:

a dictionary from metric names to (central tendency, confidence) pairs

Return type:

Mapping[str, tuple[float, float]]

process_scores_(hrt_batch: Tensor, target: Literal['head', 'relation', 'tail'], scores: Tensor, true_scores: Tensor | None = None, dense_positive_mask: Tensor | None = None) → None[source]

Process a batch of triples with their computed scores for all entities.

Parameters:
  • hrt_batch (Tensor) – shape: (batch_size, 3) – the batch of triples

  • target (Literal['head', 'relation', 'tail']) – the prediction target

  • scores (Tensor) – shape: (batch_size, num_entities) – the scores for all candidate entities

  • true_scores (Tensor | None) – shape: (batch_size, 1) – the scores of the true choice

  • dense_positive_mask (Tensor | None) – shape: (batch_size, num_entities) – an optional binary (0/1) tensor indicating other true entities

Return type:

None
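
As a self-contained sketch of feeding scores manually (the random scores and the gathered true_scores are illustrative stand-ins; in practice both come from a model's forward pass):

>>> import torch
>>> from pykeen.evaluation import RankBasedEvaluator
>>> evaluator = RankBasedEvaluator(filtered=False)     # the raw scores below are not filtered
>>> hrt_batch = torch.tensor([[0, 0, 1], [2, 0, 4]])   # two (h, r, t) index triples
>>> scores = torch.rand(2, 5)                          # tail scores for all 5 entities
>>> true_scores = scores.gather(1, hrt_batch[:, 2:3])  # scores of the true tails, shape (2, 1)
>>> evaluator.process_scores_(hrt_batch, "tail", scores, true_scores=true_scores)
>>> results = evaluator.finalize()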