- class RankBasedEvaluator(filtered: bool = True, metrics: str | X | type[X] | None | Sequence[str | X | type[X] | None] = None, metrics_kwargs: Mapping[str, Any] | None | Sequence[Mapping[str, Any] | None] = None, add_defaults: bool = True, clear_on_finalize: bool = True, **kwargs)
A rank-based evaluator for KGE models.
Initialize rank-based evaluator.
- Parameters:
filtered (bool) – Whether to use the filtered evaluation protocol. If enabled, ranking another true triple higher than the currently considered one will not decrease the score.
metrics (OneOrManyHintOrType) – the rank-based metrics to compute
metrics_kwargs (OneOrManyOptionalKwargs) – additional keyword-based parameters for the metrics given by metrics
add_defaults (bool) – whether to add all default metrics besides the ones specified by metrics / metrics_kwargs.
clear_on_finalize (bool) – whether to clear buffers on the finalize call. Warning: disabling this option may lead to memory leaks and incorrect results when used from the pipeline.
kwargs – Additional keyword arguments that are passed to the base class.
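To make "rank-based metrics" concrete, here is a minimal sketch of two common ones computed from a list of 1-based ranks. The function names `mean_reciprocal_rank` and `hits_at_k` are illustrative helpers written for this example; they are not the library's metric classes, and the evaluator's own implementation may differ.

```python
from statistics import fmean


def mean_reciprocal_rank(ranks):
    """Average of 1 / rank over all ranks (ranks are 1-based, best value is 1.0)."""
    return fmean(1.0 / rank for rank in ranks)


def hits_at_k(ranks, k=10):
    """Fraction of ranks that are at most k."""
    return sum(rank <= k for rank in ranks) / len(ranks)


# Hypothetical ranks of five evaluated triples.
ranks = [1, 2, 4, 10, 20]
mrr = mean_reciprocal_rank(ranks)
h10 = hits_at_k(ranks, k=10)
print(f"MRR={mrr:.3f}, Hits@10={h10:.1f}")
```

Both functions only need the ranks, which is why an evaluator of this kind can buffer ranks per batch and compute every metric at the end.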
Methods Summary
clear() Clear buffers and intermediate results.
finalize() Compute the final results, and clear buffers.
finalize_multi([n_boot, seed]) Bootstrap from finalize().
finalize_with_confidence([estimator, ci, ...]) Finalize result with confidence estimation via bootstrapping.
process_scores_(hrt_batch, target, scores[, ...]) Process a batch of triples with their computed scores for all entities.
Methods Documentation
- finalize() → RankBasedMetricResults
Compute the final results, and clear buffers.
- Return type:
RankBasedMetricResults
- finalize_multi(n_boot: int = 1000, seed: int = 42) → Mapping[str, Sequence[float]]
Bootstrap from finalize().
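The idea of bootstrapping rank-based metrics can be sketched as follows: resample the buffered ranks with replacement `n_boot` times and recompute each metric on every resample. This is a standard-library sketch under assumptions, not the library's implementation; `bootstrap_metrics` and the metric names in the dictionary are hypothetical.

```python
import random


def bootstrap_metrics(ranks, metrics, n_boot=1000, seed=42):
    """Return a mapping from metric name to one value per bootstrap resample."""
    rng = random.Random(seed)  # local generator, so results are reproducible
    results = {name: [] for name in metrics}
    for _ in range(n_boot):
        # Draw as many ranks as were observed, with replacement.
        sample = rng.choices(ranks, k=len(ranks))
        for name, metric in metrics.items():
            results[name].append(metric(sample))
    return results


# Hypothetical metrics and ranks, chosen for illustration only.
metrics = {
    "mean_rank": lambda rs: sum(rs) / len(rs),
    "hits_at_3": lambda rs: sum(r <= 3 for r in rs) / len(rs),
}
ranks = [1, 2, 4, 10, 20]
boot = bootstrap_metrics(ranks, metrics, n_boot=100)
print({name: len(values) for name, values in boot.items()})
```

The result has the same shape as the documented return type: one sequence of floats per metric name.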
- finalize_with_confidence(estimator: str | Callable[[Sequence[float]], float] = <function median>, ci: int | str | Callable[[Sequence[float]], float] = 90, n_boot: int = 1000, seed: int = 42) → Mapping[str, tuple[float, float]]
Finalize result with confidence estimation via bootstrapping.
Start by training a model (here, for only one epoch)
>>> from pykeen.pipeline import pipeline
>>> result = pipeline(dataset="nations", model="rotate", training_kwargs=dict(num_epochs=1))
Create an evaluator with clear_on_finalize set to False, e.g., via
>>> from pykeen.evaluation import evaluator_resolver
>>> evaluator = evaluator_resolver.make("rankbased", clear_on_finalize=False)
Evaluate once, this time ignoring the result
>>> evaluator.evaluate(model=result.model, mapped_triples=result.training.mapped_triples)
Now, call finalize_with_confidence to obtain estimates for metrics together with confidence intervals
>>> evaluator.finalize_with_confidence(n_boot=10)
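- Parameters:
estimator (str | Callable[[Sequence[float]], float]) – the estimator of the central tendency; defaults to the median
ci (int | str | Callable[[Sequence[float]], float]) – the confidence measure; defaults to 90
n_boot (int) – the number of bootstrap resamples
seed (int) – the random seed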
- Returns:
a dictionary from metric names to (central tendency, confidence) pairs
- Return type:
Mapping[str, tuple[float, float]]
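As a rough illustration of how a sequence of bootstrapped metric values can be reduced to a (central tendency, confidence) pair, consider the sketch below. It assumes that an integer `ci` denotes the width, in percent, of a central percentile interval and that the confidence is reported as the width of that interval; this interpretation is an assumption made for the example, and `percentile` and `summarize` are hypothetical helpers rather than library functions.

```python
from statistics import median


def percentile(ordered, q):
    """Linearly interpolated percentile of a sorted list, with q in [0, 100]."""
    position = (len(ordered) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] * (1 - fraction) + ordered[upper] * fraction


def summarize(values, estimator=median, ci=90):
    """Return (center, width of the central ci% percentile interval)."""
    ordered = sorted(values)
    center = estimator(ordered)
    tail = (100 - ci) / 2  # percentage cut off on each side
    width = percentile(ordered, 100 - tail) - percentile(ordered, tail)
    return center, width


# Hypothetical bootstrapped values of a single metric.
values = list(range(101))
center, width = summarize(values, estimator=median, ci=90)
print(center, width)
```

Passing a different callable as `estimator`, for example `statistics.fmean`, changes only the first element of the pair.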
- process_scores_(hrt_batch: Tensor, target: Literal['head', 'relation', 'tail'], scores: Tensor, true_scores: Tensor | None = None, dense_positive_mask: Tensor | None = None) → None
Process a batch of triples with their computed scores for all entities.
- Parameters:
hrt_batch (Tensor) – shape: (batch_size, 3)
target (Literal['head', 'relation', 'tail']) – the prediction target
scores (Tensor) – shape: (batch_size, num_entities)
true_scores (Tensor | None) – shape: (batch_size, 1)
dense_positive_mask (Tensor | None) – shape: (batch_size, num_entities). An optional binary (0/1) tensor indicating other true entities.
- Return type:
None
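The role of the positive mask in the filtered protocol can be sketched with plain Python lists standing in for tensors. Entities flagged in the mask are skipped when counting how many candidates score higher than the evaluated entity, so other true triples cannot worsen the rank. This sketch assumes optimistic tie handling (ties do not increase the rank), and `filtered_ranks` and `true_index` are hypothetical names; the evaluator's actual rank computation may differ.

```python
def filtered_ranks(scores, true_index, positive_mask=None):
    """Compute 1-based ranks for each row of a (batch_size, num_entities) score table.

    scores: list of rows with one score per entity
    true_index: for each row, the index of the entity being evaluated
    positive_mask: optional 0/1 table of the same shape marking other true entities
    """
    ranks = []
    for row, (row_scores, target) in enumerate(zip(scores, true_index)):
        true_score = row_scores[target]
        rank = 1
        for entity, score in enumerate(row_scores):
            if entity == target:
                continue
            # Filtered protocol: ignore other entities known to be true.
            if positive_mask is not None and positive_mask[row][entity]:
                continue
            if score > true_score:
                rank += 1
        ranks.append(rank)
    return ranks


scores = [[0.9, 0.8, 0.1, 0.5]]
true_index = [3]           # evaluate the entity with score 0.5
positive_mask = [[1, 0, 0, 0]]  # entity 0 is another true answer

raw = filtered_ranks(scores, true_index)
filtered = filtered_ranks(scores, true_index, positive_mask)
print(raw, filtered)
```

Without the mask, both higher-scoring entities count against the evaluated one; with the mask, only the entity not known to be true does.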