Evaluator

class Evaluator(filtered=False, requires_positive_mask=False, batch_size=None, slice_size=None, automatic_memory_optimization=True, mode=None)[source]

Bases: ABC, Generic[MetricKeyType]

An abstract evaluator for knowledge graph embedding (KGE) models.

The evaluator encapsulates the computation of evaluation metrics based on head and tail scores. To this end, it offers a method to process a batch of triples together with the scores produced by some model for a given prediction target. It maintains intermediate results in its state, and offers a method to obtain the final results once finished.

Initialize the evaluator.

Parameters:
  • filtered (bool) – Should filtered evaluation be performed?

  • requires_positive_mask (bool) – Does the evaluator need access to the masks?

  • batch_size (Optional[int]) – A positive integer (> 0) used as the evaluation batch size.

  • slice_size (Optional[int]) – A positive integer (> 0) used as the divisor for the scoring function when using slicing.

  • automatic_memory_optimization (bool) – Whether to automatically optimize the sub-batch size during evaluation with regard to the hardware at hand.

  • mode (Optional[Literal[‘training’, ‘validation’, ‘testing’]]) – The inductive mode, or None for transductive evaluation.
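
The lifecycle described above can be illustrated with a simplified sketch of the loop that evaluate() drives internally. This is an illustration only, not the actual implementation: the real method additionally handles filtered evaluation, entity/relation restriction, slicing, device placement, and automatic memory optimization; model.score_t() is PyKEEN's method for scoring all candidate tails, and only the ‘tail’ target is shown.

    import torch

    def evaluation_loop_sketch(evaluator, model, mapped_triples, batch_size=256):
        """Simplified illustration of the clear / process / finalize lifecycle."""
        evaluator.clear()  # drop state from any previous run
        model.eval()
        with torch.no_grad():  # no gradients are needed during evaluation
            for start in range(0, mapped_triples.shape[0], batch_size):
                hrt_batch = mapped_triples[start:start + batch_size]
                # Score every candidate tail entity for each (h, r, ?) query.
                scores = model.score_t(hrt_batch[:, 0:2])
                # Look up the score of the true tail for later rank computation.
                true_scores = scores.gather(dim=1, index=hrt_batch[:, 2:3])
                evaluator.process_scores_(
                    hrt_batch=hrt_batch,
                    target="tail",
                    scores=scores,
                    true_scores=true_scores,
                )
        # Aggregate the buffered statistics into MetricResults and clear buffers.
        return evaluator.finalize()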

Methods Summary

clear()

Clear buffers and intermediate results.

evaluate(model, mapped_triples[, ...])

Evaluate metrics for model on mapped triples.

finalize()

Compute the final results, and clear buffers.

get_normalized_name()

Get the normalized name of the evaluator.

process_scores_(hrt_batch, target, scores[, ...])

Process a batch of triples with their computed scores for all entities.

Methods Documentation

abstract clear()[source]

Clear buffers and intermediate results.

Return type:

None

evaluate(model, mapped_triples, batch_size=None, slice_size=None, device=None, use_tqdm=True, tqdm_kwargs=None, restrict_entities_to=None, restrict_relations_to=None, do_time_consuming_checks=True, additional_filter_triples=None, pre_filtered_triples=True, targets=('head', 'tail'))[source]

Evaluate metrics for model on mapped triples.

Parameters:
  • model (Model) – The model to evaluate.

  • mapped_triples (LongTensor) – The triples on which to evaluate. The mapped triples should never contain inverse triples - these are created by the model class on the fly.

  • batch_size (Optional[int]) – A positive integer (> 0) used as the evaluation batch size; generally chosen as large as possible. Defaults to 1 if None.

  • slice_size (Optional[int]) – A positive integer (> 0) used as the divisor for the scoring function when using slicing.

  • device (Optional[device]) – The device on which the evaluation shall be run. If None is given, use the model’s device.

  • use_tqdm (bool) – Should a progress bar be displayed?

  • tqdm_kwargs (Optional[Mapping[str, Any]]) – Additional keyword based arguments passed to the progress bar.

  • restrict_entities_to (Optional[Collection[int]]) – Optionally restrict the evaluation to the given entity IDs. This may be useful if one is only interested in a part of the entities, e.g. due to type constraints, but wants to train on all available data. For ranking the entities, we still compute all scores for all possible replacement entities to avoid irregular access patterns which might decrease performance, but the scores will afterward be filtered to only keep those of interest. If provided, we assume by default that the triples are already filtered, such that they only contain the entities of interest. To explicitly filter within this method, pass pre_filtered_triples=False.

  • restrict_relations_to (Optional[Collection[int]]) – Optionally restrict the evaluation to the given relation IDs. This may be useful if one is only interested in a part of the relations, e.g. due to relation types, but wants to train on all available data. If provided, we assume by default that the triples are already filtered, such that they only contain the relations of interest. To explicitly filter within this method, pass pre_filtered_triples=False.

  • do_time_consuming_checks (bool) – Whether to perform some time-consuming checks on the provided arguments. Currently, this encompasses only: If restrict_entities_to or restrict_relations_to is not None, check whether the triples have been filtered. Disabling this option can accelerate the method. Only effective if pre_filtered_triples is set to True.

  • pre_filtered_triples (bool) – Whether the triples have been pre-filtered to adhere to restrict_entities_to / restrict_relations_to. When set to True, and the triples have not been filtered, the results may be invalid. Pre-filtering the triples accelerates this method, and is recommended when evaluating multiple times on the same set of triples.

  • additional_filter_triples (Union[None, LongTensor, List[LongTensor]]) – Additional true triples to filter out during filtered evaluation.

  • targets (Collection[Literal[‘head’, ‘relation’, ‘tail’]]) – The prediction targets.

Raises:
  • NotImplementedError – If relation prediction evaluation is requested.

  • ValueError – If pre_filtered_triples contains unwanted entities (this can only be detected with the time-consuming checks).

  • MemoryError – If the evaluation fails on CPU.

Return type:

MetricResults[~MetricKeyType]

Returns:

The evaluation results.
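
A minimal usage sketch, assuming a model trained with the standard PyKEEN pipeline on the Nations dataset; RankBasedEvaluator is one concrete subclass of this class, and the exact metric key accepted by get_metric() may differ between versions.

    from pykeen.datasets import Nations
    from pykeen.evaluation import RankBasedEvaluator
    from pykeen.pipeline import pipeline

    dataset = Nations()
    # Train a small model just to have something to evaluate.
    result = pipeline(dataset=dataset, model="TransE", training_kwargs=dict(num_epochs=5))

    evaluator = RankBasedEvaluator(filtered=True)
    metric_results = evaluator.evaluate(
        model=result.model,
        mapped_triples=dataset.testing.mapped_triples,
        # For filtered evaluation, pass the other known true triples so that
        # they are not counted as errors when they outrank a test triple.
        additional_filter_triples=[
            dataset.training.mapped_triples,
            dataset.validation.mapped_triples,
        ],
        batch_size=None,  # let automatic memory optimization pick a batch size
    )
    print(metric_results.get_metric("hits@10"))

Note that the pipeline already runs its own evaluation on the test split; the explicit call here only demonstrates the evaluate() interface.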

abstract finalize()[source]

Compute the final results, and clear buffers.

Return type:

MetricResults[~MetricKeyType]

classmethod get_normalized_name()[source]

Get the normalized name of the evaluator.

Return type:

str

abstract process_scores_(hrt_batch, target, scores, true_scores=None, dense_positive_mask=None)[source]

Process a batch of triples with their computed scores for all entities.

Parameters:
  • hrt_batch (LongTensor) – shape: (batch_size, 3)

  • target (Literal[‘head’, ‘relation’, ‘tail’]) – the prediction target

  • scores (FloatTensor) – shape: (batch_size, num_entities)

  • true_scores (Optional[FloatTensor]) – shape: (batch_size, 1)

  • dense_positive_mask (Optional[FloatTensor]) – shape: (batch_size, num_entities) An optional binary (0/1) tensor indicating other true entities.

Return type:

None
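
To illustrate the contract that concrete subclasses fulfil, the following standalone toy mirrors the clear() / process_scores_() / finalize() interface without inheriting from the real base class; it accumulates simplified tail ranks and returns a plain float instead of a MetricResults object, and its rank definition (one plus the number of strictly higher-scored candidates) is a simplification of the library's rank variants.

    import torch

    class ToyMeanRankAccumulator:
        """Toy mirror of the clear / process_scores_ / finalize contract (tail only)."""

        def __init__(self) -> None:
            self.ranks = []  # list of per-batch rank tensors

        def clear(self) -> None:
            self.ranks.clear()

        def process_scores_(self, hrt_batch, target, scores, true_scores=None, dense_positive_mask=None):
            if target != "tail":
                raise NotImplementedError("this toy only handles tail prediction")
            if true_scores is None:
                # Column 2 of hrt_batch holds the IDs of the true tail entities.
                true_scores = scores.gather(dim=1, index=hrt_batch[:, 2:3])
            # Rank = 1 + number of candidates scored strictly higher than the true tail.
            self.ranks.append(1 + (scores > true_scores).sum(dim=1))

        def finalize(self) -> float:
            mean_rank = torch.cat(self.ranks).float().mean().item()
            self.clear()  # compute the final result, then clear the buffers
            return mean_rank

    # Usage: one triple (h=0, r=0, t=2); candidate 1 outranks the true tail, so rank = 2.
    accumulator = ToyMeanRankAccumulator()
    accumulator.process_scores_(
        hrt_batch=torch.tensor([[0, 0, 2]]),
        target="tail",
        scores=torch.tensor([[0.1, 0.9, 0.5]]),
    )
    print(accumulator.finalize())  # 2.0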