Evaluator

class Evaluator(filtered=False, requires_positive_mask=False, batch_size=None, slice_size=None, automatic_memory_optimization=True, mode=None)

Bases: ABC

An abstract evaluator for KGE models.

The evaluator encapsulates the computation of evaluation metrics based on head and tail scores. To this end, it offers a method, process_scores_(), to process a batch of triples together with the scores produced by some model. It maintains intermediate results in its state, and offers a method, finalize(), to obtain the final results once finished.
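The accumulate-then-finalize life cycle described above can be illustrated with a small, library-free sketch. `ToyEvaluator` and `MeanRankToyEvaluator` are hypothetical names that only mirror the abstract method names listed on this page; they use plain Python lists instead of tensors and are not part of PyKEEN.

```python
from abc import ABC, abstractmethod


class ToyEvaluator(ABC):
    """Hypothetical stand-in that mirrors the abstract methods of Evaluator."""

    @abstractmethod
    def process_scores_(self, hrt_batch, target, scores, true_scores=None, dense_positive_mask=None):
        """Consume one batch of scores and update the internal state."""

    @abstractmethod
    def finalize(self):
        """Compute the final results, and clear buffers."""

    @abstractmethod
    def clear(self):
        """Clear buffers and intermediate results."""


class MeanRankToyEvaluator(ToyEvaluator):
    """Toy metric: mean rank of the true entity (assumption: higher score is better)."""

    def __init__(self):
        self.ranks = []  # intermediate state kept between batches

    def process_scores_(self, hrt_batch, target, scores, true_scores=None, dense_positive_mask=None):
        # rank = 1 + number of candidates scoring strictly higher than the true one
        for row, (true_score,) in zip(scores, true_scores):
            self.ranks.append(1 + sum(s > true_score for s in row))

    def finalize(self):
        result = {"mean_rank": sum(self.ranks) / len(self.ranks)}
        self.clear()
        return result

    def clear(self):
        self.ranks.clear()


evaluator = MeanRankToyEvaluator()
# two batches with one triple each; scores have shape (batch_size, num_entities)
evaluator.process_scores_([(0, 0, 1)], "tail", scores=[[0.1, 0.9, 0.3]], true_scores=[[0.9]])
evaluator.process_scores_([(1, 0, 2)], "tail", scores=[[0.5, 0.7, 0.2]], true_scores=[[0.2]])
results = evaluator.finalize()
print(results)
```

The state lives in `self.ranks` between calls, and `finalize()` clears it so the same instance can be reused for another evaluation run.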

Initialize the evaluator.

Parameters:

filtered (bool) – Should filtered evaluation be performed?
requires_positive_mask (bool) – Does the evaluator need access to the masks?
batch_size (Optional[int]) – >0. The evaluation batch size.
slice_size (Optional[int]) – >0. The divisor for the scoring function when using slicing.
automatic_memory_optimization (bool) – Whether to automatically optimize the sub-batch size during evaluation with regard to the hardware at hand.
mode (Optional[Literal['training', 'validation', 'testing']]) – The inductive mode, or None for transductive evaluation.

Methods Summary

`batch_and_slice`(model, mapped_triples[, ...])	Find the maximum possible batch_size and slice_size for evaluation with the current setting.
`clear`()	Clear buffers and intermediate results.
`evaluate`(model, mapped_triples[, ...])	Run `pykeen.evaluation.evaluate()` with this evaluator.
`finalize`()	Compute the final results, and clear buffers.
`get_normalized_name`()	Get the normalized name of the evaluator.
`process_scores_`(hrt_batch, target, scores[, ...])	Process a batch of triples with their computed scores for all entities.

Methods Documentation

batch_and_slice(model, mapped_triples, batch_size=None, **kwargs)

Find the maximum possible batch_size and slice_size for evaluation with the current setting.

Evaluation is considerably faster with a larger batch_size. This function therefore estimates the maximum possible batch_size by starting with the batch_size given as argument and increasing it until the hardware runs out of memory (OOM). In some cases, e.g., with very large models or very large datasets, even a batch_size of 1 is too large for the hardware at hand. In these cases, the function checks whether the model supports slicing (which must be implemented for the affected scoring functions) and, if so, searches for the maximum slice_size with which the model can still be evaluated with the given parameters on the hardware at hand.

Parameters:

model (Model) – The model to evaluate.
mapped_triples (LongTensor) – The triples on which to evaluate.
batch_size (Optional[int]) – The initial batch size to start with. None defaults to number_of_triples.
kwargs – additional keyword-based parameters passed to pykeen.evaluation.evaluate()

Return type:

Tuple[int, Optional[int]]

Returns:

Maximum possible batch size and, if necessary, the slice_size, which defaults to None.

Raises:

MemoryError – If it is not possible to evaluate the model on the hardware at hand with the given parameters.
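The search described for batch_and_slice can be sketched without any hardware by simulating the out-of-memory check. Everything below is an assumption made for illustration: `search_batch_and_slice`, `make_fits`, the doubling and halving steps, and the memory model (cost proportional to batch size times the number of candidates scored at once) are hypothetical and not the library's actual implementation.

```python
from typing import Callable, Optional, Tuple


def make_fits(budget: int, num_entities: int = 1000) -> Callable[[int, Optional[int]], bool]:
    """Simulate the OOM check: cost = batch_size * candidates scored at once."""

    def fits(batch_size: int, slice_size: Optional[int]) -> bool:
        return batch_size * (slice_size or num_entities) <= budget

    return fits


def search_batch_and_slice(
    fits: Callable[[int, Optional[int]], bool],
    start_batch_size: int,
    max_batch_size: int,
    max_slice_size: Optional[int] = None,
) -> Tuple[int, Optional[int]]:
    """Return (batch_size, slice_size); slice_size is None unless slicing was needed."""
    batch_size = start_batch_size
    if fits(batch_size, None):
        # grow while the next candidate still fits
        while batch_size * 2 <= max_batch_size and fits(batch_size * 2, None):
            batch_size *= 2
        return batch_size, None
    # the starting value was already too large: shrink towards 1
    while batch_size > 1:
        batch_size //= 2
        if fits(batch_size, None):
            return batch_size, None
    # even batch_size=1 failed: fall back to slicing, if the model supports it
    if max_slice_size is None:
        raise MemoryError("batch_size=1 does not fit and slicing is not available")
    slice_size = max_slice_size
    while slice_size >= 1:
        if fits(1, slice_size):
            return 1, slice_size
        slice_size //= 2
    raise MemoryError("no feasible batch_size/slice_size combination found")


print(search_batch_and_slice(make_fits(64_000), start_batch_size=8, max_batch_size=1000))
print(search_batch_and_slice(make_fits(300), start_batch_size=4, max_batch_size=1000, max_slice_size=1000))
```

The return value has the same shape as documented above: a batch size plus a slice size that stays None unless even a single triple does not fit without slicing.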

abstract clear()

Clear buffers and intermediate results.

Return type:

None

evaluate(model, mapped_triples, batch_size=None, slice_size=None, **kwargs)

Run pykeen.evaluation.evaluate() with this evaluator.

This method will re-use the stored optimized batch and slice size, as well as the evaluator’s inductive mode.

Parameters:

model (Model) – The model to evaluate.
mapped_triples (LongTensor) – shape: (n, 3). The ID-based evaluation triples.
batch_size (Optional[int]) – The batch size to use, or None to trigger automatic memory optimization.
slice_size (Optional[int]) – The slice size to use.
kwargs – The keyword-based parameters passed to pykeen.evaluation.evaluate().

Return type:

MetricResults

Returns:

The evaluation results.

abstract finalize()

Compute the final results, and clear buffers.

Return type:

MetricResults

classmethod get_normalized_name()

Get the normalized name of the evaluator.

Return type:

str

abstract process_scores_(hrt_batch, target, scores, true_scores=None, dense_positive_mask=None)

Process a batch of triples with their computed scores for all entities.

Parameters:

hrt_batch (LongTensor) – shape: (batch_size, 3)
target (Literal['head', 'relation', 'tail']) – The prediction target.
scores (FloatTensor) – shape: (batch_size, num_entities)
true_scores (Optional[FloatTensor]) – shape: (batch_size, 1)
dense_positive_mask (Optional[FloatTensor]) – shape: (batch_size, num_entities). An optional binary (0/1) tensor indicating other true entities.

Return type:

None
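To illustrate how the arguments of process_scores_ relate to each other, the following sketch computes ranks from scores, true_scores, and an optional dense_positive_mask using plain Python lists. The helper `filtered_ranks` is hypothetical, and the ranking rule (one plus the number of unmasked candidates with a strictly higher score) is an assumption chosen for illustration; concrete evaluators may define ranks differently.

```python
def filtered_ranks(scores, true_scores, dense_positive_mask=None):
    """Rank the true entity of each row, ignoring candidates flagged in the mask.

    scores: nested list, shape (batch_size, num_entities)
    true_scores: nested list, shape (batch_size, 1)
    dense_positive_mask: optional nested list of 0/1, shape (batch_size, num_entities)
    """
    ranks = []
    for i, row in enumerate(scores):
        true_score = true_scores[i][0]
        # without a mask, no candidate is excluded
        mask = dense_positive_mask[i] if dense_positive_mask is not None else [0] * len(row)
        # count unmasked candidates that beat the true entity
        better = sum(1 for s, m in zip(row, mask) if not m and s > true_score)
        ranks.append(1 + better)
    return ranks


scores = [[0.9, 0.8, 0.4, 0.1]]   # one triple, four candidate entities
true_scores = [[0.4]]             # score of the evaluated (true) entity
mask = [[1, 0, 0, 0]]             # entity 0 is another known true entity

unfiltered = filtered_ranks(scores, true_scores)
filtered = filtered_ranks(scores, true_scores, dense_positive_mask=mask)
print(unfiltered, filtered)
```

In this toy batch, masking the other known true entity improves the rank of the evaluated entity by one position, which is the effect filtered evaluation is meant to capture.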