evaluate

evaluate(model, mapped_triples, evaluator, only_size_probing=False, batch_size=None, slice_size=None, device=None, use_tqdm=True, tqdm_kwargs=None, restrict_entities_to=None, restrict_relations_to=None, do_time_consuming_checks=True, additional_filter_triples=None, pre_filtered_triples=True, targets=('head', 'tail'), *, mode)

Evaluate metrics for model on mapped triples.

The model is used to predict scores for all tails and all heads for each triple. Subsequently, the evaluator is applied to the scores, also receiving the batch itself (e.g. to compute entity-specific metrics). This way, the (potentially) expensive score computation against all entities is done only once. The evaluator is expected to maintain its own internal buffers, from which the final metrics are extracted after the evaluation has run.
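The pattern described above (score each batch once, let a buffering evaluator consume the scores, then extract the final metrics) can be sketched in plain Python. This is a toy illustration and not PyKEEN's implementation: `ToyRankEvaluator`, `toy_evaluate`, and `score_all_tails` are hypothetical names, only tail prediction is shown, and scores are plain lists rather than tensors.

```python
from typing import Callable, List, Sequence, Tuple

Triple = Tuple[int, int, int]  # (head, relation, tail) as integer IDs


class ToyRankEvaluator:
    """Hypothetical evaluator that buffers ranks and reports the mean rank."""

    def __init__(self) -> None:
        self.ranks: List[int] = []  # internal buffer, filled batch by batch

    def process_tail_scores(
        self, batch: Sequence[Triple], scores: Sequence[Sequence[float]]
    ) -> None:
        for (_, _, tail), row in zip(batch, scores):
            true_score = row[tail]
            # optimistic rank: 1 + number of entities scored strictly higher
            self.ranks.append(1 + sum(s > true_score for s in row))

    def finalize(self) -> float:
        return sum(self.ranks) / len(self.ranks)


def toy_evaluate(
    score_all_tails: Callable[[int, int], List[float]],
    triples: Sequence[Triple],
    evaluator: ToyRankEvaluator,
    batch_size: int = 2,
) -> float:
    for start in range(0, len(triples), batch_size):
        batch = triples[start:start + batch_size]
        # scores against all entities are computed once per batch ...
        scores = [score_all_tails(h, r) for h, r, _ in batch]
        # ... and handed to the evaluator together with the batch itself
        evaluator.process_tail_scores(batch, scores)
    return evaluator.finalize()


def score_all_tails(head: int, relation: int) -> List[float]:
    # toy scoring over 4 entities: prefers tail == (head + relation) % 4
    return [1.0 if t == (head + relation) % 4 else 0.0 for t in range(4)]


triples = [(0, 1, 1), (1, 1, 2), (2, 1, 0)]
mean_rank = toy_evaluate(score_all_tails, triples, ToyRankEvaluator())
```

Keeping the buffer inside the evaluator means the loop never has to hold scores for more than one batch at a time.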

Parameters

model (Model) – The model to evaluate.
mapped_triples (LongTensor) – The triples on which to evaluate. The mapped triples should never contain inverse triples - these are created by the model class on the fly.
evaluator (Evaluator) – The evaluator.
only_size_probing (bool) – If True, the evaluation is only performed for two batches to test the memory footprint, especially on GPUs.
batch_size (Optional[int]) – A positive integer (>0) used as batch size. Generally chosen as large as possible. Defaults to 1 if None.
slice_size (Optional[int]) – A positive integer (>0) used as the divisor for the scoring function when using slicing.
device (Optional[device]) – The device on which the evaluation shall be run. If None is given, use the model’s device.
use_tqdm (bool) – Should a progress bar be displayed?
tqdm_kwargs (Optional[Mapping[str, str]]) – Additional keyword based arguments passed to the progress bar.
restrict_entities_to (Optional[Collection[int]]) – Optionally restrict the evaluation to the given entity IDs. This may be useful if one is only interested in a part of the entities, e.g. due to type constraints, but wants to train on all available data. For ranking the entities, we still compute all scores for all possible replacement entities to avoid irregular access patterns which might decrease performance, but the scores are filtered afterwards to keep only those of interest. If provided, we assume by default that the triples are already filtered, such that they contain only the entities of interest. To explicitly filter within this method, pass pre_filtered_triples=False.
restrict_relations_to (Optional[Collection[int]]) – Optionally restrict the evaluation to the given relation IDs. This may be useful if one is only interested in a part of the relations, e.g. due to relation types, but wants to train on all available data. If provided, we assume by default that the triples are already filtered, such that they contain only the relations of interest. To explicitly filter within this method, pass pre_filtered_triples=False.
do_time_consuming_checks (bool) – Whether to perform some time-consuming checks on the provided arguments. Currently, this encompasses:
- If restrict_entities_to or restrict_relations_to is not None, check whether the triples have been filtered.
Disabling this option can accelerate the method. Only effective if pre_filtered_triples is set to True.
pre_filtered_triples (bool) – Whether the triples have been pre-filtered to adhere to restrict_entities_to / restrict_relations_to. When set to True, and the triples have not been filtered, the results may be invalid. Pre-filtering the triples accelerates this method, and is recommended when evaluating multiple times on the same set of triples.
additional_filter_triples (Union[None, LongTensor, List[LongTensor]]) – Additional true triples to filter out during filtered evaluation.
targets (Collection[Literal['head', 'relation', 'tail']]) – The prediction targets.
mode (Optional[Literal['training', 'validation', 'testing']]) – The inductive mode, or None for transductive evaluation.
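What pre-filtering for restrict_entities_to / restrict_relations_to amounts to can be sketched as follows. This is a minimal plain-Python sketch under stated assumptions, not the library's code: `prefilter_triples` and `restrict_scores` are hypothetical helpers, triples are tuples instead of a LongTensor, and a triple is assumed to be kept only when both its head and its tail are in the restricted entity set.

```python
from typing import Collection, Dict, List, Optional, Sequence, Tuple

Triple = Tuple[int, int, int]  # (head, relation, tail) as integer IDs


def prefilter_triples(
    triples: Sequence[Triple],
    restrict_entities_to: Optional[Collection[int]] = None,
    restrict_relations_to: Optional[Collection[int]] = None,
) -> List[Triple]:
    """Keep only triples that satisfy the given restrictions (hypothetical helper)."""
    entities = None if restrict_entities_to is None else set(restrict_entities_to)
    relations = None if restrict_relations_to is None else set(restrict_relations_to)
    kept = []
    for h, r, t in triples:
        # assumption: both head and tail must be entities of interest
        if entities is not None and (h not in entities or t not in entities):
            continue
        if relations is not None and r not in relations:
            continue
        kept.append((h, r, t))
    return kept


def restrict_scores(
    scores: Sequence[float], restrict_entities_to: Collection[int]
) -> Dict[int, float]:
    """Score all entities first, then keep only the scores of interest."""
    return {e: scores[e] for e in sorted(restrict_entities_to)}


triples = [(0, 0, 1), (0, 1, 2), (3, 0, 1)]
by_entity = prefilter_triples(triples, restrict_entities_to=[0, 1, 2])
by_both = prefilter_triples(triples, restrict_entities_to=[0, 1, 2], restrict_relations_to=[0])
```

Filtering once up front and reusing the result mirrors the recommendation to pre-filter when evaluating repeatedly on the same set of triples.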

Raises

NotImplementedError – if relation prediction evaluation is requested
ValueError – if the triples are declared pre-filtered but contain unwanted entities (can only be detected with the time-consuming checks).
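The kind of consistency check that can surface this ValueError might look like the sketch below. It is an illustration only: `check_prefiltered` is a hypothetical name, the error messages are invented, and the additional check on relations is an assumption made for symmetry.

```python
from typing import Collection, Optional, Sequence, Tuple

Triple = Tuple[int, int, int]  # (head, relation, tail) as integer IDs


def check_prefiltered(
    triples: Sequence[Triple],
    restrict_entities_to: Optional[Collection[int]] = None,
    restrict_relations_to: Optional[Collection[int]] = None,
) -> None:
    """Raise ValueError if supposedly pre-filtered triples violate a restriction."""
    if restrict_entities_to is not None:
        allowed = set(restrict_entities_to)
        # a full pass over all heads and tails; this is what makes the check costly
        unwanted = {e for h, _, t in triples for e in (h, t)} - allowed
        if unwanted:
            raise ValueError(f"triples contain unwanted entities: {sorted(unwanted)}")
    if restrict_relations_to is not None:
        allowed = set(restrict_relations_to)
        unwanted = {r for _, r, _ in triples} - allowed
        if unwanted:
            raise ValueError(f"triples contain unwanted relations: {sorted(unwanted)}")


# passes silently: every entity and relation is allowed
check_prefiltered([(0, 0, 1)], restrict_entities_to=[0, 1], restrict_relations_to=[0])
```

Skipping such a check saves a pass over the triples, at the cost of silently invalid results if the triples were not actually filtered.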

Return type

MetricResults

Returns

the evaluation results