evaluate

evaluate(model, mapped_triples, evaluators, only_size_probing=False, batch_size=None, slice_size=None, device=None, squeeze=True, use_tqdm=True, tqdm_kwargs=None, restrict_entities_to=None, do_time_consuming_checks=True, additional_filtered_triples=None)

Evaluate metrics for model on mapped triples.

The model is used to predict scores for all tails and all heads for each triple. Subsequently, each abstract evaluator is applied to the scores, also receiving the batch itself (e.g. to compute entity-specific metrics). Thereby, the (potentially) expensive score computation against all entities is done only once. The metric evaluators are expected to maintain their own internal buffers. They are returned after running the evaluation and should offer a way to extract the final metrics.
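The loop structure described above can be sketched in plain Python. This is not the actual implementation; `MeanScoreEvaluator`, `process_batch`, `finalize`, and `run_evaluation` are hypothetical names used only to illustrate how scores are computed once per batch and shared by all evaluators, each accumulating into its own buffer:

```python
# Sketch (not the real implementation): score each batch once,
# then hand the same scores to every evaluator.

class MeanScoreEvaluator:
    """Toy evaluator with an internal buffer (hypothetical, for illustration)."""

    def __init__(self):
        self.scores = []

    def process_batch(self, batch, scores):
        # Receives both the batch and the scores, e.g. so that
        # entity-specific metrics could be computed from the batch.
        self.scores.extend(scores)

    def finalize(self):
        # Extract the final metric from the internal buffer.
        return sum(self.scores) / len(self.scores)


def run_evaluation(batches, score_fn, evaluators):
    for batch in batches:
        scores = score_fn(batch)       # expensive step, done only once per batch
        for evaluator in evaluators:   # every evaluator sees the same scores
            evaluator.process_batch(batch, scores)
    return [evaluator.finalize() for evaluator in evaluators]


# Toy data: (head, relation, tail) triples and a dummy scoring function.
batches = [[(0, 0, 1), (1, 0, 2)], [(2, 1, 0)]]
score_fn = lambda batch: [float(h + t) for h, _, t in batch]
results = run_evaluation(batches, score_fn, [MeanScoreEvaluator()])
```

The key design point mirrored here is that the score computation is hoisted out of the per-evaluator loop, so adding more evaluators costs almost nothing extra.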

Parameters
  • model (Model) – The model to evaluate.

  • mapped_triples (LongTensor) – The triples on which to evaluate. The mapped triples should never contain inverse triples - these are created by the model class on the fly.

  • evaluators (Union[Evaluator, Collection[Evaluator]]) – An evaluator or a list of evaluators working on batches of triples and corresponding scores.

  • only_size_probing (bool) – If True, the evaluation is only performed for two batches to test the memory footprint, especially on GPUs.

  • batch_size (Optional[int]) – A positive integer used as the batch size. Generally chosen as large as possible. Defaults to 1 if None.

  • slice_size (Optional[int]) – A positive integer used as the divisor for the scoring function when using slicing.

  • device (Optional[device]) – The device on which the evaluation shall be run. If None is given, use the model’s device.

  • squeeze (bool) – If True, return a single MetricResults instance instead of a one-element list when only one evaluator was given.

  • use_tqdm (bool) – Should a progress bar be displayed?

  • restrict_entities_to (Optional[LongTensor]) – Optionally restrict the evaluation to the given entity IDs. This may be useful if one is only interested in a subset of the entities, e.g. due to type constraints, but wants to train on all available data. For ranking the entities, we still compute scores for all possible replacement entities to avoid irregular access patterns which might decrease performance, but the scores will afterwards be filtered to keep only those of interest. If provided, we assume that the triples have already been filtered such that they only contain the entities of interest.

  • do_time_consuming_checks (bool) – Whether to perform some time-consuming checks on the provided arguments. Currently, this encompasses checking, if restrict_entities_to is not None, whether the triples have been filtered accordingly. Disabling this option can accelerate the method.

  • additional_filtered_triples (Union[None, LongTensor, List[LongTensor]]) – Additional true triples to filter out during filtered evaluation.

Return type

Union[MetricResults, List[MetricResults]]
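
Because the return type depends on squeeze and the number of evaluators, calling code may want to normalize it. The helper below is hypothetical and merely mirrors the documented behavior:

```python
# Hypothetical helper mirroring the documented squeeze behavior:
# a single result is unwrapped only when squeeze is set and exactly
# one evaluator produced a result.

def maybe_squeeze(results, squeeze=True):
    """Return the lone result if squeeze is set and there is exactly one."""
    if squeeze and len(results) == 1:
        return results[0]
    return results
```

With a single evaluator and the default squeeze=True, callers thus receive a bare MetricResults; with squeeze=False or multiple evaluators, a list is returned and should be indexed accordingly.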