# InverseHarmonicMeanRank

class InverseHarmonicMeanRank[source]

The inverse harmonic mean rank.

The mean reciprocal rank (MRR) is the arithmetic mean of reciprocal ranks, and thus the inverse of the harmonic mean of the ranks. It is defined as:

$IHMR = MRR =\frac{1}{|\mathcal{I}|} \sum_{r \in \mathcal{I}} r^{-1}$

Warning

It has been argued by [fuhr2018] that the mean reciprocal rank has theoretical flaws. However, this opinion is not undisputed; cf. [sakai2021].

Despite its flaws, MRR is still often used during early stopping due to its sensitivity to low rank values. While hits @ k ignores changes among high rank values completely, and the mean rank changes uniformly across the full value range, the mean reciprocal rank is more affected by changes of low rank values than of high ones, without disregarding the latter entirely as hits @ k does for high rank values. It can therefore be considered a soft version of hits @ k that is less sensitive to outliers. It is bound to $$(0, 1]$$, where values closer to 1 are better.

Let

$H_m(n) = \sum \limits_{i=1}^{n} i^{-m}$

denote the generalized harmonic number, with $$H(n) := H_{1}(n)$$ for brevity. Thus, we have

$\mathbb{E}\left[r_i^{-1}\right] = \frac{H(N_i)}{N_i}$

and hence

$\begin{split}\mathbb{E}\left[\textrm{MRR}\right] &= \mathbb{E}\left[\frac{1}{n} \sum \limits_{i=1}^n r_i^{-1}\right] \\ &= \frac{1}{n} \sum \limits_{i=1}^n \mathbb{E}\left[r_i^{-1}\right] \\ &= \frac{1}{n} \sum \limits_{i=1}^n \frac{H(N_i)}{N_i}\end{split}$
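The per-rank expectation $$\mathbb{E}[r^{-1}] = H(N)/N$$ can be sanity-checked numerically; a sketch assuming, as above, uniformly distributed ranks:

```python
import random

def harmonic(n):
    """Harmonic number H(n) = sum_{i=1}^{n} 1/i."""
    return sum(1.0 / i for i in range(1, n + 1))

N = 10
closed_form = harmonic(N) / N  # E[1/r] for r ~ U(1, N)

# Monte Carlo estimate of the same expectation
random.seed(0)
estimate = sum(1.0 / random.randint(1, N) for _ in range(100_000)) / 100_000

print(round(closed_form, 4))             # 0.2929
print(abs(estimate - closed_form) < 0.01)  # True
```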

For the variance, we have for the individual ranks

$\begin{split}\mathbb{V}\left[r_i^{-1}\right] &= \frac{1}{N_i} \sum \limits_{j=1}^{N_i} \left(\frac{H(N_i)}{N_i} - \frac{1}{j}\right)^2 \\ &= \frac{N_i \cdot H_2(N_i) - H(N_i)^2}{N_i^2}\end{split}$

and thus overall, assuming the individual ranks are independent,

$\begin{split}\mathbb{V}\left[\textrm{MRR}\right] &= \mathbb{V}\left[\frac{1}{n} \sum \limits_{i=1}^n r_i^{-1}\right] \\ &= \frac{1}{n^2} \sum \limits_{i=1}^n \mathbb{V}\left[r_i^{-1}\right] \\ &= \frac{1}{n^2} \sum \limits_{i=1}^n \frac{N_i \cdot H_2(N_i) - H(N_i)^2}{N_i^2}\end{split}$
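The closed form for the per-rank variance is an algebraic identity and can be verified against the direct definition; a small sketch:

```python
def harmonic(n, m=1):
    """Generalized harmonic number H_m(n) = sum_{i=1}^{n} i^(-m)."""
    return sum(i ** -m for i in range(1, n + 1))

N = 10
mean = harmonic(N) / N  # E[1/r] = H(N) / N
# variance of 1/r for r ~ U(1, N), computed directly from the definition
direct = sum((mean - 1.0 / j) ** 2 for j in range(1, N + 1)) / N
# closed form (N * H_2(N) - H(N)^2) / N^2
closed = (N * harmonic(N, m=2) - harmonic(N) ** 2) / N ** 2
print(abs(direct - closed) < 1e-12)  # True
```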

Attributes Summary

- `binarize` — whether the metric needs binarized scores
- `closed_expectation` — whether there is a closed-form solution of the expectation
- `closed_variance` — whether there is a closed-form solution of the variance
- `increasing` — whether it is increasing, i.e., larger values are better
- `key` — Return the key for use in metric result dictionaries.
- `name` — The name of the metric
- `needs_candidates` — whether the metric requires the number of candidates for each ranking task
- `supported_rank_types` — the supported rank types.
- `supports_weights` — whether the metric supports weights
- `synonyms` — synonyms for this metric
- `value_range` — the value range

Methods Summary

- `__call__(ranks[, num_candidates, weights])` — Evaluate the metric.
- `expected_value(num_candidates[, ...])` — Compute expected metric value.
- `extra_repr()` — Generate the extra repr, cf. `torch.nn.Module.extra_repr()`.
- `get_description()` — Get the description.
- `get_link()` — Get the link from the docdata.
- `get_range()` — Get the math notation for the range of this metric.
- `get_sampled_values(num_candidates, num_samples)` — Calculate the metric on sampled rank arrays.
- `iter_extra_repr()` — Iterate over the components of the extra_repr().
- `numeric_expected_value(**kwargs)` — Compute expected metric value by summation.
- `numeric_expected_value_with_ci(**kwargs)` — Estimate expected value with confidence intervals.
- `numeric_variance(**kwargs)` — Compute variance by summation.
- `numeric_variance_with_ci(**kwargs)` — Estimate variance with confidence intervals.
- `std(num_candidates[, num_samples, weights])` — Compute the standard deviation.
- `variance(num_candidates[, num_samples, weights])` — Compute variance.

Attributes Documentation

binarize: ClassVar[bool] = False

whether the metric needs binarized scores

closed_expectation: ClassVar[bool] = True

whether there is a closed-form solution of the expectation

closed_variance: ClassVar[bool] = True

whether there is a closed-form solution of the variance

increasing: ClassVar[bool] = True

whether it is increasing, i.e., larger values are better

key

Return the key for use in metric result dictionaries.

Return type:

str

name: ClassVar[str] = 'Mean Reciprocal Rank (MRR)'

The name of the metric

needs_candidates: ClassVar[bool] = False

whether the metric requires the number of candidates for each ranking task

supported_rank_types: ClassVar[Collection[Literal['optimistic', 'realistic', 'pessimistic']]] = ('optimistic', 'realistic', 'pessimistic')

the supported rank types. Most of the time equal to all rank types

supports_weights: ClassVar[bool] = True

whether the metric supports weights

synonyms: ClassVar[Collection[str]] = ('mean_reciprocal_rank', 'mrr')

synonyms for this metric

value_range: ClassVar[ValueRange] = ValueRange(lower=0, lower_inclusive=False, upper=1, upper_inclusive=True)

the value range

Methods Documentation

__call__(ranks, num_candidates=None, weights=None)[source]

Evaluate the metric.

Parameters:
Return type:

float
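The computation itself is simple; a plain-Python stand-in (not the library code, which operates on numpy arrays; treating weights as a weighted arithmetic mean of the reciprocal ranks is an assumption here):

```python
def inverse_harmonic_mean_rank(ranks, weights=None):
    """Sketch: (weighted) arithmetic mean of reciprocal ranks."""
    recip = [1.0 / r for r in ranks]
    if weights is None:
        return sum(recip) / len(recip)
    # weighted mean: sum(w_i / r_i) / sum(w_i)
    return sum(w * x for w, x in zip(weights, recip)) / sum(weights)

print(round(inverse_harmonic_mean_rank([1, 2, 4]), 4))             # 0.5833
print(round(inverse_harmonic_mean_rank([1, 2, 4], [2, 1, 1]), 4))  # 0.6875
```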

expected_value(num_candidates, num_samples=None, weights=None, **kwargs)[source]

Compute expected metric value.

The expectation is computed under the assumption that each individual rank follows a discrete uniform distribution $$\mathcal{U}\left(1, N_i\right)$$, where $$N_i$$ denotes the number of candidates for ranking task $$r_i$$.

Parameters:
Return type:

float

Returns:

the expected value of this metric

Raises:

NoClosedFormError – raised if a closed-form expectation has not been implemented and no number of samples is given

Note

Prefers analytical solution, if available, but falls back to numeric estimation via summation, cf. RankBasedMetric.numeric_expected_value().
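The prefer-analytical-else-numeric pattern described in the note can be sketched as follows (hypothetical function names, standing in for the actual class machinery):

```python
import random

def harmonic(n):
    return sum(1.0 / i for i in range(1, n + 1))

def expected_mrr_closed(num_candidates):
    """Closed-form E[MRR] = (1/n) * sum_i H(N_i) / N_i."""
    return sum(harmonic(n) / n for n in num_candidates) / len(num_candidates)

def expected_mrr(num_candidates, num_samples=None, closed_form=True):
    """Prefer the analytical solution; otherwise fall back to sampling."""
    if closed_form:
        return expected_mrr_closed(num_candidates)
    if num_samples is None:
        raise ValueError("no closed form available and no number of samples given")
    total = 0.0
    for _ in range(num_samples):
        # one rank per task, each drawn from U(1, N_i)
        ranks = [random.randint(1, n) for n in num_candidates]
        total += sum(1.0 / r for r in ranks) / len(ranks)
    return total / num_samples

print(round(expected_mrr([5, 10]), 4))  # 0.3748
```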

extra_repr()

Generate the extra repr, cf. `torch.nn.Module.extra_repr()`.

Return type:

str

Returns:

the extra part of the repr()

classmethod get_description()

Get the description.

Return type:

str

classmethod get_link()

Get the link from the docdata.

Return type:

str

classmethod get_range()

Get the math notation for the range of this metric.

Return type:

str

get_sampled_values(num_candidates, num_samples, weights=None, generator=None, memory_intense=True)

Calculate the metric on sampled rank arrays.

Parameters:
Return type:

ndarray

Returns:

shape: (num_samples,) the metric evaluated on num_samples sampled rank arrays
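A stand-alone sketch of what such a sampler computes (the library works on numpy arrays and supports weights; this simplified version omits both):

```python
import random

def get_sampled_mrr_values(num_candidates, num_samples, seed=None):
    """Evaluate MRR on num_samples rank arrays sampled uniformly at random.

    Each sampled array holds one rank per task, drawn from U(1, N_i);
    the result has length num_samples, i.e. shape (num_samples,).
    """
    rng = random.Random(seed)
    values = []
    for _ in range(num_samples):
        ranks = [rng.randint(1, n) for n in num_candidates]
        values.append(sum(1.0 / r for r in ranks) / len(ranks))
    return values

samples = get_sampled_mrr_values([5, 10, 100], num_samples=3, seed=42)
print(len(samples))  # 3
```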

iter_extra_repr()

Iterate over the components of the extra_repr().

This method is typically overridden. A common pattern would be

```python
def iter_extra_repr(self) -> Iterable[str]:
    yield from super().iter_extra_repr()
    yield "<key1>=<value1>"
    yield "<key2>=<value2>"
```

Return type:

Iterable[str]

Returns:

an iterable over individual components of the extra_repr()
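A toy illustration of the override pattern (hypothetical classes, not the library's base class):

```python
from typing import Iterable

class Base:
    def iter_extra_repr(self) -> Iterable[str]:
        # base class contributes no components
        return iter(())

    def extra_repr(self) -> str:
        # join the components yielded by iter_extra_repr()
        return ", ".join(self.iter_extra_repr())

class WithK(Base):
    def __init__(self, k: int):
        self.k = k

    def iter_extra_repr(self) -> Iterable[str]:
        # extend the parent's components with our own key
        yield from super().iter_extra_repr()
        yield f"k={self.k}"

print(WithK(10).extra_repr())  # k=10
```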

numeric_expected_value(**kwargs)

Compute expected metric value by summation.

The expectation is computed under the assumption that each individual rank follows a discrete uniform distribution $$\mathcal{U}\left(1, N_i\right)$$, where $$N_i$$ denotes the number of candidates for ranking task $$r_i$$.

Parameters:

kwargs – keyword-based parameters passed to get_sampled_values()

Return type:

float

Returns:

The estimated expected value of this metric

Warning

Depending on the metric, the estimate may not be very accurate and converge slowly, cf. https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.rv_discrete.expect.html

numeric_expected_value_with_ci(**kwargs)

Estimate expected value with confidence intervals.

Return type:

ndarray

numeric_variance(**kwargs)

Compute variance by summation.

The variance is computed under the assumption that each individual rank follows a discrete uniform distribution $$\mathcal{U}\left(1, N_i\right)$$, where $$N_i$$ denotes the number of candidates for ranking task $$r_i$$.

Parameters:

kwargs – keyword-based parameters passed to get_sampled_values()

Return type:

float

Returns:

The estimated variance of this metric

Warning

Depending on the metric, the estimate may not be very accurate and converge slowly, cf. https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.rv_discrete.expect.html

numeric_variance_with_ci(**kwargs)

Estimate variance with confidence intervals.

Return type:

ndarray

std(num_candidates, num_samples=None, weights=None, **kwargs)

Compute the standard deviation.

Parameters:
Return type:

float

Returns:

The standard deviation (i.e. the square root of the variance) of this metric

For a detailed explanation, cf. RankBasedMetric.variance().

variance(num_candidates, num_samples=None, weights=None, **kwargs)[source]

Compute variance.

The variance is computed under the assumption that each individual rank follows a discrete uniform distribution $$\mathcal{U}\left(1, N_i\right)$$, where $$N_i$$ denotes the number of candidates for ranking task $$r_i$$.

Parameters:
Return type:

float

Returns:

The variance of this metric

Raises:

NoClosedFormError – raised if a closed-form variance has not been implemented and no number of samples is given

Note

Prefers analytical solution, if available, but falls back to numeric estimation via summation, cf. RankBasedMetric.numeric_variance().