HashDiversityInfo

class HashDiversityInfo(uniques_per_representation: list[float], uniques_total: float)[source]

Bases: NamedTuple

A named tuple holding hash-uniqueness ratios.

A pair (uniques_per_representation, uniques_total), where uniques_per_representation is a list with the ratio of unique hashes for each token representation, and uniques_total is the ratio of unique hashes when all token representations are concatenated.
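As a rough illustration of what the two ratios mean, the sketch below counts distinct rows in small token matrices. It is not the library's implementation: the `HashDiversityInfo` class here is a local stand-in that only mirrors the documented signature, and `unique_ratio`, `anchor_tokens`, and `relation_tokens` are hypothetical names and made-up data.

```python
from typing import NamedTuple, Sequence


class HashDiversityInfo(NamedTuple):
    """Local stand-in mirroring the documented signature (not the library class)."""

    uniques_per_representation: list[float]
    uniques_total: float


def unique_ratio(rows: Sequence[Sequence[int]]) -> float:
    """Hypothetical helper: fraction of distinct rows among all rows."""
    return len({tuple(row) for row in rows}) / len(rows)


# Made-up token IDs for four entities under two token representations.
anchor_tokens = [[0, 1], [0, 1], [2, 3], [2, 3]]
relation_tokens = [[5], [6], [5], [5]]

# Concatenate both representations row by row into one combined matrix.
combined = [a + r for a, r in zip(anchor_tokens, relation_tokens)]

info = HashDiversityInfo(
    uniques_per_representation=[
        unique_ratio(anchor_tokens),
        unique_ratio(relation_tokens),
    ],
    uniques_total=unique_ratio(combined),
)

print(info.uniques_per_representation)  # [0.5, 0.5]
print(info.uniques_total)  # 0.75
```

In this toy data each representation alone distinguishes only half of the entities, while the concatenation distinguishes three of four, which is why uniques_total can exceed every entry of uniques_per_representation.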

Create new instance of HashDiversityInfo(uniques_per_representation, uniques_total)

Attributes Summary

uniques_per_representation

A list of ratios, one per representation, in creation order, e.g., [0.58, 0.82] for AnchorTokenization and RelationTokenization, respectively.

uniques_total

A scalar ratio of unique rows when combining all representations into one matrix, e.g. 0.95.

Attributes Documentation

Parameters:
uniques_per_representation: list[float]

A list of ratios, one per representation, in creation order, e.g., [0.58, 0.82] for AnchorTokenization and RelationTokenization, respectively.

uniques_total: float

A scalar ratio of unique rows when combining all representations into one matrix, e.g. 0.95.
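Because the class derives from NamedTuple, its fields can be read by name, unpacked positionally, or converted to a dictionary. The following sketch shows this with the example values from above. The class is again a local stand-in with the documented field names rather than an import from the library, and the `names` list is an assumption made only for labelling.

```python
from typing import NamedTuple


class HashDiversityInfo(NamedTuple):
    """Local stand-in mirroring the documented signature (not the library class)."""

    uniques_per_representation: list[float]
    uniques_total: float


info = HashDiversityInfo([0.58, 0.82], 0.95)

# Positional unpacking follows the field order of the tuple.
per_repr, total = info

# Pair each ratio with the representation it belongs to (creation order).
names = ["AnchorTokenization", "RelationTokenization"]  # assumed labels
by_name = dict(zip(names, info.uniques_per_representation))

# Standard NamedTuple helper for serialization or logging.
as_dict = info._asdict()
```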