BloomFilterer

class BloomFilterer(mapped_triples, error_rate=0.001)

Bases: Filterer

A filterer for negative triples based on the Bloom filter.

Implemented in pure PyTorch as a proper module that can be moved to the GPU and supports batch-wise computation.

Initialize the Bloom-filter-based filterer.

Parameters:
  • mapped_triples (LongTensor) – The ID-based triples.

  • error_rate (float) – The desired error rate, i.e., the target false-positive rate of the filter.

Methods Summary

add(triples)

Add triples to the Bloom filter.

contains(batch)

Check whether a triple is contained.

num_bits(num[, error_rate])

Determine the required number of bits.

num_probes(num_elements, num_bits)

Determine the number of probes / hashing rounds.

probe(batch)

Iterate over indices from the probes.

Methods Documentation

add(triples)

Add triples to the Bloom filter.

Return type:

None

Parameters:

triples (LongTensor) – The ID-based triples to add.

contains(batch)

Check whether a triple is contained.

Parameters:

batch (LongTensor) – shape: (batch_size, 3) The batch of triples.

Return type:

BoolTensor

Returns:

shape: (batch_size,) Whether each triple is reported as contained. False guarantees that the triple is not among the indexed triples. True may be a false positive.
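The one-sided guarantee above can be illustrated with a toy Bloom filter. This is a minimal pure-Python sketch, not the library's PyTorch implementation: the class name ToyBloomFilter, the SHA-256-based hashing, and the chosen sizes are all assumptions made for illustration.

```python
import hashlib


class ToyBloomFilter:
    """Tiny pure-Python Bloom filter over (head, relation, tail) ID triples."""

    def __init__(self, num_bits: int, num_probes: int) -> None:
        self.num_bits = num_bits
        self.num_probes = num_probes
        self.bits = [False] * num_bits

    def _indices(self, triple):
        # One bit index per hashing round; the round number salts the hash.
        for k in range(self.num_probes):
            digest = hashlib.sha256(f"{k}:{triple}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, triples) -> None:
        # Set every probed bit for every triple.
        for triple in triples:
            for index in self._indices(tuple(triple)):
                self.bits[index] = True

    def contains(self, batch):
        # True only if all probed bits are set; one unset bit proves absence.
        return [all(self.bits[i] for i in self._indices(tuple(t))) for t in batch]


known = [(0, 0, 1), (1, 0, 2), (2, 1, 3)]
toy_filter = ToyBloomFilter(num_bits=256, num_probes=4)
toy_filter.add(known)
result = toy_filter.contains(known + [(9, 9, 9)])
print(result)
```

Every added triple is reported as contained, because adding it set exactly the bits that the lookup probes. The unseen triple is most likely reported as absent, but a collision on all probed bits would make it a false positive.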

static num_bits(num, error_rate=0.01)

Determine the required number of bits.

Parameters:
  • num (int) – The number of elements the Bloom filter shall store.

  • error_rate (float) – The desired error rate, i.e., the target false-positive rate of the filter.

Return type:

int

Returns:

The required number of bits.
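As a rough guide to the sizes involved, the textbook Bloom filter sizing rule is m = -n ln(p) / (ln 2)^2 for n elements and a false-positive rate p. The sketch below assumes this rule; the page does not state the exact formula the method uses, and the name num_bits_sketch is hypothetical.

```python
import math


def num_bits_sketch(num: int, error_rate: float = 0.01) -> int:
    """Bits needed to store `num` elements at the target false-positive rate."""
    if num <= 0 or not 0.0 < error_rate < 1.0:
        raise ValueError("num must be positive and error_rate must lie in (0, 1)")
    # m = -n * ln(p) / (ln 2)^2, rounded up to a whole number of bits.
    return math.ceil(-num * math.log(error_rate) / math.log(2) ** 2)


bits = num_bits_sketch(1_000, error_rate=0.01)
print(bits, bits / 1_000)  # roughly 9.6 bits per element at a 1% error rate
```

Under this rule, the memory cost grows linearly with the number of elements and only logarithmically as the error rate shrinks.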

static num_probes(num_elements, num_bits)

Determine the number of probes / hashing rounds.

Parameters:
  • num_elements (int) – The number of elements.

  • num_bits (int) – The number of bits, i.e., the size of the Bloom filter.

Returns:

The number of hashing rounds.
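For a filter of m bits holding n elements, the textbook choice for the number of hashing rounds is k = (m / n) ln 2. The following sketch assumes this rule with rounding and a lower bound of one round; the name num_probes_sketch is hypothetical, and the library's exact rounding may differ.

```python
import math


def num_probes_sketch(num_elements: int, num_bits: int) -> int:
    """Number of hashing rounds for a filter of the given size."""
    if num_elements <= 0 or num_bits <= 0:
        raise ValueError("num_elements and num_bits must be positive")
    # k = (m / n) * ln 2, rounded to an integer and never below one round.
    return max(1, round(num_bits / num_elements * math.log(2)))


print(num_probes_sketch(num_elements=1_000, num_bits=9_586))  # → 7
```

Fewer rounds than this leave many bits unused, while more rounds fill the filter too quickly. Both effects raise the false-positive rate.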

probe(batch)

Iterate over indices from the probes.

Parameters:

batch (LongTensor) – shape: (batch_size, 3) A batch of elements.

Yields:

Indices of the k-th round, shape: (batch_size,).

Return type:

Iterable[LongTensor]
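The round-wise iteration can be pictured as a generator that yields one bit index per triple in each round. The sketch below uses plain Python lists instead of tensors. The mixing constants and the name probe_sketch are hypothetical and are not taken from the library.

```python
from typing import Iterator, List, Tuple

Triple = Tuple[int, int, int]


def probe_sketch(batch: List[Triple], num_probes: int, num_bits: int) -> Iterator[List[int]]:
    """Yield, for each round k, one bit index per triple in the batch."""
    # Hypothetical mixing constants, chosen only for illustration.
    primes = (73_856_093, 19_349_663, 83_492_791)
    for k in range(num_probes):
        # Each round varies the hash with k, so the rounds probe different bits.
        yield [
            (sum(p * x for p, x in zip(primes, triple)) * (k + 1) + k) % num_bits
            for triple in batch
        ]


batch = [(0, 0, 1), (1, 0, 2), (2, 1, 3)]
rounds = list(probe_sketch(batch, num_probes=3, num_bits=64))
for k, indices in enumerate(rounds):
    print(k, indices)
```

Yielding one round at a time lets a caller set or test the bits for a whole batch per round without materializing all indices at once.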