SparseBFSSearcher

class SparseBFSSearcher(max_iter: int = 5, device: str | device | None = None)[source]

Find closest anchors using torch_sparse on a GPU.

Initialize the tokenizer.

Parameters:

Methods Summary

`__call__`(edge_index, anchors, k[, num_entities])	Find the \(k\) closest anchor nodes for each entity.
`bfs`(anchors, edge_list, max_iter, k, device)	Determine the candidate pool using breadth-first search.
`create_adjacency`(edge_index[, num_entities])	Create a sparse adjacency matrix (in the form of the edge list) from a given edge index.
`iter_extra_repr`()	Iterate over the components of the `extra_repr()`.
`select`(pool, k)	Select \(k\) anchors from the given pools.

Methods Documentation

__call__(edge_index: ndarray, anchors: ndarray, k: int, num_entities: int | None = None) → ndarray[source]

Find the \(k\) closest anchor nodes for each entity.

Parameters:

Returns:

shape: (n, k), -1 <= res < a the Ids of the closest anchors

Return type:

ndarray

static bfs(anchors: ndarray, edge_list: tensor, max_iter: int, k: int, device: device) → ndarray[source]

Determine the candidate pool using breadth-first search.

Parameters:

anchors (ndarray) – shape: (a,) the anchor node IDs
edge_list (tensor) – shape: (2, n) the edge list with symmetric edges and self-loops
max_iter (int) – the maximum number of hops to consider
k (int) – the minimum number of anchor nodes to reach
device (device) – the device on which the calculations are done

Returns:

shape: (n, a) a boolean array indicating whether anchor \(j\) is in the set of \(k\) closest anchors for node \(i\)

Raises:

ImportError – If torch_sparse is not installed

Return type:

ndarray

static create_adjacency(edge_index: ndarray, num_entities: int | None = None) → tensor[source]

Create a sparse adjacency matrix (in the form of the edge list) from a given edge index.

Parameters:

edge_index (ndarray) – shape: (2, m) the edge index
num_entities (int | None) – The number of entities. If not given, inferred from the edge index

Returns:

shape: (2, 2m + n) edge list with inverse edges and self-loops

Return type:

tensor

iter_extra_repr() → Iterable[str][source]

Iterate over the components of the extra_repr().

This method is typically overridden. A common pattern would be

def iter_extra_repr(self) -> Iterable[str]:
    yield from super().iter_extra_repr()
    yield "<key1>=<value1>"
    yield "<key2>=<value2>"

static select(pool: tensor, k: int) → ndarray[source]

Select \(k\) anchors from the given pools.

Parameters:

pool (tensor) – shape: (n, a) the anchor candidates for each node with distances
k (int) – the number of candidates to select

Returns:

shape: (n, k) the selected anchors. May contain -1 if there is an insufficient number of candidates

Return type:

ndarray