AnchorTokenizer
- class AnchorTokenizer(selection: str | AnchorSelection | type[AnchorSelection] | None = None, selection_kwargs: Mapping[str, Any] | None = None, searcher: str | AnchorSearcher | type[AnchorSearcher] | None = None, searcher_kwargs: Mapping[str, Any] | None = None)[source]
Bases:
Tokenizer
Tokenize entities by representing them as a bag of anchor entities.
The anchors assigned to each entity are chosen by shortest-path distance.
Initialize the tokenizer. A usage sketch follows the parameter list below.
- Parameters:
selection (str | AnchorSelection | type[AnchorSelection] | None) – the anchor node selection strategy.
selection_kwargs (Mapping[str, Any] | None) – additional keyword-based arguments passed to the selection strategy
searcher (str | AnchorSearcher | type[AnchorSearcher] | None) – the component for searching the closest anchors for each entity
searcher_kwargs (Mapping[str, Any] | None) – additional keyword-based arguments passed to the searcher
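A minimal construction sketch, assuming PyKEEN's pykeen.nn.node_piece module layout; the string names and keyword arguments below are assumptions resolved by the library's class resolver and may differ between versions:

from pykeen.nn.node_piece import AnchorTokenizer

# Select anchor entities by node degree and find the closest anchors per entity
# with a scipy.sparse.csgraph-based breadth-first searcher.
tokenizer = AnchorTokenizer(
    selection="degree",                     # assumed resolvable name for a degree-based selection
    selection_kwargs=dict(num_anchors=32),  # assumed keyword: how many anchor entities to select
    searcher="csgraph",                     # assumed resolvable name for the csgraph-based searcher
)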
Methods Summary
__call__
(mapped_triples, num_tokens, ...) Tokenize the entities contained in the given triples.
Methods Documentation
- __call__(mapped_triples: Tensor, num_tokens: int, num_entities: int, num_relations: int) → tuple[int, Tensor] [source]
Tokenize the entities contained in the given triples.
- Parameters:
mapped_triples (Tensor) – shape: (num_triples, 3), the ID-based triples
num_tokens (int) – the number of tokens to select for each entity
num_entities (int) – the number of entities
num_relations (int) – the number of relations
- Returns:
A pair of the vocabulary size and the assignment tensor of shape: (num_entities, num_tokens), with -1 <= res < vocabulary_size; the entries are the selected anchor IDs for each entity, where -1 is used as a padding token.
- Return type:
tuple[int, Tensor]
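A hedged call sketch on a toy graph; the triples and hyper-parameters are made up for illustration, and num_anchors is an assumed keyword of the default selection strategy:

import torch
from pykeen.nn.node_piece import AnchorTokenizer

# toy ID-based triples with columns (head, relation, tail)
mapped_triples = torch.as_tensor(
    [[0, 0, 1], [1, 0, 2], [2, 1, 3], [3, 1, 0]],
    dtype=torch.long,
)

tokenizer = AnchorTokenizer(selection_kwargs=dict(num_anchors=2))  # assumed keyword
vocabulary_size, assignment = tokenizer(
    mapped_triples=mapped_triples,
    num_tokens=2,     # keep two anchor tokens per entity
    num_entities=4,
    num_relations=2,
)
# assignment has shape (num_entities, num_tokens) = (4, 2); each entry is an
# anchor ID in [0, vocabulary_size), or -1 as padding.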