RelationTokenizer

class RelationTokenizer

Bases: Tokenizer

Tokenize entities by representing them as a bag of relations.

Methods Summary

__call__(mapped_triples, num_tokens, ...)

Tokenize the entities contained in the given triples.

Methods Documentation

__call__(mapped_triples, num_tokens, num_entities, num_relations)

Tokenize the entities contained in the given triples.

Parameters:
  • mapped_triples (LongTensor) – the ID-based triples, shape: (n, 3)

  • num_tokens (int) – the number of tokens to select for each entity

  • num_entities (int) – the number of entities

  • num_relations (int) – the number of relations

Return type:

Tuple[int, LongTensor]

Returns:

A pair (vocabulary_size, tokens). tokens has shape (num_entities, num_tokens) and holds the selected relation IDs for each entity, with -1 <= tokens < vocabulary_size. -1 is used as a padding token.
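
The contract above can be illustrated with a minimal pure-Python sketch. It is not the library's implementation: the function name tokenize_by_relations and the seed parameter are hypothetical, plain lists stand in for LongTensor, and both the offsetting of tail-side relations by num_relations and the choice of vocabulary size are assumptions made for illustration.

```python
import random
from collections import defaultdict
from typing import List, Sequence, Tuple

PADDING = -1  # padding token, as stated in the Returns description


def tokenize_by_relations(
    mapped_triples: Sequence[Tuple[int, int, int]],
    num_tokens: int,
    num_entities: int,
    num_relations: int,
    seed: int = 0,  # hypothetical parameter, only here for reproducibility
) -> Tuple[int, List[List[int]]]:
    """Hypothetical stand-in for RelationTokenizer.__call__ using plain lists."""
    rng = random.Random(seed)

    # Collect the bag (set) of relations each entity takes part in.
    # Assumption: a relation seen from the tail side is offset by
    # num_relations so that the direction of the edge is preserved.
    relations_of = defaultdict(set)
    for head, relation, tail in mapped_triples:
        relations_of[head].add(relation)
        relations_of[tail].add(relation + num_relations)

    # Assumption: the vocabulary covers forward and inverse relations.
    vocabulary_size = 2 * num_relations

    rows = []
    for entity in range(num_entities):
        pool = sorted(relations_of[entity])
        # Sample at most num_tokens distinct relations, then pad with -1.
        chosen = rng.sample(pool, min(num_tokens, len(pool)))
        rows.append(chosen + [PADDING] * (num_tokens - len(chosen)))
    return vocabulary_size, rows


# Three triples over four entities and two relations; entity 3 is isolated.
triples = [(0, 0, 1), (1, 1, 2), (0, 1, 2)]
vocab, tokens = tokenize_by_relations(
    triples, num_tokens=2, num_entities=4, num_relations=2
)
```

In this sketch, entities with fewer than num_tokens distinct relations are right-padded with -1, so an entity that appears in no triple gets a row consisting only of padding.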