TokenizationRepresentation
- class TokenizationRepresentation(assignment, token_representation=None, token_representation_kwargs=None, **kwargs)
Bases: pykeen.nn.representation.Representation
A module holding the result of tokenization.
Initialize the tokenization.
- Parameters
assignment (LongTensor) – shape: (n, num_chosen_tokens) the token assignment.
token_representation (Union[str, Representation, Type[Representation], None]) – shape: (num_total_tokens, *shape) the token representations
token_representation_kwargs (Optional[Mapping[str, Any]]) – additional keyword-based parameters
kwargs – additional keyword-based parameters passed to super.__init__
- Raises
ValueError – if there’s a mismatch between the representation size and the vocabulary size
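The shapes above can be read as a two-step lookup: an index selects a row of the assignment, and each token ID in that row selects a row of the token representation. The following is a minimal pure-Python sketch of that idea, not PyKEEN's implementation. The helper names `check_vocabulary` and `lookup` are hypothetical, nested lists stand in for tensors, and padding of unused token slots is not handled.

```python
from typing import List, Sequence

Vector = List[float]


def check_vocabulary(assignment: Sequence[Sequence[int]], token_table: Sequence[Vector]) -> int:
    """Return the vocabulary size implied by the assignment, or raise on mismatch."""
    # Assumption: token IDs are non-negative and dense, so the largest ID + 1
    # is the number of rows the token table must provide.
    vocabulary_size = max(token for row in assignment for token in row) + 1
    if len(token_table) < vocabulary_size:
        raise ValueError(
            f"token table has {len(token_table)} rows, "
            f"but the assignment uses {vocabulary_size} tokens"
        )
    return vocabulary_size


def lookup(
    assignment: Sequence[Sequence[int]],
    token_table: Sequence[Vector],
    indices: Sequence[int],
) -> List[List[Vector]]:
    """Gather, for each requested index, the vectors of its assigned tokens."""
    check_vocabulary(assignment, token_table)
    return [[token_table[token] for token in assignment[i]] for i in indices]


# n = 3, num_chosen_tokens = 2, num_total_tokens = 4, token shape = (2,)
assignment = [[0, 1], [1, 3], [2, 0]]
token_table = [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1], [3.0, 3.1]]

# The result is nested as (batch, num_chosen_tokens, *token shape) = (2, 2, 2).
result = lookup(assignment, token_table, indices=[1, 2])
```

Passing a token table with fewer rows than the assignment requires (for example `token_table[:3]`) makes the sketch raise a ValueError, mirroring the mismatch condition listed under Raises.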
Methods Summary
extra_repr() – Set the extra representation of the module
from_tokenizer(tokenizer, num_tokens, ...[, ...]) – Create a tokenization by applying a tokenizer.
Methods Documentation
- extra_repr()
Set the extra representation of the module
To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.
- Return type
str
- classmethod from_tokenizer(tokenizer, num_tokens, mapped_triples, num_entities, num_relations, token_representation=None, token_representation_kwargs=None, **kwargs)
Create a tokenization by applying a tokenizer.
- Parameters
tokenizer (Tokenizer) – the tokenizer instance.
num_tokens (int) – the number of tokens to select for each entity.
token_representation (Union[str, Representation, Type[Representation], None]) – the pre-instantiated token representations, or an EmbeddingSpecification to create them
token_representation_kwargs (Optional[Mapping[str, Any]]) – additional keyword-based parameters
mapped_triples (LongTensor) – the ID-based triples
num_entities (int) – the number of entities
num_relations (int) – the number of relations
kwargs – additional keyword-based parameters passed to TokenizationRepresentation.__init__
- Return type
TokenizationRepresentation
- Returns
A tokenization representation created by applying the tokenizer
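To make the roles of mapped_triples, num_entities, num_relations and num_tokens concrete, here is a purely illustrative pure-Python sketch of one possible tokenization strategy: each entity is assigned the IDs of its incident relations. This is an assumption chosen for illustration, not a description of any specific PyKEEN Tokenizer. The function `relation_tokens`, the `PADDING` value, and the offset used for incoming relations are all hypothetical.

```python
from typing import Iterable, List, Set, Tuple

PADDING = -1  # assumption: marks token slots that could not be filled


def relation_tokens(
    mapped_triples: Iterable[Tuple[int, int, int]],
    num_entities: int,
    num_relations: int,
    num_tokens: int,
) -> List[List[int]]:
    """Assign each entity up to num_tokens of its incident relation IDs.

    Outgoing relations keep their ID; incoming relations are offset by
    num_relations so that both directions yield distinct tokens.
    """
    incident: List[Set[int]] = [set() for _ in range(num_entities)]
    for head, relation, tail in mapped_triples:
        incident[head].add(relation)
        incident[tail].add(relation + num_relations)

    assignment = []
    for tokens in incident:
        # Keep the smallest IDs for determinism, then pad to a fixed width.
        chosen = sorted(tokens)[:num_tokens]
        assignment.append(chosen + [PADDING] * (num_tokens - len(chosen)))
    return assignment


# Three ID-based (head, relation, tail) triples over 3 entities and 2 relations.
mapped_triples = [(0, 0, 1), (1, 1, 2), (0, 1, 2)]
assignment = relation_tokens(mapped_triples, num_entities=3, num_relations=2, num_tokens=2)
```

The resulting nested list has one row per entity and num_tokens columns, which matches the (n, num_chosen_tokens) assignment shape expected by the constructor.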