TokenizationRepresentation
- class TokenizationRepresentation(assignment, token_representation=None, token_representation_kwargs=None, shape=None, **kwargs)
Bases: Representation
A module holding the result of tokenization.
Initialize the tokenization.
- Parameters:
assignment (LongTensor) – shape: (n, num_chosen_tokens). The token assignment.
token_representation (Union[str, Representation, Type[Representation], None]) – shape: (num_total_tokens, *shape). The token representations.
token_representation_kwargs (Optional[Mapping[str, Any]]) – additional keyword-based parameters
shape (Union[int, Sequence[int], None]) – The shape of an individual representation. If provided, has to match.
kwargs – additional keyword-based parameters passed to Representation.__init__()
- Raises:
ValueError – if there’s a mismatch between the representation size and the vocabulary size
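The shape contract above can be illustrated with a minimal sketch in plain Python. It is not the library's implementation: `lookup_tokens` is a hypothetical helper, nested lists stand in for tensors, and the bounds check is only one plausible reading of the documented ValueError.

```python
from typing import List, Sequence


def lookup_tokens(
    assignment: Sequence[Sequence[int]],
    token_vectors: Sequence[Sequence[float]],
) -> List[List[List[float]]]:
    """Gather the token vectors selected for each ID.

    assignment has shape (n, num_chosen_tokens); token_vectors has shape
    (num_total_tokens, dim). The result has shape (n, num_chosen_tokens, dim).
    Hypothetical helper for illustration only.
    """
    vocabulary_size = len(token_vectors)
    max_token_id = max(max(row) for row in assignment)
    # Assumption: a mismatch means the assignment refers to a token ID
    # for which no representation exists.
    if max_token_id >= vocabulary_size:
        raise ValueError(
            f"token ID {max_token_id} exceeds vocabulary size {vocabulary_size}"
        )
    return [[list(token_vectors[token]) for token in row] for row in assignment]


# Three IDs, two chosen tokens each, four tokens in total, dim = 3.
assignment = [[0, 2], [1, 1], [3, 0]]
token_vectors = [
    [0.0, 0.1, 0.2],
    [1.0, 1.1, 1.2],
    [2.0, 2.1, 2.2],
    [3.0, 3.1, 3.2],
]
result = lookup_tokens(assignment, token_vectors)
```

Here `result` is a nested list of shape (3, 2, 3): one vector per chosen token for each of the three IDs.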
Attributes Summary
Return the number of selected tokens for ID.
Methods Summary
from_tokenizer(tokenizer, num_tokens, ...[, ...]) – Create a tokenization from applying a tokenizer.
Iterate over components for extra_repr().
save_assignment(output_path) – Save the assignment to a file.
Attributes Documentation
Methods Documentation
- classmethod from_tokenizer(tokenizer, num_tokens, mapped_triples, num_entities, num_relations, token_representation=None, token_representation_kwargs=None, **kwargs)
Create a tokenization from applying a tokenizer.
- Parameters:
tokenizer (Tokenizer) – the tokenizer instance.
num_tokens (int) – the number of tokens to select for each entity.
token_representation (Union[str, Representation, Type[Representation], None]) – the pre-instantiated token representations, class, or name of a class
token_representation_kwargs (Optional[Mapping[str, Any]]) – additional keyword-based parameters
mapped_triples (LongTensor) – the ID-based triples
num_entities (int) – the number of entities
num_relations (int) – the number of relations
kwargs – additional keyword-based parameters passed to TokenizationRepresentation.__init__
- Return type:
- Returns:
A tokenization representation by applying the tokenizer
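To make the role of num_tokens and mapped_triples more concrete, the following toy sketch shows how a tokenizer could derive an assignment of shape (num_entities, num_tokens) from ID-based triples. It is an illustration under stated assumptions, not the library's Tokenizer: `toy_relation_tokenizer` and the `PADDING` marker are hypothetical names, and choosing incident relations as tokens is just one possible strategy.

```python
from typing import Dict, List, Sequence, Tuple

PADDING = -1  # hypothetical marker for "no token available"


def toy_relation_tokenizer(
    mapped_triples: Sequence[Tuple[int, int, int]],
    num_entities: int,
    num_tokens: int,
) -> List[List[int]]:
    """Assign each entity up to num_tokens relation IDs it occurs with.

    Hypothetical helper for illustration only; entities with fewer than
    num_tokens incident relations are padded with PADDING.
    """
    incident: Dict[int, List[int]] = {entity: [] for entity in range(num_entities)}
    for head, relation, tail in mapped_triples:
        for entity in (head, tail):
            if relation not in incident[entity]:
                incident[entity].append(relation)

    assignment = []
    for entity in range(num_entities):
        # Keep the smallest relation IDs so the selection is deterministic.
        chosen = sorted(incident[entity])[:num_tokens]
        chosen += [PADDING] * (num_tokens - len(chosen))
        assignment.append(chosen)
    return assignment


# (head, relation, tail) triples over 4 entities and 3 relations.
mapped_triples = [(0, 0, 1), (1, 1, 2), (0, 2, 2)]
assignment = toy_relation_tokenizer(mapped_triples, num_entities=4, num_tokens=2)
```

Every row of `assignment` has exactly num_tokens entries, which matches the (n, num_chosen_tokens) shape expected by the constructor. Entity 3 appears in no triple, so its row consists only of the padding marker.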