CleanupSplitter
- class CleanupSplitter(cleaner: str | Cleaner | type[Cleaner] | None = None)
Bases: Splitter
The cleanup splitter first randomly splits the triples and then cleans up.
In the cleanup process, triples are moved into the train part until all entities occur at least once in train.
The splitter supports two variants of cleanup, cf. cleaner_resolver.
Initialize the splitter.
- Parameters:
cleaner (str | Cleaner | type[Cleaner] | None) – the cleanup method to use. Defaults to the fast deterministic cleaner, which may lead to larger deviations between the desired and actual triple counts.
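The two-step idea described above (random split, then cleanup) can be sketched in plain Python. This is a simplified illustration and not the library's implementation: the helper names `entities_of` and `split_with_cleanup`, the `seed` argument, and the use of tuples instead of tensors are all assumptions made for the example.

```python
import random
from typing import List, Sequence, Set, Tuple

Triple = Tuple[int, int, int]  # (head, relation, tail) as integer IDs


def entities_of(triples: Sequence[Triple]) -> Set[int]:
    """Collect every entity ID that occurs as head or tail."""
    return {e for h, _, t in triples for e in (h, t)}


def split_with_cleanup(
    triples: Sequence[Triple], sizes: Sequence[int], seed: int = 0
) -> List[List[Triple]]:
    """Hypothetical helper: random split followed by a one-pass cleanup."""
    assert sum(sizes) == len(triples), "sizes must cover all triples"
    rng = random.Random(seed)
    shuffled = list(triples)
    rng.shuffle(shuffled)

    # Step 1: cut the shuffled triples into groups of the requested sizes.
    groups: List[List[Triple]] = []
    start = 0
    for size in sizes:
        groups.append(shuffled[start:start + size])
        start += size

    # Step 2: move any triple that mentions an entity not yet seen in
    # train (the first group) into train.
    train = groups[0]
    seen = entities_of(train)
    for i in range(1, len(groups)):
        kept: List[Triple] = []
        for triple in groups[i]:
            h, _, t = triple
            if h in seen and t in seen:
                kept.append(triple)
            else:
                train.append(triple)
                seen.update((h, t))
        groups[i] = kept
    return groups
```

Because triples are only ever moved into the first group, the resulting group sizes can differ from the requested `sizes`, which matches the deviation mentioned for the `cleaner` parameter.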
Methods Summary
split_absolute_size(mapped_triples, sizes, ...) – Split triples into clean groups.
Methods Documentation
- split_absolute_size(mapped_triples: Tensor, sizes: Sequence[int], random_state: Generator) → Sequence[Tensor]
Split triples into clean groups.
This method partitions the triples, i.e., each triple is in exactly one group. Moreover, it ensures that the first group contains all entities at least once.
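The two guarantees stated here (a partition, and full entity coverage in the first group) can be checked with a small helper. The following is a minimal sketch on plain tuples; `check_clean_split` is a hypothetical name introduced for illustration and is not part of the documented API.

```python
from typing import Sequence, Tuple

Triple = Tuple[int, int, int]  # (head, relation, tail) as integer IDs


def check_clean_split(
    original: Sequence[Triple], groups: Sequence[Sequence[Triple]]
) -> bool:
    """Return True if `groups` is a clean split of `original`."""
    # Partition: every original triple appears in exactly one group.
    flat = [t for g in groups for t in g]
    is_partition = sorted(flat) == sorted(original)

    # Coverage: the first group mentions every entity at least once.
    all_entities = {e for h, _, t in original for e in (h, t)}
    first = groups[0] if groups else []
    first_entities = {e for h, _, t in first for e in (h, t)}

    return is_partition and first_entities == all_entities
```

A check like this could be run on the output of any splitter after converting each tensor of triples to a list of tuples.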
- Parameters:
- Returns:
a sequence of ID-based triples for each split part. The absolute sizes may differ from the requested ones to ensure the constraint.
- Return type: