- pipeline(*, dataset=None, dataset_kwargs=None, training=None, testing=None, validation=None, evaluation_entity_whitelist=None, evaluation_relation_whitelist=None, model=None, model_kwargs=None, interaction=None, interaction_kwargs=None, dimensions=None, loss=None, loss_kwargs=None, regularizer=None, regularizer_kwargs=None, optimizer=None, optimizer_kwargs=None, clear_optimizer=True, training_loop=None, training_loop_kwargs=None, negative_sampler=None, negative_sampler_kwargs=None, training_kwargs=None, stopper=None, stopper_kwargs=None, evaluator=None, evaluator_kwargs=None, evaluation_kwargs=None, result_tracker=None, result_tracker_kwargs=None, metadata=None, device=None, random_seed=None, use_testing_data=True, evaluation_fallback=False, filter_validation_when_testing=True)
Train and evaluate a model.
dataset – The name of the dataset (a key for the pykeen.datasets.dataset_resolver) or the pykeen.datasets.Dataset instance. Alternatively, the training triples factory (training), testing triples factory (testing), and validation triples factory (validation; optional) can be specified.
evaluation_entity_whitelist – Optional restriction of evaluation to triples containing only these entities. Useful if the downstream task is only interested in certain entities, but the relational patterns with other entities improve the entity embedding quality.
evaluation_relation_whitelist – Optional restriction of evaluation to triples containing only these relations. Useful if the downstream task is only interested in certain relations, but the relational patterns with other relations improve the entity embedding quality.
interaction – The name of the interaction class, a subclass of pykeen.nn.modules.Interaction, or an instance of pykeen.nn.modules.Interaction. Cannot be given when a model is also given.
clear_optimizer (bool) – Whether to delete the optimizer instance after training. Because the optimizer can consume additional memory (e.g. the moment estimates in Adam), deleting it is the default. If you want to continue training, set this to False; otherwise the optimizer's internal state will be lost.
negative_sampler – The name of the negative sampler (
'bernoulli') or the negative sampler class. Only allowed when training with sLCWA. Defaults to
use_testing_data (bool) – If true, evaluate on the testing triples; otherwise, evaluate on the validation triples. Defaults to true.
random_seed – The random seed to use. If none is specified, one is assigned before any code is run, so that the run remains reproducible. In the returned
PipelineResult instance, it can be accessed through
evaluation_fallback (bool) – If true, and evaluation on the GPU fails, evaluation falls back to a smaller batch size. As a last resort, if even the smallest possible batch size is too big for the GPU, evaluation runs on the CPU.
filter_validation_when_testing (bool) – If true, during evaluation on the test dataset, validation triples are added to the set of known positive triples, which are filtered out when performing filtered evaluation following the approach described by [bordes2013]. Set this to false explicitly only if you are training a single model using the pipeline and evaluating with the testing set, without ever using the validation set for optimization. This is a very atypical scenario, so the default is true to promote comparability to previous publications.
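The effect of adding validation triples to the set of known positives can be illustrated with a toy ranking. The following is a minimal sketch of filtered evaluation for a single test triple, not PyKEEN's implementation: the function `filtered_rank`, the entity names, and the scores are all hypothetical, and ties are ignored for simplicity.

```python
def filtered_rank(scores, true_tail, known_tails):
    """Return the 1-based rank of ``true_tail`` among candidate tails.

    ``scores`` maps each candidate tail entity to a model score (higher is
    better). Candidates in ``known_tails`` are other known positive answers
    for the same (head, relation) pair and are skipped, so they cannot push
    the true answer down the ranking.
    """
    true_score = scores[true_tail]
    rank = 1
    for tail, score in scores.items():
        if tail == true_tail or tail in known_tails:
            continue  # never compare against the target or a filtered positive
        if score > true_score:
            rank += 1
    return rank


# Hypothetical scores for the query (head, relation, ?), true test answer "c".
scores = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.1}
training_positives = {"a"}    # (head, relation, "a") occurs in training
validation_positives = {"b"}  # (head, relation, "b") occurs in validation

unfiltered = filtered_rank(scores, "c", set())
train_only = filtered_rank(scores, "c", training_positives)
train_and_validation = filtered_rank(
    scores, "c", training_positives | validation_positives
)
print(unfiltered, train_only, train_and_validation)
```

In this toy setting, filtering only the training positives still lets the validation answer "b" outrank the true test answer, whereas also filtering the validation positives removes it from the comparison. That second case corresponds to the behaviour described for filter_validation_when_testing set to true.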
- Return type
PipelineResult
- Returns
A pipeline result package.
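As an illustration of how several of the documented keyword arguments fit together, here is a minimal sketch. The helper names `build_pipeline_kwargs` and `run_pipeline` are hypothetical, and the concrete values (the "Nations" dataset, the "TransE" model, five epochs, the seed) are assumptions chosen only to keep the example small. The import of `pipeline` is deferred so that the keyword arguments can be inspected without PyKEEN installed.

```python
def build_pipeline_kwargs(seed=42):
    """Collect keyword arguments for pykeen.pipeline.pipeline.

    All values are illustrative assumptions, not recommended settings.
    """
    return dict(
        dataset="Nations",                    # a dataset name; a Dataset instance also works
        model="TransE",                       # a model is given, so `interaction` is omitted
        training_loop="sLCWA",                # negative samplers are only allowed with sLCWA
        negative_sampler="bernoulli",
        training_kwargs=dict(num_epochs=5),
        random_seed=seed,                     # fix the seed for reproducibility
        clear_optimizer=False,                # keep the optimizer to continue training later
        use_testing_data=True,                # evaluate on the testing triples
        evaluation_fallback=True,             # retry with smaller batches, then on the CPU
        filter_validation_when_testing=True,  # treat validation triples as known positives
    )


def run_pipeline(**overrides):
    """Train and evaluate a model, returning the pipeline result."""
    # Imported lazily so that this module loads even without PyKEEN.
    from pykeen.pipeline import pipeline

    kwargs = build_pipeline_kwargs()
    kwargs.update(overrides)
    return pipeline(**kwargs)
```

Calling `run_pipeline()` would train and evaluate the model and return the pipeline result; individual settings can be overridden per call, for example `run_pipeline(use_testing_data=False)` to evaluate on the validation triples instead.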