Pipeline
- pykeen.pipeline.pipeline(*, dataset=None, dataset_kwargs=None, training_triples_factory=None, testing_triples_factory=None, validation_triples_factory=None, model, model_kwargs=None, loss=None, loss_kwargs=None, regularizer=None, regularizer_kwargs=None, optimizer=None, optimizer_kwargs=None, clear_optimizer=True, training_loop=None, negative_sampler=None, negative_sampler_kwargs=None, training_kwargs=None, stopper=None, stopper_kwargs=None, evaluator=None, evaluator_kwargs=None, evaluation_kwargs=None, mlflow_tracking_uri=None, metadata=None, device=None, random_seed=None, use_testing_data=True)
Train and evaluate a model.
- Parameters
dataset (Union[None, str, Type[DataSet]]) – The name of the dataset (a key from pykeen.datasets.datasets) or the pykeen.datasets.DataSet instance. Alternatively, the training_triples_factory and testing_triples_factory can be specified.
dataset_kwargs (Optional[Mapping[str, Any]]) – The keyword arguments passed to the dataset upon instantiation.
training_triples_factory (Optional[TriplesFactory]) – A triples factory with training instances, if a dataset was not specified.
testing_triples_factory (Optional[TriplesFactory]) – A triples factory with testing instances, if a dataset was not specified.
validation_triples_factory (Optional[TriplesFactory]) – A triples factory with validation instances, if a dataset was not specified.
model (Union[str, Type[Model]]) – The name of the model or the model class.
model_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass to the model class on instantiation.
loss (Union[None, str, Type[_Loss]]) – The name of the loss or the loss class.
loss_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass to the loss on instantiation.
regularizer (Union[None, str, Type[Regularizer]]) – The name of the regularizer or the regularizer class.
regularizer_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass to the regularizer on instantiation.
optimizer (Union[None, str, Type[Optimizer]]) – The name of the optimizer or the optimizer class. Defaults to torch.optim.Adagrad.
optimizer_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass to the optimizer on instantiation.
clear_optimizer (bool) – Whether to delete the optimizer instance after training. Because the optimizer can consume additional memory (for example, the moments in Adam), deleting it is the default. If you want to continue training, set this to False; otherwise the optimizer's internal state is lost.
training_loop (Union[None, str, Type[TrainingLoop]]) – The name of the training loop's training approach ('slcwa' or 'lcwa') or the training loop class. Defaults to pykeen.training.SLCWATrainingLoop.
negative_sampler (Union[None, str, Type[NegativeSampler]]) – The name of the negative sampler ('basic' or 'bernoulli') or the negative sampler class. Only allowed when training with sLCWA. Defaults to pykeen.sampling.BasicNegativeSampler.
negative_sampler_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass to the negative sampler class on instantiation.
training_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass to the training loop's train function on call.
stopper (Union[None, str, Type[Stopper]]) – What kind of stopping to use. Defaults to no stopping; can be set to 'early'.
stopper_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass to the stopper upon instantiation.
evaluator (Union[None, str, Type[Evaluator]]) – The name of the evaluator or an evaluator class. Defaults to pykeen.evaluation.RankBasedEvaluator.
evaluator_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass to the evaluator on instantiation.
evaluation_kwargs (Optional[Mapping[str, Any]]) – Keyword arguments to pass to the evaluator's evaluate function on call.
mlflow_tracking_uri (Optional[str]) – The MLFlow tracking URI. If None is given, MLFlow is not used to track results.
metadata (Optional[Dict[str, Any]]) – A JSON dictionary to store with the experiment.
use_testing_data (bool) – If true, use the testing triples. Otherwise, use the validation triples. Defaults to true (use the testing triples).
- Return type
PipelineResult
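As a rough illustration of how the parameters above fit together, the sketch below collects a small set of keyword arguments and passes them to pykeen.pipeline.pipeline(). The helper names build_pipeline_kwargs and run_pipeline are hypothetical, and the values 'Nations', 'TransE' and the num_epochs training keyword are assumptions that do not come from this reference. The check on negative_sampler only mirrors the sLCWA restriction stated in the parameter list; it is not PyKEEN's own validation.

```python
from typing import Any, Dict, Optional


def build_pipeline_kwargs(
    dataset: str = "Nations",  # assumed dataset key, not taken from this reference
    model: str = "TransE",  # assumed model name, not taken from this reference
    training_loop: str = "slcwa",  # 'slcwa' or 'lcwa', as documented above
    negative_sampler: Optional[str] = "basic",  # 'basic' or 'bernoulli'
    num_epochs: int = 5,  # assumed keyword of the training loop's train function
) -> Dict[str, Any]:
    """Collect keyword arguments for pykeen.pipeline.pipeline() (hypothetical helper)."""
    if negative_sampler is not None and training_loop != "slcwa":
        # The parameter list says a negative sampler is only allowed with sLCWA.
        raise ValueError("negative_sampler is only allowed when training with sLCWA")
    kwargs: Dict[str, Any] = {
        "dataset": dataset,
        "model": model,
        "training_loop": training_loop,
        "training_kwargs": {"num_epochs": num_epochs},
        "use_testing_data": True,  # evaluate on the testing triples (the default)
    }
    if negative_sampler is not None:
        kwargs["negative_sampler"] = negative_sampler
    return kwargs


def run_pipeline(kwargs: Dict[str, Any]):
    """Run the pipeline; PyKEEN is imported lazily so the helper above works without it."""
    from pykeen.pipeline import pipeline

    return pipeline(**kwargs)
```

With PyKEEN installed, `result = run_pipeline(build_pipeline_kwargs())` would train and evaluate a model and hand back the result object described in the next entry.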
- class pykeen.pipeline.PipelineResult(random_seed, model, training_loop, losses, metric_results, train_seconds, evaluate_seconds, stopper=None, metadata=<factory>, version=<factory>, git_hash=<factory>)
A dataclass containing the results of running pykeen.pipeline.pipeline().
- metric_results: pykeen.evaluation.evaluator.MetricResults
The results evaluated by the pipeline
- model: pykeen.models.base.Model
The model trained by the pipeline
- save_model(path)
Save the trained model to the given path using torch.save(). The model contains within it the triples factory that was used for training.
- Return type
None
- save_to_directory(directory, save_metadata=True, save_replicates=True)
Save all artifacts in the given directory.
- Return type
None
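The two save methods above can be combined as in the following sketch. The helper save_artifacts and the file name standalone_model.pkl are hypothetical; only the calls save_to_directory(directory, save_metadata, save_replicates) and save_model(path) come from the signatures listed above. The helper accepts any object exposing those two methods, so it makes no claim about which files PyKEEN itself writes into the directory.

```python
import os


def save_artifacts(result, directory: str, model_filename: str = "standalone_model.pkl") -> str:
    """Save all pipeline artifacts, then write a separate copy of the model.

    `result` is expected to behave like a PipelineResult; the helper itself
    and the default file name are illustrative assumptions.
    """
    os.makedirs(directory, exist_ok=True)
    # Keyword arguments as listed in the save_to_directory signature above.
    result.save_to_directory(directory, save_metadata=True, save_replicates=True)
    # save_model() is documented to write the model with torch.save().
    model_path = os.path.join(directory, model_filename)
    result.save_model(model_path)
    return model_path
```

A call such as `save_artifacts(result, "my_run")` returns the path of the separately saved model file.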
- stopper: Optional[pykeen.stoppers.stopper.Stopper] = None
An early stopper
- training_loop: pykeen.training.training_loop.TrainingLoop
The training loop used by the pipeline
-