PipelineResult
- class PipelineResult(random_seed: int, model: pykeen.models.base.Model, training: pykeen.triples.triples_factory.CoreTriplesFactory, training_loop: pykeen.training.training_loop.TrainingLoop, losses: list[float], metric_results: pykeen.evaluation.evaluator.MetricResults, train_seconds: float, evaluate_seconds: float, stopper: pykeen.stoppers.stopper.Stopper | None = None, configuration: collections.abc.Mapping[str, typing.Any] = <factory>, metadata: collections.abc.MutableMapping[str, typing.Any] = <factory>, version: str = <factory>, git_hash: str = <factory>)
Bases: Result
A dataclass containing the results of running pykeen.pipeline.pipeline().
Attributes Summary
configuration – The configuration
evaluate_seconds – How long in seconds did evaluation take?
git_hash – The git hash of PyKEEN used to create these results
losses – The losses during training
metadata – Any additional metadata as a dictionary
metric_results – The results evaluated by the pipeline
model – The model trained by the pipeline
random_seed – The random seed used at the beginning of the pipeline
stopper – An early stopper
title – The title of the experiment.
train_seconds – How long in seconds did training take?
training – The training triples
training_loop – The training loop used by the pipeline
version – The version of PyKEEN used to create these results
Methods Summary
get_metric(key) – Get the given metric out of the metric result object.
plot(**kwargs) – Plot all plots.
plot_early_stopping(**kwargs) – Plot the evaluations during early stopping.
plot_er(**kwargs) – Plot the reduced entities and relation vectors in 2D.
plot_losses(**kwargs) – Plot the losses per epoch.
save_model(path) – Save the trained model to the given path using torch.save().
save_to_directory(directory, *[, ...]) – Save all artifacts in the given directory.
save_to_ftp(directory, ftp) – Save all artifacts to the given directory in the FTP server.
save_to_s3(directory, bucket[, s3]) – Save all artifacts to the given directory in an S3 Bucket.
Attributes Documentation
- Parameters:
random_seed (int)
model (Model)
training (CoreTriplesFactory)
training_loop (TrainingLoop)
metric_results (MetricResults)
train_seconds (float)
evaluate_seconds (float)
stopper (Stopper | None)
metadata (MutableMapping[str, Any])
version (str)
git_hash (str)
- evaluate_seconds: float
How long in seconds did evaluation take?
- git_hash: str
The git hash of PyKEEN used to create these results
- metadata: MutableMapping[str, Any]
Any additional metadata as a dictionary
- metric_results: MetricResults
The results evaluated by the pipeline
- random_seed: int
The random seed used at the beginning of the pipeline
- title
The title of the experiment.
- training: CoreTriplesFactory
The training triples
- training_loop: TrainingLoop
The training loop used by the pipeline
- version: str
The version of PyKEEN used to create these results
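As a rough illustration of how the loss and timing attributes might be consumed, the sketch below summarizes any object that exposes `losses`, `train_seconds`, and `evaluate_seconds`. The `StandInResult` class and the `summarize_run` helper are hypothetical names introduced here so the example runs without PyKEEN; neither is part of the library.

```python
from dataclasses import dataclass, field


@dataclass
class StandInResult:
    """Hypothetical stand-in exposing three PipelineResult attribute names."""

    losses: list[float] = field(default_factory=list)
    train_seconds: float = 0.0
    evaluate_seconds: float = 0.0


def summarize_run(result) -> dict:
    """Summarize losses and timings from any object with these attributes."""
    losses = list(result.losses)
    return {
        # Number of recorded loss values
        "num_losses": len(losses),
        # Last recorded loss, or None if nothing was recorded
        "final_loss": losses[-1] if losses else None,
        # Training time plus evaluation time, in seconds
        "total_seconds": result.train_seconds + result.evaluate_seconds,
    }


demo = StandInResult(losses=[0.9, 0.5, 0.25], train_seconds=12.0, evaluate_seconds=3.0)
print(summarize_run(demo))
```

Because the helper relies only on attribute names, it should accept a real result object as well, assuming the attributes behave as documented above.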
Methods Documentation
- plot(**kwargs)
Plot all plots.
- Parameters:
kwargs – The keyword arguments passed to pykeen.pipeline_plot.plot()
- Returns:
The axis
- plot_early_stopping(**kwargs)
Plot the evaluations during early stopping.
- Parameters:
kwargs – The keyword arguments passed to pykeen.pipeline.plot_utils.plot_early_stopping()
- Returns:
The axis
- plot_er(**kwargs)
Plot the reduced entities and relation vectors in 2D.
- Parameters:
kwargs – The keyword arguments passed to pykeen.pipeline.plot_utils.plot_er()
- Returns:
The axis
Warning
Plotting relations and entities on the same plot is only meaningful for translational distance models like TransE.
- plot_losses(**kwargs)
Plot the losses per epoch.
- Parameters:
kwargs – The keyword arguments passed to pykeen.pipeline.plot_utils.plot_losses().
- Returns:
The axis
- save_model(path: str | Path) -> None
Save the trained model to the given path using torch.save().
- Parameters:
path (str | Path) – The path to which the model is saved. Should have an extension appropriate for a pickle, like *.pkl or *.pickle.
- Return type:
None
The model contains within it the triples factory that was used for training.
- save_to_directory(directory: str | Path, *, save_metadata: bool = True, save_replicates: bool = True, save_training: bool = True, **_kwargs) -> None
Save all artifacts in the given directory.
The serialization format looks as follows:
    directory/
        results.json
        metadata.json
        trained_model.pkl
        training_triples/
All but the first component (results.json) are optional and can be disabled, e.g. to save disk space during hyperparameter tuning. trained_model.pkl is the full model saved via torch.save(), and can thus be loaded via torch.load(), cf. torch’s serialization documentation. training_triples contains the training triples factory, including label-to-id mappings, if used. It is saved via pykeen.triples.CoreTriplesFactory.to_path_binary(), and can be re-loaded via pykeen.triples.CoreTriplesFactory.from_path_binary().
- Parameters:
directory (str | Path) – the directory path. It will be created, including all parent directories, if necessary
save_metadata (bool) – whether to save metadata, cf. PipelineResult.metadata
save_replicates (bool) – whether to save the trained model, cf. PipelineResult.save_model() (the source carries a TODO about renaming this parameter)
save_training (bool) – whether to save the training triples factory
_kwargs – additional keyword-based parameters, which are ignored
- Return type:
None
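Since most components of the layout described above are optional, a consumer may want to check what is actually present before loading anything. The following is a minimal sketch using only the standard library; `inspect_result_directory` is a hypothetical helper, not a PyKEEN function, and the demonstration writes placeholder JSON rather than real pipeline output.

```python
import json
import tempfile
from pathlib import Path

# File and directory names taken from the layout described above.
EXPECTED = ("results.json", "metadata.json", "trained_model.pkl", "training_triples")


def inspect_result_directory(directory) -> dict:
    """Report which artifacts exist and load results.json if present."""
    directory = Path(directory)
    present = {name: (directory / name).exists() for name in EXPECTED}
    results = None
    if present["results.json"]:
        results = json.loads((directory / "results.json").read_text())
    return {"present": present, "results": results}


# Demonstration on a throwaway directory with placeholder content.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "results.json").write_text(json.dumps({"placeholder": True}))
    report = inspect_result_directory(tmp)
    print(report["present"])
```

The model file and the triples factory are deliberately not loaded here; for those, the torch and PyKEEN loaders named in the description above would be the natural choice.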
- save_to_ftp(directory: str | Path, ftp: FTP) -> None
Save all artifacts to the given directory in the FTP server.
- Parameters:
directory (str | Path)
ftp (FTP)
- Return type:
None
The following code will train a model and upload it to FTP using Python’s builtin ftplib.FTP:
    import ftplib

    from pykeen.pipeline import pipeline

    directory = 'test/test'
    pipeline_result = pipeline(
        model='TransE',
        dataset='Kinships',
    )
    with ftplib.FTP(host='0.0.0.0', user='user', passwd='12345') as ftp:
        pipeline_result.save_to_ftp(directory, ftp)
If you want to try this with your own local server, run this code based on the example from Giampaolo Rodola’s excellent library, pyftpdlib.
    import os

    from pyftpdlib.authorizers import DummyAuthorizer
    from pyftpdlib.handlers import FTPHandler
    from pyftpdlib.servers import FTPServer

    # Register one user whose home directory is ~/ftp
    authorizer = DummyAuthorizer()
    authorizer.add_user("user", "12345", homedir=os.path.expanduser('~/ftp'), perm="elradfmwMT")

    handler = FTPHandler
    handler.authorizer = authorizer

    # Listen on all interfaces, port 21
    address = '0.0.0.0', 21
    server = FTPServer(address, handler)
    server.serve_forever()
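After an upload, one way to check what arrived is to list the remote directory with `ftplib.FTP.nlst()`. The sketch below wraps that call in a hypothetical helper, `list_uploaded_names`, and exercises it against a stub object (`StubFTP`, also hypothetical) so that it runs without a server. Which files PyKEEN actually uploads is not stated here, so the stub's file names are placeholders.

```python
import posixpath


def list_uploaded_names(ftp, directory: str) -> list[str]:
    """Return the sorted base names the server reports under ``directory``."""
    # Servers may return bare names or full paths from NLST,
    # so normalise every entry to its base name.
    return sorted(
        posixpath.basename(entry.rstrip("/")) for entry in ftp.nlst(directory)
    )


class StubFTP:
    """Hypothetical stand-in with the one method the helper needs."""

    def __init__(self, entries):
        self._entries = entries

    def nlst(self, directory):
        # Mimic a server that answers with full paths.
        return [posixpath.join(directory, name) for name in self._entries]


print(list_uploaded_names(StubFTP(["b.bin", "a.txt"]), "test/test"))
```

With a live connection, the same helper could be called inside the `with ftplib.FTP(...) as ftp:` block from the upload example, passing the real `ftp` object instead of the stub.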
- save_to_s3(directory: str | Path, bucket: str, s3=None) -> None
Save all artifacts to the given directory in an S3 Bucket.
- Parameters:
bucket (str) – The name of the S3 bucket
s3 – A client from boto3.client(), if already instantiated
- Return type:
None
Note
You need to have a ~/.aws/credentials file set up. Read: https://realpython.com/python-boto3-aws-s3/
The following code will train a model and upload it to S3 using boto3:
    import time

    from pykeen.pipeline import pipeline

    pipeline_result = pipeline(
        dataset='Kinships',
        model='TransE',
    )
    directory = f'tests/{time.strftime("%Y-%m-%d-%H%M%S")}'
    bucket = 'pykeen'
    pipeline_result.save_to_s3(directory, bucket=bucket)