PipelineResult

class PipelineResult(random_seed: int, model: Model, training: CoreTriplesFactory, training_loop: TrainingLoop, losses: list[float], metric_results: MetricResults, train_seconds: float, evaluate_seconds: float, stopper: Stopper | None = None, configuration: Mapping[str, Any] = <factory>, metadata: MutableMapping[str, Any] = <factory>, version: str = <factory>, git_hash: str = <factory>)[source]

Bases: Result

A dataclass containing the results of running pykeen.pipeline.pipeline().

Attributes Summary

METADATA_FILE_NAME

MODEL_FILE_NAME

RESULT_FILE_NAME

TRAINING_TRIPLES_FILE_NAME

stopper

An early stopper.

title

The title of the experiment.

Methods Summary

get_metric(key)

Get the given metric out of the metric result object.

plot(**kwargs)

Plot all plots.

plot_early_stopping(**kwargs)

Plot the evaluations during early stopping.

plot_er(**kwargs)

Plot the reduced entities and relation vectors in 2D.

plot_losses(**kwargs)

Plot the losses per epoch.

save_model(path)

Save the trained model to the given path using torch.save().

save_to_directory(directory, *[, ...])

Save all artifacts in the given directory.

save_to_ftp(directory, ftp)

Save all artifacts to the given directory in the FTP server.

save_to_s3(directory, bucket[, s3])

Save all artifacts to the given directory in an S3 Bucket.

Attributes Documentation

METADATA_FILE_NAME: ClassVar[str] = 'metadata.json'
MODEL_FILE_NAME: ClassVar[str] = 'trained_model.pkl'
RESULT_FILE_NAME: ClassVar[str] = 'results.json'
TRAINING_TRIPLES_FILE_NAME: ClassVar[str] = 'training_triples'
stopper: Stopper | None = None

An early stopper.

title

The title of the experiment.

Methods Documentation

get_metric(key: str) → float[source]

Get the given metric out of the metric result object.

Parameters:

key (str) – The key of the metric to look up in the metric results.

Return type:

float
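
For example, a single metric can be read off a finished pipeline run by its key. A minimal sketch; the key 'hits@10' is a common choice, but the available keys depend on the evaluator used:

from pykeen.pipeline import pipeline

pipeline_result = pipeline(dataset='Nations', model='TransE')
# Look up a single metric from the evaluation results by key.
hits_at_10 = pipeline_result.get_metric('hits@10')
print(hits_at_10)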

plot(**kwargs)[source]

Plot all plots.

Parameters:

kwargs – The keyword arguments passed to pykeen.pipeline.plot_utils.plot()

Returns:

The axis
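
As a sketch of typical usage, the combined plot can be rendered and saved with matplotlib (the file name and DPI here are arbitrary):

import matplotlib.pyplot as plt
from pykeen.pipeline import pipeline

pipeline_result = pipeline(dataset='Nations', model='TransE')
# Render all available plots (losses, embeddings, ...) into one figure.
pipeline_result.plot()
plt.savefig('plot.png', dpi=300)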

plot_early_stopping(**kwargs)[source]

Plot the evaluations during early stopping.

Parameters:

kwargs – The keyword arguments passed to pykeen.pipeline.plot_utils.plot_early_stopping()

Returns:

The axis
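
This plot is only meaningful if an early stopper was used during training. A minimal sketch, assuming the stopper is requested via the pipeline's stopper argument:

import matplotlib.pyplot as plt
from pykeen.pipeline import pipeline

# Configure an early stopper so that there are evaluations to plot.
pipeline_result = pipeline(
    dataset='Nations',
    model='TransE',
    stopper='early',
)
pipeline_result.plot_early_stopping()
plt.savefig('early_stopping.png', dpi=300)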

plot_er(**kwargs)[source]

Plot the reduced entities and relation vectors in 2D.

Parameters:

kwargs – The keyword arguments passed to pykeen.pipeline.plot_utils.plot_er()

Returns:

The axis

Warning

Plotting relations and entities on the same plot is only meaningful for translational distance models like TransE.
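
For example, with a translational distance model such as TransE, entities and relations can be projected into the same 2D plot (file name arbitrary):

import matplotlib.pyplot as plt
from pykeen.pipeline import pipeline

# TransE is translational, so plotting entities and relations together
# is meaningful (see the warning above).
pipeline_result = pipeline(dataset='Nations', model='TransE')
pipeline_result.plot_er()
plt.savefig('embeddings.png', dpi=300)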

plot_losses(**kwargs)[source]

Plot the losses per epoch.

Parameters:

kwargs – The keyword arguments passed to pykeen.pipeline.plot_utils.plot_losses().

Returns:

The axis
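
Since the axis is returned, the plot can be adjusted after the fact; a small sketch:

import matplotlib.pyplot as plt
from pykeen.pipeline import pipeline

pipeline_result = pipeline(dataset='Nations', model='TransE')
ax = pipeline_result.plot_losses()
# Losses often span orders of magnitude, so a log scale can help.
ax.set_yscale('log')
plt.savefig('losses.png', dpi=300)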

save_model(path: str | Path) → None[source]

Save the trained model to the given path using torch.save().

Parameters:

path (str | Path) – The path to which the model is saved. Should have an extension appropriate for a pickle, like *.pkl or *.pickle.

Return type:

None

The model contains within it the triples factory that was used for training.
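
For example, the saved pickle can be restored with torch.load(); the file name is arbitrary, and on recent PyTorch versions loading a full pickled model may require weights_only=False:

import torch
from pykeen.pipeline import pipeline

pipeline_result = pipeline(dataset='Nations', model='TransE')
pipeline_result.save_model('trained_model.pkl')
# Restore the full model object, including its training triples factory.
# PyTorch >= 2.6 defaults to weights_only=True, which rejects full pickles.
model = torch.load('trained_model.pkl', weights_only=False)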

save_to_directory(directory: str | Path, *, save_metadata: bool = True, save_replicates: bool = True, save_training: bool = True, **_kwargs) → None[source]

Save all artifacts in the given directory.

The serialization format looks as follows:

directory/
    results.json
    metadata.json
    trained_model.pkl
    training_triples/

All but the first component are optional and can be disabled, e.g. to save disk space during hyperparameter tuning. trained_model.pkl is the full model saved via torch.save(), and can thus be loaded via torch.load(), cf. torch’s serialization documentation. training_triples contains the training triples factory, including label-to-id mappings, if used. It has been saved via pykeen.triples.CoreTriplesFactory.to_path_binary(), and can be re-loaded via pykeen.triples.CoreTriplesFactory.from_path_binary().

Parameters:
  • directory (str | Path) – the directory path. It will be created including all parent directories if necessary

  • save_metadata (bool) – whether to save metadata, cf. PipelineResult.metadata

  • save_replicates (bool) – whether to save the trained model, cf. PipelineResult.save_model()

  • save_training (bool) – whether to save the training triples factory

  • _kwargs – additional keyword-based parameters, which are ignored

Return type:

None
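
As a sketch, the artifacts can be saved and the training triples factory re-loaded afterwards (the directory name is arbitrary, and TriplesFactory is assumed here because the pipeline was run with a label-based factory):

from pykeen.pipeline import pipeline
from pykeen.triples import TriplesFactory

pipeline_result = pipeline(dataset='Nations', model='TransE')
pipeline_result.save_to_directory('results/nations_transe')
# Re-load the training triples factory, including label-to-id mappings.
training = TriplesFactory.from_path_binary('results/nations_transe/training_triples')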

save_to_ftp(directory: str, ftp: FTP) → None[source]

Save all artifacts to the given directory in the FTP server.

Parameters:
  • directory (str) – The directory in the FTP server to save to

  • ftp (FTP) – A connection to the FTP server

Return type:

None

The following code will train a model and upload it to FTP using Python’s built-in ftplib.FTP:

import ftplib
from pykeen.pipeline import pipeline

directory = 'test/test'
pipeline_result = pipeline(
    model='TransE',
    dataset='Kinships',
)
with ftplib.FTP(host='0.0.0.0', user='user', passwd='12345') as ftp:
    pipeline_result.save_to_ftp(directory, ftp)

If you want to try this with your own local server, run the following code, which is based on the example from Giampaolo Rodola’s excellent library, pyftpdlib.

import os
from pyftpdlib.authorizers import DummyAuthorizer
from pyftpdlib.handlers import FTPHandler
from pyftpdlib.servers import FTPServer

authorizer = DummyAuthorizer()
authorizer.add_user("user", "12345", homedir=os.path.expanduser('~/ftp'), perm="elradfmwMT")

handler = FTPHandler
handler.authorizer = authorizer

address = ('0.0.0.0', 21)
server = FTPServer(address, handler)
server.serve_forever()

save_to_s3(directory: str, bucket: str, s3=None) → None[source]

Save all artifacts to the given directory in an S3 Bucket.

Parameters:
  • directory (str) – The directory in the S3 bucket

  • bucket (str) – The name of the S3 bucket

  • s3 – A client from boto3.client(), if already instantiated

Return type:

None

Note

You need to have a ~/.aws/credentials file set up. Read: https://realpython.com/python-boto3-aws-s3/

The following code will train a model and upload it to S3 using boto3:

import time
from pykeen.pipeline import pipeline
pipeline_result = pipeline(
    dataset='Kinships',
    model='TransE',
)
directory = f'tests/{time.strftime("%Y-%m-%d-%H%M%S")}'
bucket = 'pykeen'
pipeline_result.save_to_s3(directory, bucket=bucket)