Hyper-parameter Optimization

class HpoPipelineResult(study, objective)[source]

A container for the results of the HPO pipeline.

Parameters:
  • study (Study) – The completed optuna study

  • objective (Objective) – The objective used during optimization

objective: Objective

The objective class, containing information on preset hyper-parameters and those to optimize

replicate_best_pipeline(*, directory, replicates, move_to_cpu=False, save_replicates=True, save_training=False)[source]

Run the pipeline on the best configuration, but this time on the “test” set instead of the “evaluation” set.

Parameters:
  • directory (Union[str, Path]) – Output directory

  • replicates (int) – The number of times to retrain the model

  • move_to_cpu (bool) – Should the model be moved back to the CPU? Only relevant if training on GPU.

  • save_replicates (bool) – Should the artifacts of the replicates be saved?

  • save_training (bool) – Should the training triples be saved?

Raises:

ValueError – if "use_testing_data" is provided in the best pipeline’s config.

Return type:

None

save_to_directory(directory, **kwargs)[source]

Dump the results of a study to the given directory.

Parameters:
  • directory (str | Path) – The directory to which the results are saved

Return type:

None

save_to_ftp(directory, ftp)[source]

Save the results to the directory in an FTP server.

Parameters:
  • directory (str) – The directory in the FTP server to save to

  • ftp (FTP) – A connection to the FTP server

save_to_s3(directory, bucket, s3=None)[source]

Save all artifacts to the given directory in an S3 Bucket.

Parameters:
  • directory (str) – The directory in the S3 bucket

  • bucket (str) – The name of the S3 bucket

  • s3 – A client from boto3.client(), if already instantiated

Return type:

None

study: Study

The optuna study object

hpo_pipeline_from_path(path, **kwargs)[source]

Run an HPO study from the configuration at the given path.

Parameters:
  • path (str | Path) – Path to the configuration file

Return type:

HpoPipelineResult

hpo_pipeline_from_config(config, **kwargs)[source]

Run the HPO pipeline using a properly formatted configuration dictionary.

Parameters:
  • config (Mapping[str, Any]) – The HPO configuration dictionary

Return type:

HpoPipelineResult
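A minimal sketch of what such a configuration dictionary might look like. The top-level "pipeline"/"optuna" split and the range-dictionary keys ("type", "low", "high", "q") are assumptions based on common PyKEEN conventions, not verified against the schema; check the library's documentation before relying on them.

```python
# Hypothetical HPO configuration dictionary; the key layout is an assumption.
config = {
    "optuna": {
        "n_trials": 30,
        "metric": "hits@10",
        "direction": "maximize",
    },
    "pipeline": {
        "dataset": "Nations",
        "model": "TransE",
        "model_kwargs_ranges": {
            # sample the embedding dimension from {64, 128, 192, 256}
            "embedding_dim": {"type": "int", "low": 64, "high": 256, "q": 64},
        },
    },
}

# With PyKEEN installed, the dictionary would then be passed directly:
# result = hpo_pipeline_from_config(config)
```

The same dictionary, serialized to JSON on disk, is the kind of file that hpo_pipeline_from_path expects a path to.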

hpo_pipeline(*, dataset=None, dataset_kwargs=None, training=None, testing=None, validation=None, evaluation_entity_whitelist=None, evaluation_relation_whitelist=None, model, model_kwargs=None, model_kwargs_ranges=None, loss=None, loss_kwargs=None, loss_kwargs_ranges=None, regularizer=None, regularizer_kwargs=None, regularizer_kwargs_ranges=None, optimizer=None, optimizer_kwargs=None, optimizer_kwargs_ranges=None, lr_scheduler=None, lr_scheduler_kwargs=None, lr_scheduler_kwargs_ranges=None, training_loop=None, training_loop_kwargs=None, negative_sampler=None, negative_sampler_kwargs=None, negative_sampler_kwargs_ranges=None, epochs=None, training_kwargs=None, training_kwargs_ranges=None, stopper=None, stopper_kwargs=None, evaluator=None, evaluator_kwargs=None, evaluation_kwargs=None, metric=None, filter_validation_when_testing=True, result_tracker=None, result_tracker_kwargs=None, device=None, storage=None, sampler=None, sampler_kwargs=None, pruner=None, pruner_kwargs=None, study_name=None, direction=None, load_if_exists=False, n_trials=None, timeout=None, gc_after_trial=None, n_jobs=None, save_model_directory=None)[source]

Train a model on the given dataset.

Return type:

HpoPipelineResult

Returns:

the optimization result

Raises:

ValueError – if early stopping is enabled while the number of epochs is also among the hyper-parameters to optimize.
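A minimal sketch of how the *_kwargs_ranges arguments might be populated. The range-dictionary keys ("type", "low", "high", "q", "scale") are assumptions and should be verified against the library's HPO documentation; the call itself is shown commented since it requires a PyKEEN installation.

```python
# Hypothetical hyper-parameter ranges for hpo_pipeline (key names assumed).
model_kwargs_ranges = {
    # sample the embedding dimension from {64, 128, 192, 256}
    "embedding_dim": {"type": "int", "low": 64, "high": 256, "q": 64},
}
optimizer_kwargs_ranges = {
    # sample the learning rate on a log scale
    "lr": {"type": "float", "low": 1e-4, "high": 1e-1, "scale": "log"},
}

# With PyKEEN installed, a study could then be launched as:
# result = hpo_pipeline(
#     dataset="Nations",
#     model="TransE",
#     model_kwargs_ranges=model_kwargs_ranges,
#     optimizer_kwargs_ranges=optimizer_kwargs_ranges,
#     n_trials=30,
# )
```

Fixed values go in the corresponding *_kwargs arguments, while only the keys listed in *_kwargs_ranges are optimized across trials.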