Ablation
- ablation_pipeline(datasets, directory, models, losses, optimizers, training_loops, *, epochs=None, create_inverse_triples=False, regularizers=None, negative_sampler=None, evaluator=None, stopper='NopStopper', model_to_model_kwargs=None, model_to_model_kwargs_ranges=None, model_to_loss_to_loss_kwargs=None, model_to_loss_to_loss_kwargs_ranges=None, model_to_optimizer_to_optimizer_kwargs=None, model_to_optimizer_to_optimizer_kwargs_ranges=None, model_to_negative_sampler_to_negative_sampler_kwargs=None, model_to_negative_sampler_to_negative_sampler_kwargs_ranges=None, model_to_training_loop_to_training_loop_kwargs=None, model_to_training_loop_to_training_kwargs=None, model_to_training_loop_to_training_kwargs_ranges=None, model_to_regularizer_to_regularizer_kwargs=None, model_to_regularizer_to_regularizer_kwargs_ranges=None, evaluator_kwargs=None, evaluation_kwargs=None, stopper_kwargs=None, n_trials=5, timeout=3600, metric='hits@10', direction='maximize', sampler='random', pruner='nop', metadata=None, save_artifacts=True, move_to_cpu=True, dry_run=False, best_replicates=None, discard_replicates=False, create_unique_subdir=False)[source]
Run ablation study.
- Parameters
datasets (Union[str, List[str]]) – A dataset name or a list of dataset names.
directory (Union[str, Path]) – The directory in which the experimental artifacts will be saved.
models (Union[str, List[str]]) – A model name or a list of model names.
losses (Union[str, List[str]]) – A loss function name or a list of loss function names.
optimizers (Union[str, List[str]]) – An optimizer name or a list of optimizer names.
training_loops (Union[str, List[str]]) – A training loop name or a list of training loop names.
epochs (Optional[int]) – A quick way to set the num_epochs in the training kwargs.
create_inverse_triples (Union[bool, List[bool]]) – Either a boolean for a single entry or a list of booleans.
regularizers (Union[None, str, List[str]]) – A regularizer name, a list of regularizer names, or None if no regularizer is desired.
negative_sampler (Optional[str]) – A negative sampler name, or None if no negative sampler is desired. Negative sampling is used only in combination with pykeen.training.SLCWATrainingLoop.
evaluator (Optional[str]) – The name of the evaluator to be used. Defaults to the rank-based evaluator.
stopper (Optional[str]) – The name of the stopper to be used. Defaults to NopStopper, which does not define a stopping criterion.
model_to_model_kwargs (Optional[Mapping[str, Mapping[str, Any]]]) – A mapping from model name to dictionaries of default keyword arguments for the instantiation of that model.
model_to_model_kwargs_ranges (Optional[Mapping[str, Mapping[str, Any]]]) – A mapping from model name to dictionaries of keyword argument ranges for that model to be used in HPO.
model_to_loss_to_loss_kwargs (Optional[Mapping[str, Mapping[str, Mapping[str, Any]]]]) – A mapping from model name to a mapping of loss name to a mapping of default keyword arguments for the instantiation of that loss function. This is useful because some losses have hyper-parameters, such as pykeen.losses.MarginRankingLoss.
model_to_loss_to_loss_kwargs_ranges (Optional[Mapping[str, Mapping[str, Mapping[str, Any]]]]) – A mapping from model name to a mapping of loss name to a mapping of keyword argument ranges for that loss to be used in HPO.
model_to_optimizer_to_optimizer_kwargs (Optional[Mapping[str, Mapping[str, Mapping[str, Any]]]]) – A mapping from model name to a mapping of optimizer name to a mapping of default keyword arguments for the instantiation of that optimizer. This is useful because optimizers have hyper-parameters such as the learning rate.
model_to_optimizer_to_optimizer_kwargs_ranges (Optional[Mapping[str, Mapping[str, Mapping[str, Any]]]]) – A mapping from model name to a mapping of optimizer name to a mapping of keyword argument ranges for that optimizer to be used in HPO.
model_to_regularizer_to_regularizer_kwargs (Optional[Mapping[str, Mapping[str, Mapping[str, Any]]]]) – A mapping from model name to a mapping of regularizer name to a mapping of default keyword arguments for the instantiation of that regularizer. This is useful because regularizers have hyper-parameters such as the regularization weight.
model_to_regularizer_to_regularizer_kwargs_ranges (Optional[Mapping[str, Mapping[str, Mapping[str, Any]]]]) – A mapping from model name to a mapping of regularizer name to a mapping of keyword argument ranges for that regularizer to be used in HPO.
model_to_negative_sampler_to_negative_sampler_kwargs (Optional[Mapping[str, Mapping[str, Mapping[str, Any]]]]) – A mapping from model name to a mapping of negative sampler name to a mapping of default keyword arguments for the instantiation of that negative sampler. This is useful because negative samplers have hyper-parameters such as the number of negatives to generate for each positive training example.
model_to_negative_sampler_to_negative_sampler_kwargs_ranges (Optional[Mapping[str, Mapping[str, Mapping[str, Any]]]]) – A mapping from model name to a mapping of negative sampler name to a mapping of keyword argument ranges for that negative sampler to be used in HPO.
model_to_training_loop_to_training_loop_kwargs (Optional[Mapping[str, Mapping[str, Mapping[str, Any]]]]) – A mapping from model name to a mapping of training loop name to a mapping of default keyword arguments for the training loop.
model_to_training_loop_to_training_kwargs (Optional[Mapping[str, Mapping[str, Mapping[str, Any]]]]) – A mapping from model name to a mapping of trainer name to a mapping of default keyword arguments for the training procedure. This is useful for setting hyper-parameters such as the number of training epochs and the batch size.
model_to_training_loop_to_training_kwargs_ranges (Optional[Mapping[str, Mapping[str, Mapping[str, Any]]]]) – A mapping from model name to a mapping of trainer name to a mapping of keyword argument ranges for that trainer to be used in HPO.
evaluator_kwargs (Optional[Mapping[str, Any]]) – The keyword arguments passed to the evaluator.
evaluation_kwargs (Optional[Mapping[str, Any]]) – The keyword arguments passed during evaluation.
stopper_kwargs (Optional[Mapping[str, Any]]) – The keyword arguments passed to the stopper.
timeout (Optional[int]) – The time (in seconds) after which the ablation study will be terminated.
direction (Optional[str]) – Defines whether to 'maximize' or 'minimize' the metric during HPO.
sampler (Optional[str]) – The HPO sampler; defaults to random search.
pruner (Optional[str]) – Defines the approach for pruning trials. By default, no pruning is used, i.e., the 'nop' pruner is set.
metadata (Optional[Mapping]) – A mapping of metadata arguments, such as the name of the ablation study.
save_artifacts (bool) – Defines whether each trained model sampled during HPO should be saved.
move_to_cpu (bool) – Defines whether a replicate of the best model should be moved to CPU.
dry_run (bool) – Defines whether only the configurations for the single experiments should be created, without running them.
best_replicates (Optional[int]) – Defines how often the final model should be re-trained and evaluated based on the best hyper-parameters, enabling measurement of the variance in performance.
discard_replicates (bool) – Defines whether the best model should be discarded after training and evaluation.
create_unique_subdir (bool) – Defines whether a unique sub-directory for the experimental artifacts should be created. The sub-directory name is defined by the current date plus a unique ID.
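The many nested `model_to_…` mappings all follow the same pattern: model name → component name → keyword arguments (or ranges). A minimal sketch of how such mappings might be assembled is shown below; the dataset, model, and component names are illustrative, and the range-dict keys (`type`, `low`, `high`, `scale`) are assumptions based on common Optuna-style HPO specifications — consult the PyKEEN HPO documentation for the exact schema.

```python
# Default kwargs: {model_name: {kwarg: value}}
model_to_model_kwargs = {
    "TransE": {"embedding_dim": 50},
}

# HPO ranges: {model_name: {kwarg: range_spec}}
# (range_spec keys here are an assumption, not the verified schema)
model_to_model_kwargs_ranges = {
    "TransE": {"embedding_dim": {"type": "int", "low": 16, "high": 256}},
}

# Three-level mappings add the component name in the middle:
# {model_name: {optimizer_name: {kwarg: range_spec}}}
model_to_optimizer_to_optimizer_kwargs_ranges = {
    "TransE": {"Adam": {"lr": {"type": "float", "low": 1e-4, "high": 1e-1, "scale": "log"}}},
}

# The actual call (requires pykeen; shown only for argument shape):
# from pykeen.ablation import ablation_pipeline
# ablation_pipeline(
#     datasets="Nations",
#     directory="ablation_results",
#     models=["TransE"],
#     losses=["MarginRankingLoss"],
#     optimizers=["Adam"],
#     training_loops=["SLCWA"],
#     epochs=100,
#     n_trials=5,
#     metric="hits@10",
#     model_to_model_kwargs=model_to_model_kwargs,
#     model_to_model_kwargs_ranges=model_to_model_kwargs_ranges,
#     model_to_optimizer_to_optimizer_kwargs_ranges=model_to_optimizer_to_optimizer_kwargs_ranges,
# )
```

Because every list-valued argument (datasets, models, losses, optimizers, training loops) is crossed with the others, the total number of HPO experiments grows multiplicatively with each added entry.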
- prepare_ablation_from_config(config, directory, save_artifacts)[source]
Prepare a set of ablation study directories.
- Parameters
config (Mapping[str, Any]) – Dictionary defining the ablation studies.
directory (Union[str, Path]) – The directory in which the experimental artifacts (including the ablation configurations) will be saved.
save_artifacts (bool) – Defines whether the output directories for the trained models sampled during HPO should be created.
- Return type
- Returns
pairs of output directories and HPO config paths inside those directories
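A sketch of what such a config mapping might look like. The top-level keys ("metadata", "ablation", "optuna") are an assumption based on PyKEEN's ablation configuration files; check a shipped example configuration for the authoritative layout.

```python
# Hypothetical ablation-study configuration mapping
# (key layout is assumed, not verified against a shipped config).
config = {
    "metadata": {"title": "TransE ablation on Nations"},
    "ablation": {
        "datasets": ["Nations"],
        "models": ["TransE"],
        "losses": ["MarginRankingLoss"],
        "optimizers": ["Adam"],
        "training_loops": ["SLCWA"],
    },
    "optuna": {"n_trials": 5, "metric": "hits@10", "direction": "maximize"},
}

# With pykeen installed, the returned pairs can be iterated directly:
# from pykeen.ablation import prepare_ablation_from_config
# for output_directory, hpo_config_path in prepare_ablation_from_config(
#     config, directory="ablation_results", save_artifacts=True,
# ):
#     print(output_directory, hpo_config_path)
```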
- prepare_ablation(datasets, models, losses, optimizers, training_loops, directory, *, epochs=None, create_inverse_triples=False, regularizers=None, negative_sampler=None, evaluator=None, model_to_model_kwargs=None, model_to_model_kwargs_ranges=None, model_to_loss_to_loss_kwargs=None, model_to_loss_to_loss_kwargs_ranges=None, model_to_optimizer_to_optimizer_kwargs=None, model_to_optimizer_to_optimizer_kwargs_ranges=None, model_to_training_loop_to_training_loop_kwargs=None, model_to_neg_sampler_to_neg_sampler_kwargs=None, model_to_neg_sampler_to_neg_sampler_kwargs_ranges=None, model_to_training_loop_to_training_kwargs=None, model_to_training_loop_to_training_kwargs_ranges=None, model_to_regularizer_to_regularizer_kwargs=None, model_to_regularizer_to_regularizer_kwargs_ranges=None, n_trials=5, timeout=3600, metric='hits@10', direction='maximize', sampler='random', pruner='nop', evaluator_kwargs=None, evaluation_kwargs=None, stopper='NopStopper', stopper_kwargs=None, metadata=None, save_artifacts=True)[source]
Prepare an ablation directory.
- Parameters
datasets (Union[str, List[str]]) – A dataset name or a list of dataset names.
models (Union[str, List[str]]) – A model name or a list of model names.
losses (Union[str, List[str]]) – A loss function name or a list of loss function names.
optimizers (Union[str, List[str]]) – An optimizer name or a list of optimizer names.
training_loops (Union[str, List[str]]) – A training loop name or a list of training loop names.
epochs (Optional[int]) – A quick way to set the num_epochs in the training kwargs.
create_inverse_triples (Union[bool, List[bool]]) – Either a boolean for a single entry or a list of booleans.
regularizers (Union[None, str, List[str], List[None]]) – A regularizer name, a list of regularizer names, or None if no regularizer is desired.
negative_sampler (Optional[str]) – A negative sampler name, or None if no negative sampler is desired. Negative sampling is used only in combination with pykeen.training.SLCWATrainingLoop.
evaluator (Optional[str]) – The name of the evaluator to be used. Defaults to the rank-based evaluator.
stopper (Optional[str]) – The name of the stopper to be used. Defaults to NopStopper, which does not define a stopping criterion.
model_to_model_kwargs (Optional[Mapping[str, Mapping[str, Any]]]) – A mapping from model name to dictionaries of default keyword arguments for the instantiation of that model.
model_to_model_kwargs_ranges (Optional[Mapping[str, Mapping[str, Any]]]) – A mapping from model name to dictionaries of keyword argument ranges for that model to be used in HPO.
model_to_loss_to_loss_kwargs (Optional[Mapping[str, Mapping[str, Mapping[str, Any]]]]) – A mapping from model name to a mapping of loss name to a mapping of default keyword arguments for the instantiation of that loss function. This is useful because some losses have hyper-parameters, such as pykeen.losses.MarginRankingLoss.
model_to_loss_to_loss_kwargs_ranges (Optional[Mapping[str, Mapping[str, Mapping[str, Any]]]]) – A mapping from model name to a mapping of loss name to a mapping of keyword argument ranges for that loss to be used in HPO.
model_to_optimizer_to_optimizer_kwargs (Optional[Mapping[str, Mapping[str, Mapping[str, Any]]]]) – A mapping from model name to a mapping of optimizer name to a mapping of default keyword arguments for the instantiation of that optimizer. This is useful because optimizers have hyper-parameters such as the learning rate.
model_to_optimizer_to_optimizer_kwargs_ranges (Optional[Mapping[str, Mapping[str, Mapping[str, Any]]]]) – A mapping from model name to a mapping of optimizer name to a mapping of keyword argument ranges for that optimizer to be used in HPO.
model_to_regularizer_to_regularizer_kwargs (Optional[Mapping[str, Mapping[str, Mapping[str, Any]]]]) – A mapping from model name to a mapping of regularizer name to a mapping of default keyword arguments for the instantiation of that regularizer. This is useful because regularizers have hyper-parameters such as the regularization weight.
model_to_regularizer_to_regularizer_kwargs_ranges (Optional[Mapping[str, Mapping[str, Mapping[str, Any]]]]) – A mapping from model name to a mapping of regularizer name to a mapping of keyword argument ranges for that regularizer to be used in HPO.
model_to_neg_sampler_to_neg_sampler_kwargs (Optional[Mapping[str, Mapping[str, Mapping[str, Any]]]]) – A mapping from model name to a mapping of negative sampler name to a mapping of default keyword arguments for the instantiation of that negative sampler. This is useful because negative samplers have hyper-parameters such as the number of negatives to generate for each positive training example.
model_to_neg_sampler_to_neg_sampler_kwargs_ranges (Optional[Mapping[str, Mapping[str, Mapping[str, Any]]]]) – A mapping from model name to a mapping of negative sampler name to a mapping of keyword argument ranges for that negative sampler to be used in HPO.
model_to_training_loop_to_training_loop_kwargs (Optional[Mapping[str, Mapping[str, Mapping[str, Any]]]]) – A mapping from model name to a mapping of training loop name to a mapping of default keyword arguments for the training loop.
model_to_training_loop_to_training_kwargs (Optional[Mapping[str, Mapping[str, Mapping[str, Any]]]]) – A mapping from model name to a mapping of trainer name to a mapping of default keyword arguments for the training procedure. This is useful for setting hyper-parameters such as the number of training epochs and the batch size.
model_to_training_loop_to_training_kwargs_ranges (Optional[Mapping[str, Mapping[str, Mapping[str, Any]]]]) – A mapping from model name to a mapping of trainer name to a mapping of keyword argument ranges for that trainer to be used in HPO.
evaluator_kwargs (Optional[Mapping[str, Any]]) – The keyword arguments passed to the evaluator.
evaluation_kwargs (Optional[Mapping[str, Any]]) – The keyword arguments passed during evaluation.
stopper_kwargs (Optional[Mapping[str, Any]]) – The keyword arguments passed to the stopper.
timeout (Optional[int]) – The time (in seconds) after which the ablation study will be terminated.
direction (Optional[str]) – Defines whether to 'maximize' or 'minimize' the metric during HPO.
sampler (Optional[str]) – The HPO sampler; defaults to random search.
pruner (Optional[str]) – Defines the approach for pruning trials. By default, no pruning is used, i.e., the 'nop' pruner is set.
metadata (Optional[Mapping]) – A mapping of metadata arguments, such as the name of the ablation study.
directory (Union[str, Path]) – The directory in which the experimental artifacts will be saved.
save_artifacts (bool) – Defines whether each trained model sampled during HPO should be saved.
- Return type
- Returns
pairs of output directories and HPO config paths inside those directories.
- Raises
ValueError – If a dataset is not specified correctly, i.e., it is neither of type str nor a dictionary containing the paths to the training, testing, and validation data.
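The Raises note implies that each dataset may be given either by name (a str) or as a mapping with paths to the three splits. A sketch of both forms follows; the file paths are placeholders, and `is_valid_dataset_spec` is a hypothetical helper written here only to mirror the documented constraint — it is not part of PyKEEN.

```python
# Two ways a dataset can be specified, per the ValueError description:
dataset_by_name = "Nations"
dataset_by_paths = {
    "training": "data/train.txt",
    "testing": "data/test.txt",
    "validation": "data/valid.txt",
}

def is_valid_dataset_spec(spec):
    """Hypothetical check mirroring the documented constraint:
    a dataset must be a str, or a dict with the three split paths."""
    if isinstance(spec, str):
        return True
    return isinstance(spec, dict) and {"training", "testing", "validation"} <= set(spec)

# Anything else (e.g., an int) would trigger the ValueError described above.
```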