Utilities

Utilities for PyKEEN.

class NoRandomSeedNecessary

Used in pipeline when random seed is set automatically.

class Result

A superclass of results that can be saved to a directory.

abstract save_to_directory(directory, **kwargs)

Save the results to the directory.

Return type

None

abstract save_to_ftp(directory, ftp)

Save the results to the directory in an FTP server.

Return type

None

abstract save_to_s3(directory, bucket, s3=None)

Save all artifacts to the given directory in an S3 Bucket.

Parameters
  • directory (str) – The directory in the S3 bucket

  • bucket (str) – The name of the S3 bucket

  • s3 – A client from boto3.client(), if already instantiated

Return type

None

all_in_bounds(x, low=None, high=None, a_tol=0.0)

Check if tensor values respect lower and upper bound.

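Parameters
  • x – The tensor to check.

  • low – The lower bound, if any.

  • high – The upper bound, if any.

  • a_tol – The absolute tolerance applied when checking the bounds.

Return type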

bool

Returns

If all values are within the given bounds

broadcast_cat(x, y, dim)

Concatenate with broadcasting.

Parameters
  • x (FloatTensor) – The first tensor.

  • y (FloatTensor) – The second tensor.

  • dim (int) – The concat dimension.

Return type

FloatTensor

Returns

A concatenated, broadcasted tensor.

Raises
  • ValueError – if x and y do not have the same number of dimensions

  • ValueError – if broadcasting is not possible

calculate_broadcasted_elementwise_result_shape(first, second)

Determine the return shape of a broadcasted elementwise operation.

Return type

Tuple[int, …]
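
The rule behind this function can be illustrated without PyTorch. The following is a minimal sketch, not PyKEEN's implementation: broadcast_shape_sketch is a hypothetical helper, and it assumes that both shapes have the same number of dimensions and that two sizes are compatible when they are equal or one of them is 1.

```python
from typing import Tuple


def broadcast_shape_sketch(first: Tuple[int, ...], second: Tuple[int, ...]) -> Tuple[int, ...]:
    """Return the shape of an element-wise operation on two shapes of equal rank."""
    if len(first) != len(second):
        raise ValueError(f"Rank mismatch: {first} vs. {second}")
    result = []
    for a, b in zip(first, second):
        # Sizes are compatible if they are equal or if one of them is 1.
        if a != b and 1 not in (a, b):
            raise ValueError(f"Cannot broadcast {first} with {second}")
        # A size of 1 is stretched to the size of the other operand.
        result.append(b if a == 1 else a)
    return tuple(result)


print(broadcast_shape_sketch((1, 3, 1), (2, 1, 4)))  # (2, 3, 4)
```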

check_shapes(*x, raise_on_errors=True)

Verify that a sequence of tensors are of matching shapes.

Parameters
  • x (Tuple[Union[Tensor, Tuple[int, …]], str]) – A tuple (t, s), where t is a tensor, or an actual shape of a tensor (a tuple of integers), and s is a string, where each character corresponds to a (named) dimension. If the shapes of different tensors share a character, the corresponding dimensions are expected to be of equal size.

  • raise_on_errors (bool) – Whether to raise an exception in case of a mismatch.

Return type

bool

Returns

Whether the shapes matched.

Raises

ValueError – If the shapes mismatch and raise_on_errors is True.

Examples:

>>> check_shapes(((10, 20), "bd"), ((10, 20, 20), "bdd"))
True
>>> check_shapes(((10, 20), "bd"), ((10, 30, 20), "bdd"), raise_on_errors=False)
False
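
The named-dimension check described above can be sketched in plain Python. This is an illustration under stated assumptions, not PyKEEN's code: check_shapes_sketch is a hypothetical name, and it accepts only shape tuples (no tensors).

```python
from typing import Dict, Set, Tuple


def check_shapes_sketch(*pairs: Tuple[Tuple[int, ...], str], raise_on_errors: bool = True) -> bool:
    """Check that dimensions sharing a name also share a size."""
    sizes: Dict[str, Set[int]] = {}
    for shape, names in pairs:
        if len(shape) != len(names):
            raise ValueError(f"Shape {shape} does not match dimension names {names!r}")
        for size, name in zip(shape, names):
            # Collect every size observed for this dimension name.
            sizes.setdefault(name, set()).add(size)
    # A name with more than one observed size is a mismatch.
    mismatches = {name: sorted(seen) for name, seen in sizes.items() if len(seen) > 1}
    if mismatches and raise_on_errors:
        raise ValueError(f"Shape verification failed: {mismatches}")
    return not mismatches
```

With the shapes from the examples, the sketch reports a match for "bd" / "bdd" with sizes (10, 20) and (10, 20, 20), and a mismatch on d when the second shape is (10, 30, 20).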

clamp_norm(x, maxnorm, p='fro', dim=None, eps=1e-08)

Ensure that a tensor’s norm does not exceed some threshold.

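Parameters
  • x – The tensor.

  • maxnorm – The maximum norm (threshold).

  • p – The norm type.

  • dim – The dimension(s) along which the norm is computed.

  • eps – A small value used to avoid division by zero.

Return type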

Tensor

Returns

A vector with \(\|x\| \leq \text{maxnorm}\).
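
As an illustration of the idea, here is a minimal sketch for a single vector stored as a Python list. It is not PyKEEN's implementation: clamp_norm_sketch is a hypothetical helper, it assumes the Euclidean norm, and it assumes the vector is rescaled by min(1, maxnorm / max(norm, eps)).

```python
import math
from typing import List


def clamp_norm_sketch(x: List[float], maxnorm: float, eps: float = 1e-8) -> List[float]:
    """Rescale x so that its Euclidean norm is at most maxnorm."""
    norm = math.sqrt(sum(v * v for v in x))
    # Vectors inside the ball keep scale 1; eps guards against division by zero.
    scale = min(1.0, maxnorm / max(norm, eps))
    return [v * scale for v in x]


clamped = clamp_norm_sketch([3.0, 4.0], maxnorm=1.0)  # norm 5 is scaled down to norm 1
unchanged = clamp_norm_sketch([0.3, 0.4], maxnorm=1.0)  # norm 0.5 is left as is
```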

combine_complex(x_re, x_im)

Combine a complex tensor from real and imaginary part.

Return type

FloatTensor

compact_mapping(mapping)

Update a mapping (key -> id) such that the IDs range from 0 to len(mapping) - 1.

Parameters

mapping (Mapping[~X, int]) – The mapping to compact.

Return type

Tuple[Mapping[~X, int], Mapping[int, int]]

Returns

A pair (translated, translation) where translated is the updated mapping, and translation a dictionary from old to new ids.
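
A minimal sketch of the compaction follows. compact_mapping_sketch is a hypothetical stand-in, and assigning the new IDs in ascending order of the old IDs is an assumption made for this illustration.

```python
from typing import Dict, Hashable, Mapping, Tuple


def compact_mapping_sketch(mapping: Mapping[Hashable, int]) -> Tuple[Dict[Hashable, int], Dict[int, int]]:
    """Renumber the IDs of a key -> id mapping to 0 .. n - 1."""
    # Assumption: new IDs follow the ascending order of the old IDs.
    translation = {old: new for new, old in enumerate(sorted(set(mapping.values())))}
    translated = {key: translation[old] for key, old in mapping.items()}
    return translated, translation


translated, translation = compact_mapping_sketch({"a": 3, "b": 7, "c": 10})
print(translated)   # {'a': 0, 'b': 1, 'c': 2}
print(translation)  # {3: 0, 7: 1, 10: 2}
```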

class compose(*operations)

A class representing the composition of several functions.

Initialize the composition with a sequence of operations.

Parameters

operations (Callable[[~X], ~X]) – unary operations that will be applied in succession
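
The behavior described here, applying unary operations in succession, can be sketched as follows. ComposeSketch is a hypothetical stand-in written for this illustration; it assumes the operations are applied from left to right.

```python
from typing import Callable, TypeVar

X = TypeVar("X")


class ComposeSketch:
    """Apply a sequence of unary operations one after another."""

    def __init__(self, *operations: Callable[[X], X]) -> None:
        self.operations = operations

    def __call__(self, x: X) -> X:
        # Feed the output of each operation into the next one.
        for operation in self.operations:
            x = operation(x)
        return x


add_one_then_double = ComposeSketch(lambda v: v + 1, lambda v: v * 2)
print(add_one_then_double(3))  # 8
```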

convert_to_canonical_shape(x, dim, num=None, batch_size=1, suffix_shape=-1)

Convert a tensor to canonical shape.

Parameters
  • x (FloatTensor) – The tensor in compatible shape.

  • dim (Union[int, str]) – The “num” dimension.

  • batch_size (int) – The batch size.

  • num (Optional[int]) – The number.

  • suffix_shape (Union[int, Sequence[int]]) – The suffix shape.

Return type

FloatTensor

Returns

shape: (batch_size, num_heads, num_relations, num_tails, *) A tensor in canonical shape.

ensure_ftp_directory(*, ftp, directory)

Ensure the directory exists on the FTP server.

Return type

None

ensure_torch_random_state(random_state)

Prepare a random state for PyTorch.

Return type

Generator

ensure_tuple(*x)

Ensure that all elements in the sequence are upgraded to sequences.

Parameters

x (Union[~X, Sequence[~X]]) – A sequence of sequences or literals

Return type

Sequence[Sequence[~X]]

Returns

An upgraded sequence of sequences

>>> ensure_tuple(1, (1,), (1, 2))
((1,), (1,), (1, 2))

estimate_cost_of_sequence(shape, *other_shapes)

Cost of a sequence of broadcasted element-wise operations of tensors, given their shapes.

Return type

int

extend_batch(batch, all_ids, dim)

Extend batch for 1-to-all scoring by explicit enumeration.

Parameters
  • batch (LongTensor) – shape: (batch_size, 2) The batch.

  • all_ids (List[int]) – len: num_choices The IDs to enumerate.

  • dim (int) – in {0,1,2} The column along which to insert the enumerated IDs.

Return type

LongTensor

Returns

shape: (batch_size * num_choices, 3) A large batch, where every pair from the original batch is combined with every ID.
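
The explicit enumeration can be sketched with nested lists instead of tensors. extend_batch_sketch is a hypothetical helper, and the row order (all IDs for the first pair, then all IDs for the second pair, and so on) is an assumption made for this illustration.

```python
from typing import List


def extend_batch_sketch(batch: List[List[int]], all_ids: List[int], dim: int) -> List[List[int]]:
    """Combine every pair in the batch with every ID, inserted at column dim."""
    if dim not in (0, 1, 2):
        raise ValueError(f"dim must be 0, 1, or 2, but is {dim}")
    extended = []
    for pair in batch:
        for identifier in all_ids:
            row = list(pair)
            # Insert the enumerated ID at the requested column.
            row.insert(dim, identifier)
            extended.append(row)
    return extended


rows = extend_batch_sketch([[0, 5], [1, 6]], all_ids=[7, 8, 9], dim=2)
print(len(rows))  # 6, i.e. batch_size * num_choices
print(rows[0])    # [0, 5, 7]
```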

extended_einsum(eq, *tensors)

Drop dimensions of size 1 to allow broadcasting.

Return type

FloatTensor

fix_dataclass_init_docs(cls)

Fix the __init__ documentation for a dataclasses.dataclass.

Parameters

cls (Type) – The class whose docstring needs fixing

Return type

Type

Returns

The class that was passed so this function can be used as a decorator

flatten_dictionary(dictionary, prefix=None, sep='.')

Flatten a nested dictionary.

Return type

Dict[str, Any]
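
A minimal sketch of flattening follows. flatten_dictionary_sketch is a hypothetical helper; joining nested keys with sep and prepending the optional prefix is the behavior assumed here.

```python
from typing import Any, Dict, Mapping, Optional


def flatten_dictionary_sketch(
    dictionary: Mapping[str, Any],
    prefix: Optional[str] = None,
    sep: str = ".",
) -> Dict[str, Any]:
    """Flatten nested dictionaries into one level with joined keys."""
    result: Dict[str, Any] = {}
    for key, value in dictionary.items():
        full_key = str(key) if prefix is None else f"{prefix}{sep}{key}"
        if isinstance(value, dict):
            # Recurse into nested dictionaries, carrying the key path along.
            result.update(flatten_dictionary_sketch(value, prefix=full_key, sep=sep))
        else:
            result[full_key] = value
    return result


print(flatten_dictionary_sketch({"a": {"b": 1, "c": {"d": 2}}, "e": 3}))
# {'a.b': 1, 'a.c.d': 2, 'e': 3}
```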

format_relative_comparison(part, total)

Format a relative comparison.

Return type

str

get_batchnorm_modules(module)

Return all submodules which are batch normalization layers.

Return type

List[Module]

get_benchmark(name)

Get the benchmark directory for this version.

Return type

Path

get_df_io(df)

Get the dataframe as bytes.

Return type

BytesIO

get_expected_norm(p, d)

Compute the expected value of the L_p norm.

\[E[\|x\|_p] = d^{1/p} E[|x_1|^p]^{1/p}\]

under the assumption that \(x_i \sim N(0, 1)\), i.e.

\[E[|x_1|^p] = 2^{p/2} \cdot \Gamma\left(\frac{p+1}{2}\right) \cdot \pi^{-1/2}\]

Parameters
  • p (Union[int, float, str]) – The parameter p of the norm.

  • d (int) – The dimension of the vector.

Return type

float

Returns

The expected value.
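
The two formulas given for this function can be evaluated directly with the standard library. This is a sketch for finite numeric p > 0 only; expected_norm_sketch is a hypothetical name, and string values of p are not handled.

```python
import math


def expected_norm_sketch(p: float, d: int) -> float:
    """Evaluate d^(1/p) * E[|x_1|^p]^(1/p) for x_i ~ N(0, 1)."""
    # E[|x_1|^p] = 2^(p/2) * Gamma((p + 1) / 2) / sqrt(pi)
    abs_moment = 2 ** (p / 2) * math.gamma((p + 1) / 2) / math.sqrt(math.pi)
    return (d * abs_moment) ** (1 / p)


value = expected_norm_sketch(p=2, d=4)
```

For p = 2 the absolute moment equals 1, so the value reduces to the square root of d, which is 2 for d = 4 up to floating-point error.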

Raises

get_json_bytes_io(obj)

Get the JSON as bytes.

Return type

BytesIO

get_model_io(model)

Get the model as bytes.

Return type

BytesIO

get_optimal_sequence(*shapes)

Find the optimal sequence in which to combine tensors elementwise based on the shapes.

Parameters

shapes (Tuple[int, …]) – The shapes of the tensors to combine.

Return type

Tuple[int, Tuple[int, …]]

Returns

The cost, and the optimal execution order (as indices).
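
One way to picture the search is a brute-force pass over all orders. The sketch below is an illustration under stated assumptions, not PyKEEN's code: both function names are hypothetical, the cost model (total number of elements over all intermediate results) is assumed, all shapes are assumed to have equal rank, and the result is returned as (cost, order) to match the return type.

```python
import itertools
import math
from typing import Sequence, Tuple

Shape = Tuple[int, ...]


def cost_of_sequence_sketch(shapes: Sequence[Shape]) -> int:
    """Assumed cost model: total number of elements over all intermediate results."""
    current = shapes[0]
    cost = 0
    for shape in shapes[1:]:
        # Broadcast the running result with the next operand (equal ranks assumed).
        current = tuple(max(a, b) for a, b in zip(current, shape))
        cost += math.prod(current)
    return cost


def optimal_sequence_sketch(*shapes: Shape) -> Tuple[int, Tuple[int, ...]]:
    """Try every order and return the cheapest one as (cost, order)."""
    return min(
        (cost_of_sequence_sketch([shapes[i] for i in order]), order)
        for order in itertools.permutations(range(len(shapes)))
    )


print(optimal_sequence_sketch((10, 1, 1), (1, 20, 1), (1, 1, 30)))  # (6200, (0, 1, 2))
```

Combining the two smallest operands first keeps the intermediate result small. The exhaustive search grows factorially with the number of shapes, so it is only practical for a handful of tensors.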

get_until_first_blank(s)

Recapitulate all lines in the string until the first blank line.

Return type

str

invert_mapping(mapping)

Invert a mapping.

Parameters

mapping (Mapping[~K, ~V]) – The mapping, key -> value.

Return type

Mapping[~V, ~K]

Returns

The inverse mapping, value -> key.

Raises

ValueError – if the mapping is not bijective
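
A minimal sketch of the inversion, including the check for duplicate values, is shown below. invert_mapping_sketch is a hypothetical stand-in for illustration.

```python
from typing import Dict, Hashable, Mapping


def invert_mapping_sketch(mapping: Mapping[Hashable, Hashable]) -> Dict[Hashable, Hashable]:
    """Swap keys and values, refusing mappings with duplicate values."""
    inverse = {value: key for key, value in mapping.items()}
    # Duplicate values collapse into one key, which shrinks the inverse.
    if len(inverse) != len(mapping):
        raise ValueError("Mapping is not bijective")
    return inverse


print(invert_mapping_sketch({"a": 0, "b": 1}))  # {0: 'a', 1: 'b'}
```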

is_cuda_oom_error(runtime_error)

Check whether the caught RuntimeError was due to CUDA being out of memory.

Return type

bool

is_cudnn_error(runtime_error)

Check whether the caught RuntimeError was due to a CUDNN error.

Return type

bool

negative_norm(x, p=2, power_norm=False)

Evaluate negative norm of a vector.

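Parameters
  • x – The vectors.

  • p – The order p of the norm.

  • power_norm – Whether to return the p-th power of the norm instead of the norm itself.

Return type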

FloatTensor

Returns

shape: (batch_size, num_heads, num_relations, num_tails) The scores.

negative_norm_of_sum(*x, p=2, power_norm=False)

Evaluate negative norm of a sum of vectors on already broadcasted representations.

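Parameters
  • x – The already broadcasted representations to sum.

  • p – The order p of the norm.

  • power_norm – Whether to return the p-th power of the norm instead of the norm itself.

Return type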

FloatTensor

Returns

shape: (batch_size, num_heads, num_relations, num_tails) The scores.

normalize_string(s, *, suffix=None)

Normalize a string for lookup.

Return type

str

project_entity(e, e_p, r_p)

Project an entity into a relation-specific space.

\[e_{\bot} = M_{re} e = (r_p e_p^T + I^{d_r \times d_e}) e = r_p e_p^T e + I^{d_r \times d_e} e = r_p (e_p^T e) + e'\]

and additionally enforces

\[\|e_{\bot}\|_2 \leq 1\]

Parameters
  • e (FloatTensor) – shape: (…, d_e) The entity embedding.

  • e_p (FloatTensor) – shape: (…, d_e) The entity projection.

  • r_p (FloatTensor) – shape: (…, d_r) The relation projection.

Return type

FloatTensor

Returns

shape: (…, d_r)
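
For a single entity, the projection formula can be traced with plain lists. This is a sketch under stated assumptions, not PyKEEN's implementation: project_entity_sketch is a hypothetical helper, e' is taken to be e truncated or zero-padded to d_r entries (the effect of multiplying by the rectangular identity), and the norm constraint is enforced by rescaling.

```python
import math
from typing import List


def project_entity_sketch(e: List[float], e_p: List[float], r_p: List[float]) -> List[float]:
    """Compute r_p * (e_p^T e) + e' and rescale the result to norm <= 1."""
    d_r = len(r_p)
    dot = sum(a * b for a, b in zip(e_p, e))
    # Assumption: e' is e truncated or zero-padded to d_r entries.
    e_prime = (list(e) + [0.0] * d_r)[:d_r]
    projected = [r * dot + v for r, v in zip(r_p, e_prime)]
    norm = math.sqrt(sum(v * v for v in projected))
    # Rescale only if the vector lies outside the unit ball.
    scale = min(1.0, 1.0 / max(norm, 1e-8))
    return [v * scale for v in projected]


inside = project_entity_sketch([0.2, 0.0], [1.0, 0.0], [0.5, 0.5, 0.0])
outside = project_entity_sketch([1.0, 0.0], [0.5, 0.5], [0.1, 0.2, 0.3])
```

In the first call the projected vector already has norm below 1 and is returned unscaled; in the second it is rescaled onto the unit sphere.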

random_non_negative_int()

Generate a random non-negative integer.

Return type

int

resolve_device(device=None)

Resolve a torch.device given a desired device (string).

Return type

device

set_random_seed(seed)

Set the random seed on numpy, torch, and python.

Parameters

seed (int) – The seed that will be used in np.random.seed(), torch.manual_seed(), and random.seed().

Return type

Tuple[None, Generator, None]

Returns

A three-tuple of None, the torch generator, and None.

split_complex(x)

Split a complex tensor into real and imaginary part.

Return type

Tuple[FloatTensor, FloatTensor]

split_list_in_batches_iter(input_list, batch_size)

Split a list of instances into batches of size batch_size.

Return type

Iterable[List[~X]]
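
Batching by slicing can be sketched as a generator. split_in_batches_sketch is a hypothetical stand-in; it assumes that the last batch may be shorter when the list length is not a multiple of batch_size.

```python
from typing import Iterator, List, TypeVar

X = TypeVar("X")


def split_in_batches_sketch(input_list: List[X], batch_size: int) -> Iterator[List[X]]:
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(input_list), batch_size):
        yield input_list[start:start + batch_size]


print(list(split_in_batches_sketch([1, 2, 3, 4, 5], batch_size=2)))  # [[1, 2], [3, 4], [5]]
```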

strip_dim(*tensors, n=4)

Strip the first dimensions.

Parameters
  • tensors (FloatTensor) – The tensors whose first n dimensions should be independently stripped

  • n (int) – The number of initial dimensions to strip

Return type

Sequence[FloatTensor]

Returns

A tuple of the reduced tensors

tensor_product(*tensors)

Compute element-wise product of tensors in broadcastable shape.

Return type

FloatTensor

tensor_sum(*tensors)

Compute element-wise sum of tensors in broadcastable shape.

Return type

FloatTensor

torch_is_in_1d(query_tensor, test_tensor, max_id=None, invert=False)

Return a boolean mask with Q[i] in T.

The method guarantees memory complexity of max(size(Q), size(T)) and is thus, memory-wise, superior to naive broadcasting.

Parameters
  • query_tensor (LongTensor) – shape: S The query Q.

  • test_tensor (Union[Collection[int], LongTensor]) – The test set T.

  • max_id (Optional[int]) – A maximum ID. If not given, will be inferred.

  • invert (bool) – Whether to invert the result.

Return type

BoolTensor

Returns

shape: S A boolean mask.
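
One way to avoid the size(Q) x size(T) comparison matrix of naive broadcasting is a boolean lookup table indexed by ID. The sketch below works on flat Python lists and is an illustration, not the library's code: is_in_1d_sketch is a hypothetical name, IDs are assumed to be non-negative integers, and the table has max_id + 1 entries.

```python
from typing import Collection, List, Optional


def is_in_1d_sketch(
    query: List[int],
    test: Collection[int],
    max_id: Optional[int] = None,
    invert: bool = False,
) -> List[bool]:
    """Return a mask with mask[i] == (query[i] in test), optionally inverted."""
    if max_id is None:
        # Infer the largest ID from both inputs; -1 yields an empty table.
        max_id = max(list(query) + list(test), default=-1)
    # lookup[i] is True if and only if ID i is in the test set.
    lookup = [False] * (max_id + 1)
    for identifier in test:
        lookup[identifier] = True
    return [lookup[q] != invert for q in query]


print(is_in_1d_sketch([0, 3, 5, 3], [3, 4]))  # [False, True, False, True]
```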

unpack_singletons(*xs)

Unpack sequences of length one.

Parameters

xs (Tuple[~X]) – A sequence of tuples of length 1 or more

Return type

Sequence[Union[~X, Tuple[~X]]]

Returns

An unpacked sequence of sequences

>>> unpack_singletons((1,), (1, 2), (1, 2, 3))
(1, (1, 2), (1, 2, 3))

upgrade_to_sequence(x)

Ensure that the input is a sequence.

Parameters

x (Union[~X, Sequence[~X]]) – A literal or sequence of literals

Return type

Sequence[~X]

Returns

If a literal was given, a one-element tuple containing it. Otherwise, the given value.

>>> upgrade_to_sequence(1)
(1,)
>>> upgrade_to_sequence((1, 2, 3))
(1, 2, 3)

view_complex(x)

Convert a PyKEEN complex tensor representation into a torch one.

Return type

Tensor