Utilities¶

Utilities for PyKEEN.

class NoRandomSeedNecessary[source]¶

Used in the pipeline when the random seed is set automatically.

class Result[source]¶

A superclass of results that can be saved to a directory.

abstract save_to_directory(directory, **kwargs)[source]¶

Save the results to the directory.

Return type: None

abstract save_to_ftp(directory, ftp)[source]¶

Save the results to the directory in an FTP server.

Return type: None

abstract save_to_s3(directory, bucket, s3=None)[source]¶

Save all artifacts to the given directory in an S3 Bucket.

Parameters

directory (str) – The directory in the S3 bucket
bucket (str) – The name of the S3 bucket
s3 – A client from boto3.client(), if already instantiated

Return type

None

all_in_bounds(x, low=None, high=None, a_tol=0.0)[source]¶

Check if tensor values respect lower and upper bound.

Parameters

x (Tensor) – The tensor.
low (Optional[float]) – The lower bound.
high (Optional[float]) – The upper bound.
a_tol (float) – Absolute tolerance.

Return type

bool

Returns

If all values are within the given bounds
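
The check can be illustrated without PyTorch. The following is a minimal pure-Python sketch, not PyKEEN's implementation: the helper all_in_bounds_sketch is hypothetical, it works on a flat list of floats instead of a tensor, and it assumes that a_tol widens the accepted interval on both sides.

```python
from typing import Optional, Sequence


def all_in_bounds_sketch(
    x: Sequence[float],
    low: Optional[float] = None,
    high: Optional[float] = None,
    a_tol: float = 0.0,
) -> bool:
    """Hypothetical stand-in for all_in_bounds on a flat list of floats."""
    # Assumption: the tolerance relaxes both bounds, so the accepted interval
    # is [low - a_tol, high + a_tol]. A bound of None is not checked at all.
    if low is not None and any(v < low - a_tol for v in x):
        return False
    if high is not None and any(v > high + a_tol for v in x):
        return False
    return True


print(all_in_bounds_sketch([0.0, 0.5, 1.0], low=0.0, high=1.0))         # True
print(all_in_bounds_sketch([0.0, 1.05], low=0.0, high=1.0))             # False
print(all_in_bounds_sketch([0.0, 1.05], low=0.0, high=1.0, a_tol=0.1))  # True
```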

broadcast_cat(x, y, dim)[source]¶

Concatenate with broadcasting.

Parameters

x (FloatTensor) – The first tensor.
y (FloatTensor) – The second tensor.
dim (int) – The concat dimension.

Return type

FloatTensor

Returns

A concatenated, broadcasted tensor.

Raises

ValueError – if x and y do not have the same number of dimensions
ValueError – if broadcasting is not possible
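
To make the two error conditions concrete, the sketch below computes only the shape that a broadcasting concatenation would produce. It is an illustration under assumptions, not PyKEEN's code: broadcast_cat_shape_sketch is a hypothetical helper, it reads the first error as a mismatch in the number of dimensions, and it assumes that every dimension other than dim follows the usual rule that sizes must be equal or one of them must be 1.

```python
from typing import Tuple


def broadcast_cat_shape_sketch(
    x_shape: Tuple[int, ...], y_shape: Tuple[int, ...], dim: int
) -> Tuple[int, ...]:
    """Hypothetical helper: result shape of a broadcasting concatenation."""
    if len(x_shape) != len(y_shape):
        raise ValueError("x and y must have the same number of dimensions")
    if dim < 0:
        dim += len(x_shape)  # support negative dimension indices
    result = []
    for d, (xd, yd) in enumerate(zip(x_shape, y_shape)):
        if d == dim:
            result.append(xd + yd)  # sizes add up along the concat dimension
        elif xd == yd or yd == 1:
            result.append(xd)       # y is repeated to match x (or sizes agree)
        elif xd == 1:
            result.append(yd)       # x is repeated to match y
        else:
            raise ValueError(f"cannot broadcast sizes {xd} and {yd} in dimension {d}")
    return tuple(result)


print(broadcast_cat_shape_sketch((2, 1, 3), (1, 4, 5), dim=-1))  # (2, 4, 8)
```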

calculate_broadcasted_elementwise_result_shape(first, second)[source]¶

Determine the return shape of a broadcasted elementwise operation.

Return type: Tuple[int, …]
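
A minimal sketch of the idea, assuming both shapes have the same number of dimensions and are already broadcastable (each pair of sizes is equal or contains a 1). The name broadcasted_shape_sketch is hypothetical.

```python
from typing import Tuple


def broadcasted_shape_sketch(
    first: Tuple[int, ...], second: Tuple[int, ...]
) -> Tuple[int, ...]:
    # Under the stated assumption, the larger size wins in every dimension.
    return tuple(max(a, b) for a, b in zip(first, second))


print(broadcasted_shape_sketch((2, 1, 5), (1, 3, 5)))  # (2, 3, 5)
```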

check_shapes(*x, raise_on_errors=True)[source]¶

Verify that the given tensors have matching shapes.

Parameters

x (Tuple[Union[Tensor, Tuple[int, …]], str]) – A tuple (t, s), where t is a tensor, or an actual shape of a tensor (a tuple of integers), and s is a string, where each character corresponds to a (named) dimension. If the shapes of different tensors share a character, the corresponding dimensions are expected to be of equal size.
raise_on_errors (bool) – Whether to raise an exception in case of a mismatch.

Return type

bool

Returns

Whether the shapes matched.

Raises

ValueError – If the shapes mismatch and raise_on_errors is True.

Examples

>>> check_shapes(((10, 20), "bd"), ((10, 20, 20), "bdd"))
True
>>> check_shapes(((10, 20), "bd"), ((10, 30, 20), "bdd"), raise_on_errors=False)
False
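
The naming scheme can be illustrated with a small pure-Python sketch that accepts shape tuples only. It is not PyKEEN's implementation; check_shapes_sketch is a hypothetical name, and the error raised for a specification of the wrong length is an assumption.

```python
from typing import Dict, Set, Tuple


def check_shapes_sketch(
    *x: Tuple[Tuple[int, ...], str], raise_on_errors: bool = True
) -> bool:
    """Hypothetical re-implementation that accepts shape tuples only."""
    sizes: Dict[str, Set[int]] = {}
    for shape, names in x:
        if len(shape) != len(names):
            raise ValueError(f"shape {shape} does not match specification {names!r}")
        # Record every size observed for each named dimension.
        for size, name in zip(shape, names):
            sizes.setdefault(name, set()).add(size)
    # A named dimension is inconsistent if it was seen with more than one size.
    errors = {name: seen for name, seen in sizes.items() if len(seen) > 1}
    if errors and raise_on_errors:
        raise ValueError(f"shape verification failed: {errors}")
    return not errors


print(check_shapes_sketch(((10, 20), "bd"), ((10, 20, 20), "bdd")))  # True
print(check_shapes_sketch(((10, 20), "bd"), ((10, 30, 20), "bdd"),
                          raise_on_errors=False))                    # False
```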

clamp_norm(x, maxnorm, p='fro', dim=None, eps=1e-08)[source]¶

Ensure that a tensor’s norm does not exceed a given threshold.

Parameters

x (Tensor) – The vector.
maxnorm (float) – The maximum norm (>0).
p (Union[str, int]) – The norm type.
dim (Union[None, int, Iterable[int]]) – The dimension(s).
eps (float) – A small value to avoid division by zero.

Return type

Tensor

Returns

A vector with \(\|x\| \leq maxnorm\).
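
For a single vector, the operation amounts to rescaling only when the norm is too large. The sketch below is a pure-Python illustration under assumptions: clamp_norm_sketch is a hypothetical helper, it supports only a numeric p on a flat list, and it assumes eps is used as a lower bound on the divisor.

```python
import math
from typing import List


def clamp_norm_sketch(
    x: List[float], maxnorm: float, p: float = 2.0, eps: float = 1e-8
) -> List[float]:
    """Hypothetical stand-in for clamp_norm on one vector."""
    norm = sum(abs(v) ** p for v in x) ** (1.0 / p)
    if norm <= maxnorm:
        return list(x)  # already within the bound, leave unchanged
    # Assumption: eps guards the division against a vanishing norm.
    scale = maxnorm / max(norm, eps)
    return [v * scale for v in x]


clamped = clamp_norm_sketch([3.0, 4.0], maxnorm=1.0)
print(clamped)
print(math.sqrt(sum(v * v for v in clamped)))  # approximately 1.0
```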

combine_complex(x_re, x_im)[source]¶

Combine real and imaginary parts into a complex tensor.

Return type: FloatTensor

compact_mapping(mapping)[source]¶

Update a mapping (key -> id) such that the IDs range from 0 to len(mapping) - 1.

Parameters: mapping (Mapping[~X, int]) – The mapping to compact.
Return type: Tuple[Mapping[~X, int], Mapping[int, int]]
Returns: A pair (translated, translation) where translated is the updated mapping, and translation a dictionary from old to new ids.
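
A minimal sketch of the compaction, under two assumptions that the description above does not settle: new IDs are assigned in ascending order of the old IDs, and the translation lists every old ID, including those that do not change. compact_mapping_sketch is a hypothetical name.

```python
from typing import Dict, Hashable, Mapping, Tuple


def compact_mapping_sketch(
    mapping: Mapping[Hashable, int],
) -> Tuple[Dict[Hashable, int], Dict[int, int]]:
    # Assumption: new IDs follow the ascending order of the old IDs.
    translation = {old: new for new, old in enumerate(sorted(set(mapping.values())))}
    translated = {key: translation[old] for key, old in mapping.items()}
    return translated, translation


translated, translation = compact_mapping_sketch({"a": 3, "b": 7, "c": 10})
print(translated)   # {'a': 0, 'b': 1, 'c': 2}
print(translation)  # {3: 0, 7: 1, 10: 2}
```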

class compose(*operations)[source]¶

A class representing the composition of several functions.

Initialize the composition with a sequence of operations.

Parameters: operations (Callable[[~X], ~X]) – unary operations that will be applied in succession
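
The idea can be sketched in a few lines. ComposeSketch is a hypothetical stand-in, and it assumes that "in succession" means the first operation passed is the first one applied.

```python
from functools import reduce
from typing import Callable, TypeVar

X = TypeVar("X")


class ComposeSketch:
    """Hypothetical stand-in for a composition of unary operations."""

    def __init__(self, *operations: Callable[[X], X]) -> None:
        self.operations = operations

    def __call__(self, x: X) -> X:
        # Assumption: operations run left to right, each on the previous result.
        return reduce(lambda value, op: op(value), self.operations, x)


strip_then_lower = ComposeSketch(str.strip, str.lower)
print(strip_then_lower("  TransE "))  # transe
```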

convert_to_canonical_shape(x, dim, num=None, batch_size=1, suffix_shape=-1)[source]¶

Convert a tensor to canonical shape.

Parameters

x (FloatTensor) – The tensor in compatible shape.
dim (Union[int, str]) – The “num” dimension.
batch_size (int) – The batch size.
num (Optional[int]) – The number.
suffix_shape (Union[int, Sequence[int]]) – The suffix shape.

Return type

FloatTensor

Returns

shape: (batch_size, num_heads, num_relations, num_tails, *) A tensor in canonical shape.

ensure_ftp_directory(*, ftp, directory)[source]¶

Ensure the directory exists on the FTP server.

Return type: None

ensure_torch_random_state(random_state)[source]¶

Prepare a random state for PyTorch.

Return type: Generator

ensure_tuple(*x)[source]¶

Ensure that all elements in the sequence are upgraded to sequences.

Parameters: x (Union[~X, Sequence[~X]]) – A sequence of sequences or literals
Return type: Sequence[Sequence[~X]]
Returns: An upgraded sequence of sequences

>>> ensure_tuple(1, (1,), (1, 2))
((1,), (1,), (1, 2))

estimate_cost_of_sequence(shape, *other_shapes)[source]¶

Estimate the cost of a sequence of broadcasted element-wise operations on tensors, given their shapes.

Return type: int

extend_batch(batch, all_ids, dim)[source]¶

Extend batch for 1-to-all scoring by explicit enumeration.

Parameters

batch (LongTensor) – shape: (batch_size, 2) The batch.
all_ids (List[int]) – len: num_choices The IDs to enumerate.
dim (int) – in {0,1,2} The column along which to insert the enumerated IDs.

Return type

LongTensor

Returns

shape: (batch_size * num_choices, 3) A large batch, where every pair from the original batch is combined with every ID.
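
The enumeration can be sketched with plain lists of tuples. This is an illustration, not PyKEEN's tensor-based code: extend_batch_sketch is a hypothetical helper, and the row order (all IDs for the first pair, then all IDs for the second pair, and so on) is an assumption.

```python
from typing import List, Sequence, Tuple


def extend_batch_sketch(
    batch: Sequence[Tuple[int, int]], all_ids: Sequence[int], dim: int
) -> List[Tuple[int, ...]]:
    """Hypothetical stand-in for extend_batch on lists of pairs."""
    if dim not in {0, 1, 2}:
        raise ValueError("dim must be 0, 1, or 2")
    extended = []
    for pair in batch:
        for new_id in all_ids:
            row = list(pair)
            row.insert(dim, new_id)  # place the enumerated ID in column `dim`
            extended.append(tuple(row))
    return extended


rows = extend_batch_sketch([(0, 1), (2, 3)], all_ids=[7, 8, 9], dim=2)
print(len(rows))  # 6
print(rows[:3])   # [(0, 1, 7), (0, 1, 8), (0, 1, 9)]
```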

extended_einsum(eq, *tensors)[source]¶

Drop dimensions of size 1 to allow broadcasting.

Return type: FloatTensor

fix_dataclass_init_docs(cls)[source]¶

Fix the __init__ documentation for a dataclasses.dataclass.

Parameters: cls (Type) – The class whose docstring needs fixing
Return type: Type
Returns: The class that was passed so this function can be used as a decorator

get_json_bytes_io(obj)[source]¶

Get the JSON as bytes.

Return type: BytesIO

get_model_io(model)[source]¶

Get the model as bytes.

Return type: BytesIO

get_optimal_sequence(*shapes)[source]¶

Find the optimal sequence in which to combine tensors elementwise based on the shapes.

Parameters: shapes (Tuple[int, …]) – The shapes of the tensors to combine.
Return type: Tuple[int, Tuple[int, …]]
Returns: The optimal execution order (as indices), and the cost.
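
Why the order matters can be shown with a brute-force sketch. Everything here is an assumption made for illustration: the cost model (tensors are combined left to right and the cost is the total number of elements over all intermediate results), the requirement that all shapes have the same number of dimensions, and the hypothetical names estimate_cost_sketch and get_optimal_sequence_sketch. The result is ordered as in the declared return type, cost first and indices second.

```python
from itertools import permutations
from math import prod
from typing import Sequence, Tuple

Shape = Tuple[int, ...]


def estimate_cost_sketch(shapes: Sequence[Shape]) -> int:
    # Assumed cost model: add up the sizes of all intermediate results.
    current, cost = shapes[0], 0
    for shape in shapes[1:]:
        current = tuple(max(a, b) for a, b in zip(current, shape))
        cost += prod(current)
    return cost


def get_optimal_sequence_sketch(*shapes: Shape) -> Tuple[int, Tuple[int, ...]]:
    # Brute force over all orders, which is feasible for a handful of tensors.
    return min(
        (estimate_cost_sketch([shapes[i] for i in order]), order)
        for order in permutations(range(len(shapes)))
    )


cost, order = get_optimal_sequence_sketch((100, 1), (1, 100), (100, 1))
print(cost, order)
```

In this example, combining the two (100, 1) tensors first keeps the first intermediate result small, so the large (100, 100) result is materialized only once.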

get_until_first_blank(s)[source]¶

Return all lines in the string up to the first blank line.

Return type: str

invert_mapping(mapping)[source]¶

Invert a mapping.

Parameters: mapping (Mapping[~K, ~V]) – The mapping, key -> value.
Return type: Mapping[~V, ~K]
Returns: The inverse mapping, value -> key.
Raises: ValueError – if the mapping is not bijective
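
A minimal sketch of the inversion and its bijectivity check. invert_mapping_sketch is a hypothetical name, and the way the check is performed here is one possible choice.

```python
from typing import Dict, Hashable, Mapping


def invert_mapping_sketch(mapping: Mapping[Hashable, Hashable]) -> Dict[Hashable, Hashable]:
    inverse = {value: key for key, value in mapping.items()}
    # If two keys share a value, the inverse has fewer entries than the input.
    if len(inverse) != len(mapping):
        raise ValueError("the mapping is not bijective")
    return inverse


print(invert_mapping_sketch({"head": 0, "tail": 1}))  # {0: 'head', 1: 'tail'}
```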

is_cuda_oom_error(runtime_error)[source]¶

Check whether the caught RuntimeError was due to CUDA being out of memory.

Return type: bool

is_cudnn_error(runtime_error)[source]¶

Check whether the caught RuntimeError was due to a CUDNN error.

Return type: bool

negative_norm(x, p=2, power_norm=False)[source]¶

Evaluate negative norm of a vector.

Parameters

x (FloatTensor) – shape: (batch_size, num_heads, num_relations, num_tails, dim) The vectors.
p (Union[str, int, float]) – The p for the norm. cf. torch.norm.
power_norm (bool) – Whether to use the p-th power of the norm, \(\|x\|_p^p\), instead of the norm \(\|x\|_p\), cf. https://github.com/pytorch/pytorch/issues/28119

Return type

FloatTensor

Returns

shape: (batch_size, num_heads, num_relations, num_tails) The scores.
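
For a single vector, the score can be sketched as follows. This is a pure-Python illustration, not PyKEEN's code: negative_norm_sketch is hypothetical, it handles one vector (the last dimension) and a finite numeric p only, and it assumes that power_norm simply skips the final root.

```python
from typing import Sequence


def negative_norm_sketch(
    x: Sequence[float], p: float = 2, power_norm: bool = False
) -> float:
    """Hypothetical stand-in for negative_norm on one vector."""
    powered = sum(abs(v) ** p for v in x)
    if power_norm:
        return -powered             # negative p-th power of the p-norm
    return -(powered ** (1.0 / p))  # negative p-norm


print(negative_norm_sketch([3.0, 4.0]))                   # -5.0
print(negative_norm_sketch([3.0, 4.0], power_norm=True))  # -25.0
```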

negative_norm_of_sum(*x, p=2, power_norm=False)[source]¶

Evaluate negative norm of a sum of vectors on already broadcasted representations.

Parameters

x (FloatTensor) – shape: (batch_size, num_heads, num_relations, num_tails, dim) The representations.
p (Union[str, int, float]) – The p for the norm. cf. torch.norm.
power_norm (bool) – Whether to use the p-th power of the norm instead of the norm itself, cf. https://github.com/pytorch/pytorch/issues/28119

Return type

FloatTensor

Returns

shape: (batch_size, num_heads, num_relations, num_tails) The scores.

normalize_string(s, *, suffix=None)[source]¶

Normalize a string for lookup.

Return type: str

project_entity(e, e_p, r_p)[source]¶

Project an entity embedding in a relation-specific way.

\[e_{\bot} = M_{re} e = (r_p e_p^T + I^{d_r \times d_e}) e = r_p e_p^T e + I^{d_r \times d_e} e = r_p (e_p^T e) + e'\]

and additionally enforces

\[\|e_{\bot}\|_2 \leq 1\]

Parameters

e (FloatTensor) – shape: (…, d_e) The entity embedding.
e_p (FloatTensor) – shape: (…, d_e) The entity projection.
r_p (FloatTensor) – shape: (…, d_r) The relation projection.

Return type

FloatTensor

Returns

shape: (…, d_r)
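
The formula can be traced for a single entity with plain lists. The sketch rests on assumptions: project_entity_sketch is a hypothetical helper, \(e'\) is read as \(I^{d_r \times d_e} e\) (that is, e truncated or zero-padded to d_r entries), and the norm constraint is assumed to be enforced by rescaling.

```python
import math
from typing import List, Sequence


def project_entity_sketch(
    e: Sequence[float], e_p: Sequence[float], r_p: Sequence[float]
) -> List[float]:
    """Hypothetical stand-in for project_entity on one entity."""
    d_e, d_r = len(e), len(r_p)
    # e' = I^{d_r x d_e} e: truncate e to d_r entries or pad it with zeros.
    e_prime = [e[i] if i < d_e else 0.0 for i in range(d_r)]
    # r_p (e_p^T e) + e'
    dot = sum(a * b for a, b in zip(e_p, e))
    projected = [r * dot + ep for r, ep in zip(r_p, e_prime)]
    # Assumption: ||e_bot||_2 <= 1 is enforced by rescaling when necessary.
    norm = math.sqrt(sum(v * v for v in projected))
    if norm > 1.0:
        projected = [v / norm for v in projected]
    return projected


result = project_entity_sketch(e=[1.0, 0.0], e_p=[1.0, 1.0], r_p=[0.5, 0.5, 0.5])
print(len(result))  # 3, i.e. d_r
print(result)
```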

random_non_negative_int()[source]¶

Generate a random non-negative integer.

Return type: int

resolve_device(device=None)[source]¶

Resolve a torch.device given a desired device (string).

Return type: device

set_random_seed(seed)[source]¶

Set the random seed on numpy, torch, and python.

Parameters: seed (int) – The seed that will be used in np.random.seed(), torch.manual_seed(), and random.seed().
Return type: Tuple[None, Generator, None]
Returns: A 3-tuple of None, the torch generator, and None.

split_complex(x)[source]¶

Split a complex tensor into its real and imaginary parts.

Return type: Tuple[FloatTensor, FloatTensor]

split_list_in_batches_iter(input_list, batch_size)[source]¶

Split a list of instances into batches of size batch_size.

Return type: Iterable[List[~X]]
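
A minimal generator sketch of the batching. split_list_in_batches_sketch is a hypothetical name, and it assumes the last batch may be smaller than batch_size when the list length is not a multiple of it.

```python
from typing import Iterable, List, TypeVar

X = TypeVar("X")


def split_list_in_batches_sketch(input_list: List[X], batch_size: int) -> Iterable[List[X]]:
    # Step through the list in strides of batch_size and yield each slice.
    for start in range(0, len(input_list), batch_size):
        yield input_list[start : start + batch_size]


print(list(split_list_in_batches_sketch([1, 2, 3, 4, 5], batch_size=2)))
# [[1, 2], [3, 4], [5]]
```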

strip_dim(*tensors, n=4)[source]¶

Strip the first n dimensions.

Parameters

tensors (FloatTensor) – The tensors whose first n dimensions should be independently stripped
n (int) – The number of initial dimensions to strip

Return type

Sequence[FloatTensor]

Returns

A tuple of the reduced tensors

tensor_product(*tensors)[source]¶

Compute element-wise product of tensors in broadcastable shape.

Return type: FloatTensor

tensor_sum(*tensors)[source]¶

Compute element-wise sum of tensors in broadcastable shape.

Return type: FloatTensor

torch_is_in_1d(query_tensor, test_tensor, max_id=None, invert=False)[source]¶

Return a boolean mask with Q[i] in T.

The method guarantees memory complexity of max(size(Q), size(T)) and is thus, memory-wise, superior to naive broadcasting.

Parameters

query_tensor (LongTensor) – shape: S The query Q.
test_tensor (Union[Collection[int], LongTensor]) – The test set T.
max_id (Optional[int]) – A maximum ID. If not given, will be inferred.
invert (bool) – Whether to invert the result.

Return type

BoolTensor

Returns

shape: S A boolean mask.
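
One way to avoid the size(Q) × size(T) comparison matrix of naive broadcasting is a lookup table indexed by ID. The sketch below illustrates that idea on plain lists and is not PyKEEN's implementation: is_in_1d_sketch is hypothetical, it assumes all IDs are non-negative integers smaller than max_id, and its memory grows with max_id plus the query size.

```python
from typing import Collection, List, Optional, Sequence


def is_in_1d_sketch(
    query: Sequence[int],
    test: Collection[int],
    max_id: Optional[int] = None,
    invert: bool = False,
) -> List[bool]:
    """Hypothetical stand-in for torch_is_in_1d on lists of IDs."""
    if max_id is None:
        # Infer an exclusive upper bound from the largest ID seen.
        max_id = max(max(query, default=-1), max(test, default=-1)) + 1
    lookup = [False] * max_id
    for i in test:
        lookup[i] = True  # mark every ID contained in the test set
    # `!= invert` flips the membership result when invert is True.
    return [lookup[q] != invert for q in query]


print(is_in_1d_sketch([0, 3, 5, 3], {3, 4}))               # [False, True, False, True]
print(is_in_1d_sketch([0, 3, 5, 3], {3, 4}, invert=True))  # [True, False, True, False]
```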

unpack_singletons(*xs)[source]¶

Unpack sequences of length one.

Parameters: xs (Tuple[~X]) – A sequence of tuples of length 1 or more
Return type: Sequence[Union[~X, Tuple[~X]]]
Returns: An unpacked sequence of sequences

>>> unpack_singletons((1,), (1, 2), (1, 2, 3))
(1, (1, 2), (1, 2, 3))

upgrade_to_sequence(x)[source]¶

Ensure that the input is a sequence.

Parameters: x (Union[~X, Sequence[~X]]) – A literal or sequence of literals
Return type: Sequence[~X]
Returns: If a literal was given, a one-element tuple containing it. Otherwise, the given value.

>>> upgrade_to_sequence(1)
(1,)
>>> upgrade_to_sequence((1, 2, 3))
(1, 2, 3)

view_complex(x)[source]¶

Convert a PyKEEN complex tensor representation into a torch one.

Return type: Tensor