Utilities
Utilities for PyKEEN.
- class Bias(dim)[source]
A module wrapper for adding a bias.
Initialize the module.
- Parameters
dim (int) – The dimension of the input; must be positive.
- class Result[source]
A superclass of results that can be saved to a directory.
- abstract save_to_directory(directory, **kwargs)[source]
Save the results to the directory.
- all_in_bounds(x, low=None, high=None, a_tol=0.0)[source]
Check if tensor values respect lower and upper bound.
- broadcast_upgrade_to_sequences(*xs)[source]
Apply upgrade_to_sequence to each input, and afterwards repeat singletons to match the maximum length.
- Returns
a sequence of length m, where each element is a sequence and all elements have the same length.
- Raises
ValueError – if there is a non-singleton sequence input with length different from the maximum sequence length.
>>> broadcast_upgrade_to_sequences(1)
((1,),)
>>> broadcast_upgrade_to_sequences(1, 2)
((1,), (2,))
>>> broadcast_upgrade_to_sequences(1, (2, 3))
((1, 1), (2, 3))
- calculate_broadcasted_elementwise_result_shape(first, second)[source]
Determine the return shape of a broadcasted elementwise operation.
- check_shapes(*x, raise_on_errors=True)[source]
Verify that the given tensors have matching shapes.
- Parameters
x (Tuple[Union[Tensor, Tuple[int, ...]], str]) – A tuple (t, s), where t is a tensor, or an actual shape of a tensor (a tuple of integers), and s is a string, where each character corresponds to a (named) dimension. If the shapes of different tensors share a character, the corresponding dimensions are expected to be of equal size.
raise_on_errors (bool) – Whether to raise an exception in case of a mismatch.
- Return type
bool
- Returns
Whether the shapes matched.
- Raises
ValueError – If the shapes mismatch and raise_on_errors is True.
Examples:
>>> check_shapes(((10, 20), "bd"), ((10, 20, 20), "bdd"))
True
>>> check_shapes(((10, 20), "bd"), ((10, 30, 20), "bdd"), raise_on_errors=False)
False
- clamp_norm(x, maxnorm, p='fro', dim=None)[source]
Ensure that a tensor’s norm does not exceed some threshold.
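To illustrate the clamping behaviour, a pure-Python sketch over flat lists (the actual function operates on torch tensors and additionally supports `p='fro'` and a `dim` argument):

```python
import math

def clamp_norm(x, maxnorm, p=2.0):
    """Scale a vector down so that its p-norm does not exceed ``maxnorm``.

    Pure-Python sketch of the documented behaviour; vectors whose norm is
    already within the threshold are returned unchanged.
    """
    norm = sum(abs(v) ** p for v in x) ** (1.0 / p)
    if norm <= maxnorm:
        return list(x)
    return [v * maxnorm / norm for v in x]

print(clamp_norm([3.0, 4.0], maxnorm=1.0))  # → [0.6, 0.8]
```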
- combine_complex(x_re, x_im)[source]
Combine a complex tensor from real and imaginary parts.
- Return type
FloatTensor
- compact_mapping(mapping)[source]
Update a mapping (key -> id) such that the IDs range from 0 to len(mapping) - 1.
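A plain-Python sketch of the documented behaviour; the exact return value (here a pair of compacted mapping and old-to-new ID translation) is an assumption:

```python
def compact_mapping(mapping):
    """Re-assign IDs so that they form the contiguous range 0 .. len(mapping) - 1.

    Returns the compacted mapping and the old-ID -> new-ID translation
    (the returned pair is an assumption for illustration).
    """
    translation = {old: new for new, old in enumerate(sorted(mapping.values()))}
    return {key: translation[old] for key, old in mapping.items()}, translation

compacted, translation = compact_mapping({"a": 3, "b": 7, "c": 10})
print(compacted)  # {'a': 0, 'b': 1, 'c': 2}
```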
- complex_normalize(x)[source]
Normalize a vector of complex numbers such that each element is of unit-length.
Let \(x \in \mathbb{C}^d\) denote a complex vector. Then, the operation computes
\[x_i' = \frac{x_i}{|x_i|}\]
where \(|x_i| = \sqrt{Re(x_i)^2 + Im(x_i)^2}\) is the modulus of the complex number \(x_i\).
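The formula above can be sketched with Python's built-in complex numbers (PyKEEN applies it to complex-valued tensors instead):

```python
def complex_normalize(xs):
    """Scale each complex entry to unit modulus, per the formula above."""
    return [x / abs(x) for x in xs]  # abs(x) is the modulus sqrt(Re^2 + Im^2)

out = complex_normalize([3 + 4j, 0 + 2j])
print([abs(z) for z in out])  # all moduli are 1.0
```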
- class compose(*operations, name)[source]
A class representing the composition of several functions.
Initialize the composition with a sequence of operations.
- compute_box(base, delta, size)[source]
Compute the lower and upper corners of a resulting box.
- Parameters
base (FloatTensor) – shape: (*, d) The base position (box center) of the input relation embeddings.
delta (FloatTensor) – shape: (*, d) The base shape of the input relation embeddings.
size (FloatTensor) – shape: (*, d) The size scalar vectors of the input relation embeddings.
- Return type
Tuple[FloatTensor, FloatTensor]
- Returns
shape: (*, d) The lower and upper bounds of the box whose embeddings are provided as input.
- convert_to_canonical_shape(x, dim, num=None, batch_size=1, suffix_shape=-1)[source]
Convert a tensor to canonical shape.
- Return type
FloatTensor
- Returns
shape: (batch_size, num_heads, num_relations, num_tails, *) A tensor in canonical shape.
- create_relation_to_entity_set_mapping(triples)[source]
Create mappings from relation IDs to the set of their head / tail entities.
- ensure_ftp_directory(*, ftp, directory)[source]
Ensure the directory exists on the FTP server.
- ensure_torch_random_state(random_state)[source]
Prepare a random state for PyTorch.
- Return type
Generator
- ensure_tuple(*x)[source]
Ensure that all elements in the sequence are upgraded to sequences.
- Parameters
x (Union[~X, Sequence[~X]]) – A sequence of sequences or literals.
- Returns
An upgraded sequence of sequences
>>> ensure_tuple(1, (1,), (1, 2))
((1,), (1,), (1, 2))
- estimate_cost_of_sequence(shape, *other_shapes)[source]
Cost of a sequence of broadcasted element-wise operations of tensors, given their shapes.
- extend_batch(batch, all_ids, dim)[source]
Extend batch for 1-to-all scoring by explicit enumeration.
- Return type
LongTensor
- Returns
shape: (batch_size * num_choices, 3) A large batch, where every pair from the original batch is combined with every ID.
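The enumeration can be sketched in plain Python over tuples (the interpretation of `dim` as the insert position — 0 for heads, 1 for relations, 2 for tails — is an assumption; PyKEEN does this with tensor operations):

```python
def extend_batch(batch, all_ids, dim):
    """Combine every partial triple in ``batch`` with every ID (sketch).

    ``dim`` is the position at which the enumerated IDs are inserted.
    """
    out = []
    for pair in batch:
        for i in all_ids:
            triple = list(pair)
            triple.insert(dim, i)  # fill in the enumerated slot
            out.append(tuple(triple))
    return out

print(extend_batch([(0, 1)], all_ids=[0, 1, 2], dim=2))
# [(0, 1, 0), (0, 1, 1), (0, 1, 2)]
```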
- extended_einsum(eq, *tensors)[source]
Drop dimensions of size 1 to allow broadcasting.
- Return type
FloatTensor
- get_batchnorm_modules(module)[source]
Return all submodules which are batch normalization layers.
- Return type
List
[Module
]
- get_devices(module)[source]
Return the device(s) from each component of the model.
- Return type
Collection
[device
]
- get_dropout_modules(module)[source]
Return all submodules which are dropout layers.
- Return type
List
[Module
]
- get_expected_norm(p, d)[source]
Compute the expected value of the L_p norm.
\[E[\|x\|_p] = d^{1/p} E[|x_1|^p]^{1/p}\]
under the assumption that \(x_i \sim N(0, 1)\), i.e.
\[E[|x_1|^p] = 2^{p/2} \cdot \Gamma\left(\frac{p+1}{2}\right) \cdot \pi^{-1/2}\]
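The closed form can be sanity-checked numerically against samples; a sketch using only the standard library (the sample sizes are arbitrary choices):

```python
import math
import random

def expected_norm(p, d):
    """E[||x||_p] for x_i ~ N(0, 1), following the closed form above."""
    e_abs_p = 2 ** (p / 2) * math.gamma((p + 1) / 2) / math.sqrt(math.pi)
    return d ** (1 / p) * e_abs_p ** (1 / p)

# Monte-Carlo sanity check
random.seed(0)
p, d = 2.0, 100
samples = [
    sum(abs(random.gauss(0, 1)) ** p for _ in range(d)) ** (1 / p)
    for _ in range(200)
]
empirical = sum(samples) / len(samples)
print(expected_norm(p, d), empirical)  # both close to 10
```

For p = 2 the closed form reduces to \(\sqrt{d}\), since \(E[|x_1|^2] = 1\) for a standard normal.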
- get_optimal_sequence(*shapes)[source]
Find the optimal sequence in which to combine tensors elementwise based on the shapes.
- get_preferred_device(module, allow_ambiguity=True)[source]
Return the preferred device.
- Return type
device
- get_until_first_blank(s)[source]
Recapitulate all lines in the string until the first blank line.
- Return type
str
- invert_mapping(mapping)[source]
Invert a mapping.
- Parameters
mapping (Mapping[~K, ~V]) – The mapping, key -> value.
- Return type
Mapping[~V, ~K]
- Returns
The inverse mapping, value -> key.
- Raises
ValueError – if the mapping is not bijective
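A minimal sketch of the inversion, including the bijectivity check described above:

```python
def invert_mapping(mapping):
    """Invert key -> value to value -> key, rejecting non-bijective input."""
    inverse = {value: key for key, value in mapping.items()}
    if len(inverse) != len(mapping):
        # duplicate values collapsed into one key: not invertible
        raise ValueError("mapping is not bijective")
    return inverse

print(invert_mapping({"a": 0, "b": 1}))  # {0: 'a', 1: 'b'}
```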
- is_cuda_oom_error(runtime_error)[source]
Check whether the caught RuntimeError was due to CUDA being out of memory.
- Return type
bool
- is_cudnn_error(runtime_error)[source]
Check whether the caught RuntimeError was due to a CUDNN error.
- Return type
bool
- is_triple_tensor_subset(a, b)[source]
Check whether one tensor of triples is a subset of another one.
- Return type
bool
- logcumsumexp(a)[source]
Compute log(cumsum(exp(a))).
- Parameters
a (ndarray) – shape: s The array.
- Return type
ndarray
- Returns
shape: s The log-cumsum-exp of the array.
See also
scipy.special.logsumexp() and torch.logcumsumexp()
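The numerically stable recurrence can be sketched in plain Python (the real function works on numpy arrays); subtracting the running maximum avoids overflow in exp:

```python
import math

def logcumsumexp(a):
    """Numerically stable log(cumsum(exp(a))) over a list of floats."""
    out, running = [], None
    for v in a:
        if running is None:
            running = v
        else:
            # log(e^running + e^v) = m + log(e^(running-m) + e^(v-m))
            m = max(running, v)
            running = m + math.log(math.exp(running - m) + math.exp(v - m))
        out.append(running)
    return out

print(logcumsumexp([0.0, 0.0]))  # [0.0, log(2)]
```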
- negative_norm(x, p=2, power_norm=False)[source]
Evaluate negative norm of a vector.
- Parameters
x (FloatTensor) – shape: (batch_size, num_heads, num_relations, num_tails, dim) The vectors.
p (Union[str, int, float]) – The p for the norm. cf. torch.linalg.vector_norm().
power_norm (bool) – Whether to return \(|x-y|_p^p\), cf. https://github.com/pytorch/pytorch/issues/28119
- Return type
FloatTensor
- Returns
shape: (batch_size, num_heads, num_relations, num_tails) The scores.
- negative_norm_of_sum(*x, p=2, power_norm=False)[source]
Evaluate negative norm of a sum of vectors on already broadcasted representations.
- Parameters
x (FloatTensor) – shape: (batch_size, num_heads, num_relations, num_tails, dim) The representations.
p (Union[str, int, float]) – The p for the norm. cf. torch.linalg.vector_norm().
power_norm (bool) – Whether to return \(|x-y|_p^p\), cf. https://github.com/pytorch/pytorch/issues/28119
- Return type
FloatTensor
- Returns
shape: (batch_size, num_heads, num_relations, num_tails) The scores.
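A plain-Python sketch over flat lists of the negative-norm-of-sum scoring (PyKEEN reduces over the last tensor dimension instead); the TransE-style usage below is an illustrative assumption:

```python
def negative_norm_of_sum(*xs, p=2.0, power_norm=False):
    """Score a sum of (already broadcast) vectors by its negative p-norm."""
    total = [sum(vs) for vs in zip(*xs)]          # element-wise sum
    value = sum(abs(v) ** p for v in total)       # sum of |.|^p
    if not power_norm:
        value **= 1.0 / p                         # take the p-th root
    return -value

# e.g. a TransE-style score -||h + r - t||_2
h, r, t = [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]
print(negative_norm_of_sum(h, r, [-v for v in t]))  # -0.0: h + r == t
```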
- point_to_box_distance(points, box_lows, box_highs)[source]
Compute the point to box distance function proposed by [abboud2020] in an element-wise fashion.
- Parameters
points (FloatTensor) – shape: (*, d) The positions of the points being scored against boxes.
box_lows (FloatTensor) – shape: (*, d) The lower corners of the boxes.
box_highs (FloatTensor) – shape: (*, d) The upper corners of the boxes.
- Return type
FloatTensor
- Returns
Element-wise distance function scores as per the definition above
Given points \(p\), box_lows \(l\), and box_highs \(h\), the following quantities are defined:
Width \(w\) is the difference between the upper and lower box bound: \(w = h - l\)
Box centers \(c\) are the mean of the box bounds: \(c = (h + l) / 2\)
Finally, the point to box distance \(dist(p,l,h)\) is defined as the following piecewise function:
\[\begin{split}dist(p,l,h) = \begin{cases} |p-c|/(w+1) & l \leq p \leq h \\ |p-c| \cdot (w+1) - 0.5 \cdot w \cdot \left((w+1) - 1/(w+1)\right) & \text{otherwise} \\ \end{cases}\end{split}\]
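A scalar sketch of the piecewise definition above (the real function is element-wise over tensors); distances shrink with box width inside the box and grow with it outside, and the two branches agree at the boundary:

```python
def point_to_box_distance(p, low, high):
    """Scalar version of the piecewise point-to-box distance above."""
    w = high - low          # box width
    c = (high + low) / 2.0  # box center
    if low <= p <= high:
        return abs(p - c) / (w + 1.0)
    return abs(p - c) * (w + 1.0) - 0.5 * w * ((w + 1.0) - 1.0 / (w + 1.0))

inside = point_to_box_distance(0.5, 0.0, 2.0)   # 0.5 / 3
outside = point_to_box_distance(3.0, 0.0, 2.0)  # grows faster outside the box
print(inside, outside)
```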
- product_normalize(x, dim=-1)[source]
Normalize a tensor along a given dimension so that the geometric mean is 1.0.
- Parameters
x (FloatTensor) – shape: s An input tensor.
dim (int) – The dimension along which to normalize the tensor.
- Return type
FloatTensor
- Returns
shape: s An output tensor where the given dimension is normalized to have a geometric mean of 1.0.
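A flat-list sketch of the normalization (the real function normalizes along a tensor dimension; all entries are assumed non-zero so the log is defined):

```python
import math

def product_normalize(xs):
    """Rescale so that the geometric mean of |x_i| is 1.0 (sketch)."""
    # geometric mean computed in log-space for numerical stability
    geo_mean = math.exp(sum(math.log(abs(v)) for v in xs) / len(xs))
    return [v / geo_mean for v in xs]

out = product_normalize([1.0, 4.0])
print(out)  # geometric mean of the result is 1.0
```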
- project_entity(e, e_p, r_p)[source]
Project an entity into a relation-specific subspace.
\[e_{\bot} = M_{re} e = (r_p e_p^T + I^{d_r \times d_e}) e = r_p e_p^T e + I^{d_r \times d_e} e = r_p (e_p^T e) + e'\]and additionally enforces
\[\|e_{\bot}\|_2 \leq 1\]
- Parameters
e (FloatTensor) – shape: (…, d_e) The entity embedding.
e_p (FloatTensor) – shape: (…, d_e) The entity projection.
r_p (FloatTensor) – shape: (…, d_r) The relation projection.
- Return type
FloatTensor
- Returns
shape: (…, d_r)
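A minimal sketch of the projection formula above for the case \(d_e = d_r\), using plain Python lists; enforcement of the final norm constraint \(\|e_{\bot}\|_2 \leq 1\) is omitted:

```python
def project_entity(e, e_p, r_p):
    """Compute r_p (e_p^T e) + e, per the formula above (d_e == d_r case)."""
    inner = sum(a * b for a, b in zip(e_p, e))  # e_p^T e, a scalar
    return [rp * inner + ev for rp, ev in zip(r_p, e)]

print(project_entity([1.0, 2.0], [1.0, 0.0], [0.0, 1.0]))  # [1.0, 3.0]
```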
- resolve_device(device=None)[source]
Resolve a torch.device given a desired device (string).
- Return type
device
- set_random_seed(seed)[source]
Set the random seed on numpy, torch, and python.
- Parameters
seed (int) – The seed that will be used in np.random.seed(), torch.manual_seed(), and random.seed().
- Return type
Tuple[None, Generator, None]
- Returns
A three-tuple with None, the torch generator, and None.
- split_complex(x)[source]
Split a complex tensor into real and imaginary parts.
- Return type
Tuple
[FloatTensor
,FloatTensor
]
- split_list_in_batches_iter(input_list, batch_size)[source]
Split a list of instances into batches of size batch_size.
- tensor_product(*tensors)[source]
Compute element-wise product of tensors in broadcastable shape.
- Return type
FloatTensor
- tensor_sum(*tensors)[source]
Compute element-wise sum of tensors in broadcastable shape.
- Return type
FloatTensor
- unpack_singletons(*xs)[source]
Unpack sequences of length one.
- Parameters
xs (Tuple[~X]) – A sequence of tuples of length 1 or more.
- Returns
An unpacked sequence of sequences
>>> unpack_singletons((1,), (1, 2), (1, 2, 3))
(1, (1, 2), (1, 2, 3))
- upgrade_to_sequence(x)[source]
Ensure that the input is a sequence.
Note
While strings are technically also a sequence, i.e.,
isinstance("test", typing.Sequence) is True
this may lead to unexpected behaviour when calling upgrade_to_sequence("test"). We thus handle strings as non-sequences. To recover the other behavior, the following may be used:
upgrade_to_sequence(tuple("test"))
- Parameters
x (Union[~X, Sequence[~X]]) – A literal or sequence of literals.
- Return type
Sequence[~X]
- Returns
If a literal was given, a one element tuple with it in it. Otherwise, return the given value.
>>> upgrade_to_sequence(1)
(1,)
>>> upgrade_to_sequence((1, 2, 3))
(1, 2, 3)
>>> upgrade_to_sequence("test")
('test',)
>>> upgrade_to_sequence(tuple("test"))
('t', 'e', 's', 't')
- view_complex(x)[source]
Convert a PyKEEN complex tensor representation into a torch one.
- env(file=None)[source]
Print the env or output as HTML if in Jupyter.
- Parameters
file – The file to print to if not in a Jupyter setting. Defaults to sys.stdout.
- Returns
An IPython.display.HTML if in a Jupyter notebook setting, otherwise None.
Version information for PyKEEN.
- get_git_hash(terse=True)[source]
Get the PyKEEN git hash.
- Return type
str
- Returns
The git hash; equals 'UNHASHED' if a CalledProcessError was encountered, signifying that the code is not installed in development mode.