Utilities
Utilities for PyKEEN.
- class Bias(dim)[source]
A module wrapper for adding a bias.
Initialize the module.
- Parameters
dim (int) – >0 The dimension of the input.
- class Result[source]
A superclass of results that can be saved to a directory.
- abstract save_to_directory(directory, **kwargs)[source]
Save the results to the directory.
- Return type
- all_in_bounds(x, low=None, high=None, a_tol=0.0)[source]
Check if tensor values respect lower and upper bound.
- broadcast_cat(tensors, dim)[source]
Concatenate tensors with broadcasting support.
- Parameters
tensors (Sequence[FloatTensor]) – The tensors. Each of the tensors is required to have the same number of dimensions. For each dimension not equal to dim, the extent has to match the other tensors’, or be one. If it is one, the tensor is repeated to match the extent of the other tensors.
dim (int) – The concat dimension.
- Return type
FloatTensor
- Returns
A concatenated, broadcasted tensor.
- Raises
ValueError – if the x and y dimensions are not the same
ValueError – if broadcasting is not possible
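The broadcasting rule above can be illustrated at the level of shapes only. The following is a minimal sketch, not PyKEEN's implementation; the name broadcast_cat_shape_sketch is hypothetical, and it works on plain shape tuples instead of tensors.

```python
from typing import Sequence, Tuple


def broadcast_cat_shape_sketch(shapes: Sequence[Tuple[int, ...]], dim: int) -> Tuple[int, ...]:
    """Compute the result shape of a broadcasted concatenation (hypothetical helper)."""
    ranks = {len(shape) for shape in shapes}
    if len(ranks) != 1:
        raise ValueError("all shapes must have the same number of dimensions")
    rank = ranks.pop()
    if not -rank <= dim < rank:
        raise ValueError(f"dim={dim} is out of range for {rank} dimensions")
    dim = dim % rank  # normalize negative dimensions
    result = []
    for axis in range(rank):
        extents = [shape[axis] for shape in shapes]
        if axis == dim:
            # Along the concat dimension, the extents add up.
            result.append(sum(extents))
            continue
        # Elsewhere, extents must agree, or be one (and then get repeated).
        non_singleton = {extent for extent in extents if extent != 1}
        if len(non_singleton) > 1:
            raise ValueError(f"cannot broadcast extents {extents} along axis {axis}")
        result.append(non_singleton.pop() if non_singleton else 1)
    return tuple(result)


print(broadcast_cat_shape_sketch([(2, 1, 3), (1, 4, 5)], dim=-1))  # (2, 4, 8)
```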
- broadcast_upgrade_to_sequences(*xs)[source]
Apply upgrade_to_sequence to each input, and afterwards repeat singletons to match the maximum length.
- Parameters
- Return type
- Returns
a sequence of length m, where each element is a sequence and all elements have the same length.
- Raises
ValueError – if there is a non-singleton sequence input with length different from the maximum sequence length.
>>> broadcast_upgrade_to_sequences(1)
((1,),)
>>> broadcast_upgrade_to_sequences(1, 2)
((1,), (2,))
>>> broadcast_upgrade_to_sequences(1, (2, 3))
((1, 1), (2, 3))
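The two steps described above (upgrade each input, then repeat singletons) might look roughly as follows. This is a pure-Python sketch under the assumption that strings count as literals; upgrade_sketch and broadcast_upgrade_sketch are hypothetical names, not the library functions.

```python
from collections.abc import Sequence


def upgrade_sketch(x):
    """Wrap literals (including strings) in a one-element tuple."""
    if isinstance(x, str) or not isinstance(x, Sequence):
        return (x,)
    return tuple(x)


def broadcast_upgrade_sketch(*xs):
    """Upgrade every input, then repeat singletons to the maximum length."""
    sequences = [upgrade_sketch(x) for x in xs]
    max_len = max(len(sequence) for sequence in sequences)
    for sequence in sequences:
        # Only singletons may differ from the maximum length.
        if len(sequence) not in (1, max_len):
            raise ValueError(f"cannot broadcast length {len(sequence)} to {max_len}")
    return tuple(
        sequence if len(sequence) == max_len else sequence * max_len
        for sequence in sequences
    )


print(broadcast_upgrade_sketch(1, (2, 3)))  # ((1, 1), (2, 3))
```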
- calculate_broadcasted_elementwise_result_shape(first, second)[source]
Determine the return shape of a broadcasted elementwise operation.
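As an illustration of what such a shape computation involves, here is a small sketch. It assumes both shapes have the same number of dimensions and positive extents; broadcasted_shape_sketch is a hypothetical name and not the library function.

```python
from typing import Tuple


def broadcasted_shape_sketch(first: Tuple[int, ...], second: Tuple[int, ...]) -> Tuple[int, ...]:
    """Return the shape of an elementwise operation on two broadcastable shapes."""
    if len(first) != len(second):
        raise ValueError("shapes must have the same number of dimensions")
    result = []
    for a, b in zip(first, second):
        # Extents must be equal, or one of them must be 1.
        if a != b and 1 not in (a, b):
            raise ValueError(f"cannot broadcast extents {a} and {b}")
        result.append(max(a, b))
    return tuple(result)


print(broadcasted_shape_sketch((10, 1, 3), (1, 5, 3)))  # (10, 5, 3)
```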
- check_shapes(*x, raise_on_errors=True)[source]
Verify that a sequence of tensors are of matching shapes.
- Parameters
x (Tuple[Union[Tensor, Tuple[int, …]], str]) – A tuple (t, s), where t is a tensor, or an actual shape of a tensor (a tuple of integers), and s is a string, where each character corresponds to a (named) dimension. If the shapes of different tensors share a character, the corresponding dimensions are expected to be of equal size.
raise_on_errors (bool) – Whether to raise an exception in case of a mismatch.
- Return type
- Returns
Whether the shapes matched.
- Raises
ValueError – If the shapes mismatch and raise_on_errors is True.
Examples:
>>> check_shapes(((10, 20), "bd"), ((10, 20, 20), "bdd"))
True
>>> check_shapes(((10, 20), "bd"), ((10, 30, 20), "bdd"), raise_on_errors=False)
False
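The named-dimension check can be sketched in a few lines. The version below is an assumption-laden illustration that accepts shape tuples only (no tensors); check_shapes_sketch is a hypothetical name.

```python
from typing import Dict, Set, Tuple


def check_shapes_sketch(*x: Tuple[Tuple[int, ...], str], raise_on_errors: bool = True) -> bool:
    """Check that dimensions sharing a name have the same size."""
    sizes: Dict[str, Set[int]] = {}
    for shape, names in x:
        if len(shape) != len(names):
            raise ValueError(f"shape {shape} does not match dimension names {names!r}")
        # Collect every size observed for each named dimension.
        for size, name in zip(shape, names):
            sizes.setdefault(name, set()).add(size)
    mismatches = {name: seen for name, seen in sizes.items() if len(seen) > 1}
    if mismatches and raise_on_errors:
        raise ValueError(f"shape mismatch: {mismatches}")
    return not mismatches


print(check_shapes_sketch(((10, 20), "bd"), ((10, 20, 20), "bdd")))  # True
```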
- clamp_norm(x, maxnorm, p='fro', dim=None)[source]
Ensure that a tensor’s norm does not exceed some threshold.
- combine_complex(x_re, x_im)[source]
Combine a complex tensor from real and imaginary part.
- Return type
FloatTensor
- compact_mapping(mapping)[source]
Update a mapping (key -> id) such that the IDs range from 0 to len(mapping) - 1.
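One way to compact IDs is shown below. This is only a sketch: the return value (the compacted mapping plus an old-to-new ID translation), the order-preserving assignment, and the name compact_mapping_sketch are all assumptions, and the IDs are assumed to be unique.

```python
from typing import Dict, Hashable, Tuple


def compact_mapping_sketch(mapping: Dict[Hashable, int]) -> Tuple[Dict[Hashable, int], Dict[int, int]]:
    """Renumber IDs to 0 .. len(mapping) - 1, keeping their relative order."""
    # Assumption: the smallest old ID becomes 0, the next one 1, and so on.
    translation = {old_id: new_id for new_id, old_id in enumerate(sorted(mapping.values()))}
    compacted = {key: translation[old_id] for key, old_id in mapping.items()}
    return compacted, translation


compacted, translation = compact_mapping_sketch({"a": 3, "b": 7, "c": 5})
print(compacted)    # {'a': 0, 'b': 2, 'c': 1}
print(translation)  # {3: 0, 5: 1, 7: 2}
```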
- complex_normalize(x)[source]
Normalize a vector of complex numbers such that each element is of unit-length.
- Parameters
x (Tensor) – A tensor representing complex numbers
- Return type
- Returns
A normalized version according to the following definition.
The modulus of a complex number is given as:
\[|a + ib| = \sqrt{a^2 + b^2}\]
The \(l_2\) norm of a complex vector \(x \in \mathbb{C}^d\) is:
\[\|x\|^2 = \sum_{i=1}^d |x_i|^2 = \sum_{i=1}^d \left(\operatorname{Re}(x_i)^2 + \operatorname{Im}(x_i)^2\right) = \left(\sum_{i=1}^d \operatorname{Re}(x_i)^2\right) + \left(\sum_{i=1}^d \operatorname{Im}(x_i)^2\right) = \|\operatorname{Re}(x)\|^2 + \|\operatorname{Im}(x)\|^2 = \| [\operatorname{Re}(x); \operatorname{Im}(x)] \|^2\]
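To make the element-wise normalization concrete, the sketch below uses Python's built-in complex type instead of tensors. It is an illustration only; complex_normalize_sketch is a hypothetical name, and all elements are assumed to be non-zero.

```python
from typing import List


def complex_normalize_sketch(x: List[complex]) -> List[complex]:
    """Scale each complex number to unit modulus."""
    # abs(z) is the modulus sqrt(a^2 + b^2) of z = a + ib.
    return [z / abs(z) for z in x]


normalized = complex_normalize_sketch([3 + 4j, -2j])
print([abs(z) for z in normalized])
```

Each printed modulus is 1 up to floating point error, which matches the unit-length requirement stated above.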
- class compose(*operations, name)[source]
A class representing the composition of several functions.
Initialize the composition with a sequence of operations.
- compute_box(base, delta, size)[source]
Compute the lower and upper corners of a resulting box.
- Parameters
base (FloatTensor) – shape: (*, d) the base position (box center) of the input relation embeddings
delta (FloatTensor) – shape: (*, d) the base shape of the input relation embeddings
size (FloatTensor) – shape: (*, d) the size scalar vectors of the input relation embeddings
- Return type
Tuple[FloatTensor, FloatTensor]
- Returns
shape: (*, d) each; the lower and upper bounds of the box whose embeddings are provided as input.
- convert_to_canonical_shape(x, dim, num=None, batch_size=1, suffix_shape=-1)[source]
Convert a tensor to canonical shape.
- Parameters
- Return type
FloatTensor
- Returns
shape: (batch_size, num_heads, num_relations, num_tails, *) A tensor in canonical shape.
- create_relation_to_entity_set_mapping(triples)[source]
Create mappings from relation IDs to the set of their head / tail entities.
- ensure_ftp_directory(*, ftp, directory)[source]
Ensure the directory exists on the FTP server.
- Return type
- ensure_torch_random_state(random_state)[source]
Prepare a random state for PyTorch.
- Return type
Generator
- ensure_tuple(*x)[source]
Ensure that all elements in the sequence are upgraded to sequences.
- Parameters
x (Union[~X, Sequence[~X]]) – A sequence of sequences or literals
- Return type
- Returns
An upgraded sequence of sequences
>>> ensure_tuple(1, (1,), (1, 2))
((1,), (1,), (1, 2))
- estimate_cost_of_sequence(shape, *other_shapes)[source]
Cost of a sequence of broadcasted element-wise operations of tensors, given their shapes.
- Return type
- extend_batch(batch, all_ids, dim)[source]
Extend batch for 1-to-all scoring by explicit enumeration.
- Parameters
- Return type
LongTensor
- Returns
shape: (batch_size * num_choices, 3) A large batch, where every pair from the original batch is combined with every ID.
- extended_einsum(eq, *tensors)[source]
Drop dimensions of size 1 to allow broadcasting.
- Return type
FloatTensor
- get_batchnorm_modules(module)[source]
Return all submodules which are batch normalization layers.
- Return type
List[Module]
- get_devices(module)[source]
Return the device(s) from each component of the model.
- Return type
Collection[device]
- get_dropout_modules(module)[source]
Return all submodules which are dropout layers.
- Return type
List[Module]
- get_expected_norm(p, d)[source]
Compute the expected value of the L_p norm.
\[E[\|x\|_p] = d^{1/p} E[|x_1|^p]^{1/p}\]
under the assumption that \(x_i \sim N(0, 1)\), i.e.
\[E[|x_1|^p] = 2^{p/2} \cdot \Gamma\left(\frac{p+1}{2}\right) \cdot \pi^{-1/2}\]
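The formula can be evaluated directly with the standard library. The sketch below simply plugs into the two expressions as stated; expected_norm_sketch is a hypothetical name, not the library function.

```python
import math


def expected_norm_sketch(p: float, d: int) -> float:
    """Evaluate d^(1/p) * E[|x_1|^p]^(1/p) for x_i ~ N(0, 1)."""
    # Absolute p-th moment of a standard normal variable.
    abs_moment = 2 ** (p / 2) * math.gamma((p + 1) / 2) / math.sqrt(math.pi)
    return (d * abs_moment) ** (1 / p)


print(expected_norm_sketch(p=2, d=16))
```

For p = 2 the absolute moment is 1, so the value reduces to the square root of d, i.e. 4 (up to floating point error) for d = 16.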
- get_optimal_sequence(*shapes)[source]
Find the optimal sequence in which to combine tensors elementwise based on the shapes.
- get_preferred_device(module, allow_ambiguity=True)[source]
Return the preferred device.
- Return type
device
- get_until_first_blank(s)[source]
Recapitulate all lines in the string until the first blank line.
- Return type
- invert_mapping(mapping)[source]
Invert a mapping.
- Parameters
mapping (Mapping[~K, ~V]) – The mapping, key -> value.
- Return type
Mapping[~V, ~K]
- Returns
The inverse mapping, value -> key.
- Raises
ValueError – if the mapping is not bijective
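A minimal sketch of an inversion with the bijectivity check described above; invert_mapping_sketch is a hypothetical name and the error message is illustrative.

```python
from typing import Dict, Hashable, Mapping


def invert_mapping_sketch(mapping: Mapping[Hashable, Hashable]) -> Dict[Hashable, Hashable]:
    """Swap keys and values, rejecting mappings with repeated values."""
    inverse = {value: key for key, value in mapping.items()}
    # If two keys shared a value, the inverse has fewer entries than the input.
    if len(inverse) != len(mapping):
        raise ValueError("mapping is not bijective: some values occur more than once")
    return inverse


print(invert_mapping_sketch({"a": 0, "b": 1}))  # {0: 'a', 1: 'b'}
```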
- is_cuda_oom_error(runtime_error)[source]
Check whether the caught RuntimeError was due to CUDA being out of memory.
- Return type
- is_cudnn_error(runtime_error)[source]
Check whether the caught RuntimeError was due to a CUDNN error.
- Return type
- is_triple_tensor_subset(a, b)[source]
Check whether one tensor of triples is a subset of another one.
- Return type
- logcumsumexp(a)[source]
Compute log(cumsum(exp(a))).
- Parameters
a (ndarray) – shape: s the array
- Return type
- Returns
shape: s the log-cumsum-exp of the array
See also
scipy.special.logsumexp() and torch.logcumsumexp()
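Evaluating log(cumsum(exp(a))) naively overflows for large entries. The sketch below shows one numerically stable way to compute it on a plain list of finite floats, using a running log-add-exp; it is an illustration and not necessarily how the library computes it, and logcumsumexp_sketch is a hypothetical name.

```python
import math
from typing import List


def logcumsumexp_sketch(a: List[float]) -> List[float]:
    """Compute log(cumsum(exp(a))) without exponentiating large values."""
    result: List[float] = []
    for value in a:
        if not result:
            accumulated = value
        else:
            # log(exp(x) + exp(y)) = max + log1p(exp(min - max)); the exponent is <= 0.
            high, low = max(result[-1], value), min(result[-1], value)
            accumulated = high + math.log1p(math.exp(low - high))
        result.append(accumulated)
    return result


print(logcumsumexp_sketch([1000.0, 1000.0]))
```

The second entry equals 1000 + log(2), whereas a direct math.exp(1000.0) would raise an OverflowError.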
- negative_norm(x, p=2, power_norm=False)[source]
Evaluate negative norm of a vector.
- Parameters
x (FloatTensor) – shape: (batch_size, num_heads, num_relations, num_tails, dim) The vectors.
p (Union[str, int, float]) – The p for the norm. cf. torch.linalg.vector_norm().
power_norm (bool) – Whether to return \(|x-y|_p^p\), cf. https://github.com/pytorch/pytorch/issues/28119
- Return type
FloatTensor
- Returns
shape: (batch_size, num_heads, num_relations, num_tails) The scores.
- negative_norm_of_sum(*x, p=2, power_norm=False)[source]
Evaluate negative norm of a sum of vectors on already broadcasted representations.
- Parameters
x (FloatTensor) – shape: (batch_size, num_heads, num_relations, num_tails, dim) The representations.
p (Union[str, int, float]) – The p for the norm. cf. torch.linalg.vector_norm().
power_norm (bool) – Whether to return \(|x-y|_p^p\), cf. https://github.com/pytorch/pytorch/issues/28119
- Return type
FloatTensor
- Returns
shape: (batch_size, num_heads, num_relations, num_tails) The scores.
- point_to_box_distance(points, box_lows, box_highs)[source]
Compute the point to box distance function proposed by [abboud2020] in an element-wise fashion.
- Parameters
points (FloatTensor) – shape: (*, d) the positions of the points being scored against boxes
box_lows (FloatTensor) – shape: (*, d) the lower corners of the boxes
box_highs (FloatTensor) – shape: (*, d) the upper corners of the boxes
- Return type
FloatTensor
- Returns
Element-wise distance function scores as per the definition below
Given points \(p\), box_lows \(l\), and box_highs \(h\), the following quantities are defined:
Width \(w\) is the difference between the upper and lower box bound: \(w = h - l\)
Box centers \(c\) are the mean of the box bounds: \(c = (h + l) / 2\)
Finally, the point to box distance \(dist(p,l,h)\) is defined as the following piecewise function:
\[\begin{split}dist(p,l,h) = \begin{cases} |p-c|/(w+1) & l \leq p \leq h \\ |p-c| \cdot (w+1) - 0.5 \cdot w \cdot ((w+1) - 1/(w+1)) & \text{otherwise} \\ \end{cases}\end{split}\]
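For a single coordinate, the piecewise definition can be written out directly. The following scalar sketch is an illustration of the formula only (the library operates on tensors); point_to_box_distance_sketch is a hypothetical name.

```python
def point_to_box_distance_sketch(p: float, low: float, high: float) -> float:
    """Evaluate the piecewise point-to-box distance for one coordinate."""
    width = high - low
    center = (high + low) / 2
    if low <= p <= high:
        # Inside the box: distance to the center, shrunk by the box width.
        return abs(p - center) / (width + 1)
    # Outside the box: distance to the center, magnified by the box width,
    # minus an offset that keeps the function continuous at the box border.
    return abs(p - center) * (width + 1) - 0.5 * width * ((width + 1) - 1 / (width + 1))


print(point_to_box_distance_sketch(1.0, 0.0, 2.0))  # 0.0 (the box center)
print(point_to_box_distance_sketch(3.0, 0.0, 2.0))  # outside the box
```

With l = 0 and h = 2, both branches give 1/3 at the border p = 2, which illustrates that the offset term makes the two pieces meet.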
- product_normalize(x, dim=-1)[source]
Normalize a tensor along a given dimension so that the geometric mean is 1.0.
- Parameters
x (FloatTensor) – shape: s An input tensor
dim (int) – the dimension along which to normalize the tensor
- Return type
FloatTensor
- Returns
shape: s An output tensor where the given dimension is normalized to have a geometric mean of 1.0.
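A one-dimensional sketch of normalizing to a geometric mean of 1.0 is given below. It assumes non-zero entries and takes the geometric mean over absolute values, which is an assumption of this illustration; product_normalize_sketch is a hypothetical name.

```python
import math
from typing import List


def product_normalize_sketch(x: List[float]) -> List[float]:
    """Divide by the geometric mean of the absolute values."""
    # Geometric mean computed in log-space: exp(mean(log|x_i|)).
    log_mean = sum(math.log(abs(value)) for value in x) / len(x)
    scale = math.exp(log_mean)
    return [value / scale for value in x]


print(product_normalize_sketch([1.0, 4.0]))
```

Here the geometric mean of the input is 2, so the output is approximately [0.5, 2.0], whose product is 1.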
- project_entity(e, e_p, r_p)[source]
Project entity relation-specific.
\[e_{\bot} = M_{re} e = (r_p e_p^T + I^{d_r \times d_e}) e = r_p e_p^T e + I^{d_r \times d_e} e = r_p (e_p^T e) + e'\]
and additionally enforces
\[\|e_{\bot}\|_2 \leq 1\]
- Parameters
e (FloatTensor) – shape: (…, d_e) The entity embedding.
e_p (FloatTensor) – shape: (…, d_e) The entity projection.
r_p (FloatTensor) – shape: (…, d_r) The relation projection.
- Return type
FloatTensor
- Returns
shape: (…, d_r)
- resolve_device(device=None)[source]
Resolve a torch.device given a desired device (string).
- Return type
device
- set_random_seed(seed)[source]
Set the random seed on numpy, torch, and python.
- Parameters
seed (int) – The seed that will be used in np.random.seed(), torch.manual_seed(), and random.seed().
- Return type
- Returns
A three tuple with None, the torch generator, and None.
- split_complex(x)[source]
Split a complex tensor into real and imaginary part.
- Return type
Tuple[FloatTensor, FloatTensor]
- split_list_in_batches_iter(input_list, batch_size)[source]
Split a list of instances in batches of size batch_size.
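Batching a list is commonly done with a slicing generator. The following is a sketch of that pattern, not necessarily the library's code; split_in_batches_sketch is a hypothetical name.

```python
from typing import Iterator, List, TypeVar

X = TypeVar("X")


def split_in_batches_sketch(input_list: List[X], batch_size: int) -> Iterator[List[X]]:
    """Yield consecutive slices of at most batch_size elements."""
    for start in range(0, len(input_list), batch_size):
        # The last batch may be shorter than batch_size.
        yield input_list[start:start + batch_size]


print(list(split_in_batches_sketch([1, 2, 3, 4, 5], batch_size=2)))  # [[1, 2], [3, 4], [5]]
```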
- tensor_product(*tensors)[source]
Compute element-wise product of tensors in broadcastable shape.
- Return type
FloatTensor
- tensor_sum(*tensors)[source]
Compute element-wise sum of tensors in broadcastable shape.
- Return type
FloatTensor
- unpack_singletons(*xs)[source]
Unpack sequences of length one.
- Parameters
xs (Tuple[~X]) – A sequence of tuples of length 1 or more
- Return type
- Returns
An unpacked sequence of sequences
>>> unpack_singletons((1,), (1, 2), (1, 2, 3))
(1, (1, 2), (1, 2, 3))
- upgrade_to_sequence(x)[source]
Ensure that the input is a sequence.
Note
While strings are technically also a sequence, i.e.,
isinstance("test", typing.Sequence) is True
this may lead to unexpected behaviour when calling upgrade_to_sequence("test"). We thus handle strings as non-sequences. To recover the other behavior, the following may be used:
upgrade_to_sequence(tuple("test"))
- Parameters
x (Union[~X, Sequence[~X]]) – A literal or sequence of literals
- Return type
Sequence[~X]
- Returns
If a literal was given, a one element tuple with it in it. Otherwise, return the given value.
>>> upgrade_to_sequence(1)
(1,)
>>> upgrade_to_sequence((1, 2, 3))
(1, 2, 3)
>>> upgrade_to_sequence("test")
('test',)
>>> upgrade_to_sequence(tuple("test"))
('t', 'e', 's', 't')
- view_complex(x)[source]
Convert a PyKEEN complex tensor representation into a torch one.
- Return type
- env(file=None)[source]
Print the env or output as HTML if in Jupyter.
- Parameters
file – The file to print to if not in a Jupyter setting. Defaults to sys.stdout
- Returns
An IPython.display.HTML if in a Jupyter notebook setting, otherwise None.
Version information for PyKEEN.
- get_git_hash(terse=True)[source]
Get the PyKEEN git hash.
- Return type
- Returns
The git hash; equals ‘UNHASHED’ if a CalledProcessError is encountered, signifying that the code is not installed in development mode.