Utilities

Utilities for neural network components.

exception ShapeError(shape, reference)[source]

An error for a mismatch in shapes.

Initialize the error.

Parameters
Return type

None

classmethod verify(shape, reference)[source]

Raise an exception if the shape does not match the reference.

This method normalizes the shapes first.

Parameters
Raises

ShapeError – if the two shapes do not match.

Return type

Sequence[int]

Returns

the normalized shape
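The verify-then-return behaviour described above can be sketched as follows. This is a minimal illustration, not the library's implementation: `ShapeErrorSketch` and `normalize_shape` are hypothetical names, and the sketch assumes that "normalizing" means wrapping a bare integer into a one-element tuple and converting any other sequence to a tuple.

```python
from typing import Sequence, Tuple, Union


def normalize_shape(shape: Union[int, Sequence[int]]) -> Tuple[int, ...]:
    """Hypothetical helper: bring a shape into a canonical tuple form."""
    # Assumption: a bare int is treated as a one-dimensional shape.
    if isinstance(shape, int):
        return (shape,)
    return tuple(shape)


class ShapeErrorSketch(ValueError):
    """Hypothetical stand-in for ShapeError, for illustration only."""

    def __init__(self, shape: Sequence[int], reference: Sequence[int]) -> None:
        super().__init__(f"shape {shape} does not match expected shape {reference}")

    @classmethod
    def verify(
        cls,
        shape: Union[int, Sequence[int]],
        reference: Union[int, Sequence[int]],
    ) -> Tuple[int, ...]:
        # Normalize both sides first, then compare.
        shape = normalize_shape(shape)
        reference = normalize_shape(reference)
        if shape != reference:
            raise cls(shape, reference)
        # On success, the caller gets the normalized shape back.
        return shape
```

Under these assumptions, `ShapeErrorSketch.verify(3, (3,))` returns `(3,)`, while `ShapeErrorSketch.verify((2, 3), (3, 2))` raises.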

class WikidataCache[source]

A cache for requests against Wikidata’s SPARQL endpoint.

Initialize the cache.

WIKIDATA_ENDPOINT = 'https://query.wikidata.org/bigdata/namespace/wdq/sparql'

Wikidata SPARQL endpoint. See https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service#Interfacing

get_descriptions(ids)[source]

Get entity descriptions for the given IDs.

Parameters

ids (Sequence[str]) – the Wikidata IDs

Return type

Sequence[str]

Returns

the description for each Wikidata entity

get_image_paths(ids, extensions=('jpeg', 'jpg', 'gif', 'png', 'svg', 'tif'), progress=False)[source]

Get paths to images for the given IDs.

Parameters
  • ids (Sequence[str]) – the Wikidata IDs

  • extensions (Collection[str]) – the allowed file extensions

  • progress (bool) – whether to display a progress bar

Return type

Sequence[Optional[Path]]

Returns

the paths to images for the given IDs.

get_labels(ids)[source]

Get entity labels for the given IDs.

Parameters

ids (Sequence[str]) – the Wikidata IDs

Return type

Sequence[str]

Returns

the label for each Wikidata entity

classmethod query(sparql, wikidata_ids, batch_size=256)[source]

Batched SPARQL query execution for the given IDs.

Parameters
  • sparql (Union[str, Callable[…, str]]) – the SPARQL query with a placeholder named ids

  • wikidata_ids (Sequence[str]) – the Wikidata IDs

  • batch_size (int) – the batch size, i.e., maximum number of IDs per query

Return type

Iterable[Mapping[str, Any]]

Returns

an iterable over JSON results, where the keys correspond to query variables, and the values to the corresponding binding
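The batching step can be illustrated without touching the network. The sketch below only builds the per-batch query strings; it does not send them. The function name `iter_batched_queries`, the example template, and the choice to render each ID as `wd:<ID>` inside a `VALUES` clause are assumptions made for illustration, not the library's confirmed behaviour.

```python
from typing import Callable, Iterator, Sequence, Union


def iter_batched_queries(
    sparql: Union[str, Callable[..., str]],
    wikidata_ids: Sequence[str],
    batch_size: int = 256,
) -> Iterator[str]:
    """Hypothetical helper: yield one query string per batch of IDs."""
    # A string template is filled via str.format; a callable is invoked directly.
    render = sparql.format if isinstance(sparql, str) else sparql
    for start in range(0, len(wikidata_ids), batch_size):
        batch = wikidata_ids[start : start + batch_size]
        # Assumption: IDs are passed as space-separated "wd:" prefixed terms.
        yield render(ids=" ".join(f"wd:{wikidata_id}" for wikidata_id in batch))


# Literal braces of the SPARQL syntax are doubled so that str.format keeps them.
TEMPLATE = "SELECT ?item ?itemLabel WHERE {{ VALUES ?item {{ {ids} }} }}"

queries = list(iter_batched_queries(TEMPLATE, ["Q1", "Q2", "Q3"], batch_size=2))
# Three IDs with a batch size of two yield two queries.
```

With at most `batch_size` IDs per query, a long ID list never produces a single oversized request.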

classmethod query_text(wikidata_ids, language='en', batch_size=256)[source]

Query the SPARQL endpoint for information about the given IDs.

Parameters
  • wikidata_ids (Sequence[str]) – the Wikidata IDs

  • language (str) – the label language

  • batch_size (int) – the batch size; if more IDs are provided, the request is split into multiple smaller ones

Return type

Mapping[str, Mapping[str, str]]

Returns

a mapping from Wikidata IDs to dictionaries with the label and description of each entity

static verify_ids(ids)[source]

Raise an error if invalid IDs are encountered.

Parameters

ids (Sequence[str]) – the IDs to verify

Raises

ValueError – if any invalid ID is encountered
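A check of this kind might look like the sketch below. The documentation does not state what makes an ID valid, so the pattern used here (the letter Q followed by one or more digits, as in Wikidata item identifiers such as Q42) is an assumption, and `verify_ids_sketch` is a hypothetical name.

```python
import re
from typing import Sequence

# Assumption: a valid ID is a Wikidata item identifier of the form Q<digits>.
QID_PATTERN = re.compile(r"Q\d+")


def verify_ids_sketch(ids: Sequence[str]) -> None:
    """Hypothetical helper: raise ValueError if any ID is not a Q-identifier."""
    invalid = [wikidata_id for wikidata_id in ids if not QID_PATTERN.fullmatch(wikidata_id)]
    if invalid:
        raise ValueError(f"Invalid IDs encountered: {invalid}")


verify_ids_sketch(["Q42", "Q5"])  # passes silently
```

Collecting all invalid IDs before raising lets the error message list every offender at once rather than only the first.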

adjacency_tensor_to_stacked_matrix(num_relations, num_entities, source, target, edge_type, edge_weights=None, horizontal=True)[source]

Stack adjacency matrices as described in [thanapalasingam2021].

This method re-arranges the (sparse) adjacency tensor of shape (num_entities, num_relations, num_entities) to a sparse adjacency matrix of shape (num_entities, num_relations * num_entities) (horizontal stacking) or (num_entities * num_relations, num_entities) (vertical stacking). Thereby, we can perform the relation-specific message passing of R-GCN by a single sparse matrix multiplication (and some additional pre- and/or post-processing) of the inputs.

Parameters
  • num_relations (int) – the number of relations

  • num_entities (int) – the number of entities

  • source (LongTensor) – shape: (num_triples,) the source entity indices

  • target (LongTensor) – shape: (num_triples,) the target entity indices

  • edge_type (LongTensor) – shape: (num_triples,) the edge type, i.e., relation ID

  • edge_weights (Optional[FloatTensor]) – shape: (num_triples,) scalar edge weights

  • horizontal (bool) – whether to use horizontal or vertical stacking

Return type

Tensor

Returns

shape: (num_entities, num_relations * num_entities) for horizontal stacking, or (num_entities * num_relations, num_entities) for vertical stacking; the stacked adjacency matrix
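The index arithmetic behind the re-arrangement can be sketched in plain Python, with lists standing in for tensors. The sketch assumes that the blocks are laid out relation by relation, i.e., relation r occupies the block starting at offset r * num_entities; `stacked_indices` is a hypothetical helper, not part of the documented API.

```python
from typing import List, Sequence, Tuple


def stacked_indices(
    num_relations: int,
    num_entities: int,
    source: Sequence[int],
    target: Sequence[int],
    edge_type: Sequence[int],
    horizontal: bool = True,
) -> Tuple[List[Tuple[int, int]], Tuple[int, int]]:
    """Hypothetical helper: map (source, relation, target) triples to 2D indices."""
    indices = []
    for s, t, r in zip(source, target, edge_type):
        # Assumption: each relation owns a contiguous block of num_entities indices.
        offset = r * num_entities
        if horizontal:
            indices.append((s, offset + t))  # shift the column into relation r's block
        else:
            indices.append((offset + s, t))  # shift the row into relation r's block
    if horizontal:
        shape = (num_entities, num_relations * num_entities)
    else:
        shape = (num_relations * num_entities, num_entities)
    return indices, shape


# Two triples over 3 entities and 2 relations: (0, r0, 2) and (1, r1, 0).
h_idx, h_shape = stacked_indices(2, 3, [0, 1], [2, 0], [0, 1], horizontal=True)
# h_idx == [(0, 2), (1, 3)], h_shape == (3, 6)
v_idx, v_shape = stacked_indices(2, 3, [0, 1], [2, 0], [0, 1], horizontal=False)
# v_idx == [(0, 2), (4, 0)], v_shape == (6, 3)
```

In an actual implementation, these index pairs, together with the optional edge weights as values, would populate a sparse matrix of the given shape.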

safe_diagonal(matrix)[source]

Extract diagonal from a potentially sparse matrix.

Note

This is a workaround for as long as torch.diagonal() does not support sparse tensors.

Parameters

matrix (Tensor) – shape: (n, n) the matrix

Return type

Tensor

Returns

shape: (n,) the diagonal values.
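For a sparse matrix stored in coordinate (COO) form, the diagonal can be recovered by keeping only the entries whose row and column indices coincide. The sketch below shows the idea on plain Python lists. `diagonal_from_coo` is a hypothetical name, and summing duplicate entries is an assumption that mirrors how uncoalesced COO data is usually interpreted.

```python
from typing import List, Sequence


def diagonal_from_coo(
    n: int,
    rows: Sequence[int],
    cols: Sequence[int],
    values: Sequence[float],
) -> List[float]:
    """Hypothetical helper: diagonal of an (n, n) matrix given as COO triples."""
    diagonal = [0.0] * n  # positions without a stored entry are implicitly zero
    for r, c, v in zip(rows, cols, values):
        if r == c:
            # Assumption: duplicate coordinates add up, as in uncoalesced COO.
            diagonal[r] += v
    return diagonal


# A 3x3 matrix with stored entries at (0,0), (0,1), (1,1), (2,0), (2,2).
result = diagonal_from_coo(3, [0, 0, 1, 2, 2], [0, 1, 1, 0, 2], [1.0, 5.0, 2.0, 7.0, 3.0])
# result == [1.0, 2.0, 3.0]
```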

use_horizontal_stacking(input_dim, output_dim)[source]

Determine a stacking direction based on the input and output dimension.

The vertical stacking approach is suitable for low-dimensional input and high-dimensional output, because the projection to low dimensions is done first. The horizontal stacking approach, in contrast, is suitable for high-dimensional input and low-dimensional output, because the projection to high dimensions is done last.

Parameters
  • input_dim (int) – the layer’s input dimension

  • output_dim (int) – the layer’s output dimension

Return type

bool

Returns

whether to use horizontal (True) or vertical stacking
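One simple decision rule consistent with this description compares the two dimensions directly. This is a sketch of a plausible heuristic, not a statement of the library's actual logic. The function name is hypothetical, and how ties are resolved (here: vertical stacking) is an assumption.

```python
def use_horizontal_stacking_sketch(input_dim: int, output_dim: int) -> bool:
    """Hypothetical heuristic: prefer horizontal stacking for wide-in, narrow-out layers."""
    # High-dimensional input with low-dimensional output favours horizontal stacking.
    # Assumption: equal dimensions fall back to vertical stacking.
    return input_dim > output_dim


wide_to_narrow = use_horizontal_stacking_sketch(input_dim=512, output_dim=64)  # True
narrow_to_wide = use_horizontal_stacking_sketch(input_dim=16, output_dim=256)  # False
```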