NodePiece

pykeen.nn.node_piece Package

NodePiece modules.

A NodePieceRepresentation contains a collection of TokenizationRepresentation instances. A TokenizationRepresentation consists of two parts: a Representation module that maps token indices to representations (called the vocabulary, by analogy with token representations in NLP applications), and an assignment from each entity to one or more tokens.
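The two parts of a tokenization can be illustrated with a minimal, library-free sketch. It does not use the PyKEEN API: the names `vocabulary`, `assignment`, and `entity_representation` are hypothetical, and averaging the token vectors is only one simple aggregation assumed here for illustration.

```python
# Hypothetical toy data, not PyKEEN objects.
# Vocabulary: token index -> representation (here, 2-dimensional vectors).
vocabulary = [
    [1.0, 0.0],  # token 0
    [0.0, 1.0],  # token 1
    [1.0, 1.0],  # token 2
]

# Assignment: entity index -> the (multiple) tokens that describe it.
assignment = {
    0: [0, 1],
    1: [1, 2],
    2: [0, 2],
}


def entity_representation(entity, assignment, vocabulary):
    """Combine the token vectors of one entity by averaging them.

    Averaging is an assumption made for this sketch; other aggregations
    are possible.
    """
    vectors = [vocabulary[token] for token in assignment[entity]]
    dim = len(vectors[0])
    return [sum(vector[i] for vector in vectors) / len(vectors) for i in range(dim)]


representations = {
    entity: entity_representation(entity, assignment, vocabulary)
    for entity in assignment
}
```

Note that the number of stored vectors depends on the vocabulary size rather than on the number of entities, since every entity is expressed through its tokens.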

There are several ways to obtain the vocabulary and the assignment. Most follow a two-step approach: first select a vocabulary, then assign each entity to a set of tokens, usually based on the graph structure of the knowledge graph (KG).

One tokenization approach is the AnchorTokenizer, which selects some anchor entities from the graph as the vocabulary. The anchor selection process is controlled by an AnchorSelection instance. The assignment is obtained from a measure of graph distance: an AnchorSearcher instance determines, for each entity in the graph, the closest anchor entities from the vocabulary.
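The two steps of anchor-based tokenization can be sketched in plain Python. This is not PyKEEN's implementation: the function names are hypothetical, anchors are chosen by undirected degree (in the spirit of a degree-based selection), and "closest" is assumed to mean the fewest hops in the undirected graph, found by breadth-first search.

```python
from collections import deque


def build_adjacency(triples):
    """Build an undirected adjacency map from (head, relation, tail) triples."""
    adjacency = {}
    for head, _relation, tail in triples:
        adjacency.setdefault(head, set()).add(tail)
        adjacency.setdefault(tail, set()).add(head)
    return adjacency


def select_anchors_by_degree(adjacency, num_anchors):
    """Step 1: pick the entities with the largest degree (ties broken by name)."""
    ranked = sorted(adjacency, key=lambda node: (-len(adjacency[node]), node))
    return ranked[:num_anchors]


def closest_anchors(adjacency, anchors, k):
    """Step 2: for each entity, find up to k nearest anchors by hop count."""
    anchor_set = set(anchors)
    result = {}
    for source in adjacency:
        found = []
        seen = {source}
        queue = deque([source])
        # Breadth-first search visits nodes in order of increasing distance.
        while queue and len(found) < k:
            node = queue.popleft()
            if node in anchor_set:
                found.append(node)
            for neighbor in sorted(adjacency[node]):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        result[source] = found
    return result


# Hypothetical toy graph.
triples = [
    ("a", "r", "b"),
    ("b", "r", "c"),
    ("c", "r", "d"),
    ("c", "r", "e"),
    ("b", "r", "f"),
]
adjacency = build_adjacency(triples)
anchors = select_anchors_by_degree(adjacency, num_anchors=2)
tokens = closest_anchors(adjacency, anchors, k=2)
```

In this toy graph, `b` and `c` have the highest degree and become the anchors; every other entity is then described by these anchors, ordered by distance.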

Since some tokenizations are expensive to compute, we offer a mechanism to use precomputed tokenizations via PrecomputedPoolTokenizer. To support loading from different file formats, select a loader that subclasses PrecomputedTokenizerLoader. To precompute anchor-based tokenizations, you can use the command

pykeen tokenize

Pass the --help flag to see its usage.
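The idea of precomputing once and loading later can be sketched as follows. The JSON layout and the helper names `save_tokenization` and `load_tokenization` are made up for this illustration; they are not the file format written by `pykeen tokenize` or read by any of the loaders listed below.

```python
import json
import tempfile
from pathlib import Path


def save_tokenization(path, assignment):
    """Store an entity -> tokens mapping (hypothetical JSON layout)."""
    path.write_text(json.dumps(assignment, sort_keys=True))


def load_tokenization(path):
    """Read the mapping back instead of recomputing it."""
    return json.loads(path.read_text())


# Pretend this assignment was expensive to compute.
assignment = {"a": ["b", "c"], "d": ["c", "b"]}

with tempfile.TemporaryDirectory() as directory:
    path = Path(directory) / "tokenization.json"
    save_tokenization(path, assignment)
    restored = load_tokenization(path)
```

Separating the loader from the tokenizer, as the package does with PrecomputedTokenizerLoader subclasses, means a new file format only requires a new loader.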

Classes

AnchorSearcher()

A method for finding the closest anchors.

ScipySparseAnchorSearcher([max_iter])

Find closest anchors using scipy.sparse.

CSGraphAnchorSearcher()

Find closest anchors using scipy.sparse.csgraph.

PersonalizedPageRankAnchorSearcher([...])

Select closest anchors as the nodes with the largest personalized page rank.

AnchorSelection([num_anchors])

Anchor entity selection strategy.

SingleSelection([num_anchors])

Single-step selection.

DegreeAnchorSelection([num_anchors])

Select entities according to their (undirected) degree.

MixtureAnchorSelection(selections[, ratios, ...])

A weighted mixture of different anchor selection strategies.

PageRankAnchorSelection([num_anchors])

Select entities according to their page rank.

RandomAnchorSelection([num_anchors, random_seed])

Random node selection.

Tokenizer()

A base class for tokenizers for NodePiece representations.

RelationTokenizer()

Tokenize entities by representing them as a bag of relations.

AnchorTokenizer([selection, ...])

Tokenize entities by representing them as a bag of anchor entities.

PrecomputedPoolTokenizer(*[, path, url, ...])

A tokenizer using externally precomputed tokenization.

PrecomputedTokenizerLoader()

A loader for precomputed tokenization.

GalkinPrecomputedTokenizerLoader()

A loader for pickle files provided by Galkin et al.

TorchPrecomputedTokenizerLoader()

A loader via torch.load.

TokenizationRepresentation(assignment[, ...])

A module holding the result of tokenization.

NodePieceRepresentation(*, triples_factory)

Basic implementation of node piece decomposition [galkin2021].
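The "bag of relations" idea behind RelationTokenizer in the list above can be sketched without the library. This is an illustration under assumptions, not the PyKEEN implementation: the function name `relation_tokens` is hypothetical, and marking incoming relations with an `_inverse` suffix is a convention chosen for this sketch.

```python
from collections import defaultdict


def relation_tokens(triples):
    """Describe each entity by the set of relation types it takes part in."""
    bags = defaultdict(set)
    for head, relation, tail in triples:
        bags[head].add(relation)
        # Assumption for this sketch: incoming edges count as inverse relations.
        bags[tail].add(f"{relation}_inverse")
    return {entity: sorted(relations) for entity, relations in bags.items()}


# Hypothetical toy triples.
triples = [
    ("berlin", "capital_of", "germany"),
    ("germany", "member_of", "eu"),
    ("berlin", "located_in", "germany"),
]
bags = relation_tokens(triples)
```

Unlike anchor-based tokenization, this needs no selection or search step, because the vocabulary is simply the set of relation types.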

Class Inheritance Diagram

Inheritance diagram of pykeen.nn.node_piece.anchor_search.AnchorSearcher, pykeen.nn.node_piece.anchor_search.ScipySparseAnchorSearcher, pykeen.nn.node_piece.anchor_search.CSGraphAnchorSearcher, pykeen.nn.node_piece.anchor_search.PersonalizedPageRankAnchorSearcher, pykeen.nn.node_piece.anchor_selection.AnchorSelection, pykeen.nn.node_piece.anchor_selection.SingleSelection, pykeen.nn.node_piece.anchor_selection.DegreeAnchorSelection, pykeen.nn.node_piece.anchor_selection.MixtureAnchorSelection, pykeen.nn.node_piece.anchor_selection.PageRankAnchorSelection, pykeen.nn.node_piece.anchor_selection.RandomAnchorSelection, pykeen.nn.node_piece.tokenization.Tokenizer, pykeen.nn.node_piece.tokenization.RelationTokenizer, pykeen.nn.node_piece.tokenization.AnchorTokenizer, pykeen.nn.node_piece.tokenization.PrecomputedPoolTokenizer, pykeen.nn.node_piece.loader.PrecomputedTokenizerLoader, pykeen.nn.node_piece.loader.GalkinPrecomputedTokenizerLoader, pykeen.nn.node_piece.loader.TorchPrecomputedTokenizerLoader, pykeen.nn.node_piece.representations.TokenizationRepresentation, pykeen.nn.node_piece.representations.NodePieceRepresentation