NodePiece

pykeen.nn.node_piece Package

NodePiece modules.

A NodePieceRepresentation contains a collection of TokenizationRepresentation instances. A TokenizationRepresentation consists of two parts: a Representation module mapping token indices to representations, called the vocabulary by analogy with token representations in NLP applications, and an assignment from each entity to (multiple) tokens.
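To make the two parts concrete, here is a minimal pure-Python sketch of a vocabulary and an assignment. It does not use the PyKEEN API: the toy data, the `represent` helper, and the choice of mean aggregation are all assumptions made for illustration.

```python
from statistics import fmean

# Hypothetical toy data, not taken from PyKEEN.
# Vocabulary: token index -> representation vector.
vocabulary = [
    [1.0, 0.0],  # token 0
    [0.0, 1.0],  # token 1
    [1.0, 1.0],  # token 2
]

# Assignment: entity index -> indices of the tokens describing it.
assignment = [
    [0, 1],  # entity 0
    [1, 2],  # entity 1
    [0, 2],  # entity 2
]


def represent(entity: int) -> list[float]:
    """Look up an entity's tokens and average their vectors (assumed aggregation)."""
    token_vectors = [vocabulary[token] for token in assignment[entity]]
    return [fmean(column) for column in zip(*token_vectors)]


entity_vectors = [represent(entity) for entity in range(len(assignment))]
```

Only the vocabulary holds vectors here; entities are described purely through their token indices, so the number of stored vectors depends on the vocabulary size rather than on the number of entities.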

There are multiple ways to obtain the vocabulary and the assignment. Most follow a two-step approach: first select a vocabulary, then assign each entity to a set of tokens, usually based on the graph structure of the knowledge graph (KG).

One way of tokenizing is via AnchorTokenizer, which selects some anchor entities from the graph as the vocabulary. The anchor selection process is controlled by an AnchorSelection instance. To obtain the assignment, some measure of graph distance is used: an AnchorSearcher instance determines, for each entity in the graph, the closest anchor entities from the vocabulary.
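The two steps of anchor-based tokenization can be sketched in pure Python as follows. This is an illustration under stated assumptions, not PyKEEN's implementation: the toy graph and the function names `select_anchors_by_degree` and `closest_anchors` are hypothetical, degree is used as the selection criterion, and hop count found by breadth-first search stands in for the graph distance.

```python
from collections import deque

# Hypothetical toy graph given as undirected edges between entity ids.
edges = [(0, 1), (0, 2), (0, 3), (3, 4), (4, 5), (4, 6)]
num_entities = 7

# Build an undirected adjacency list.
neighbors = {entity: set() for entity in range(num_entities)}
for head, tail in edges:
    neighbors[head].add(tail)
    neighbors[tail].add(head)


def select_anchors_by_degree(num_anchors: int) -> list[int]:
    """Step 1: pick the entities with the highest undirected degree as vocabulary."""
    # Ties are broken by the smaller entity id to keep the result deterministic.
    ranked = sorted(neighbors, key=lambda e: (-len(neighbors[e]), e))
    return ranked[:num_anchors]


def closest_anchors(entity: int, anchors: list[int], k: int) -> list[int]:
    """Step 2: breadth-first search from the entity, collecting the first k anchors."""
    anchor_set = set(anchors)
    seen = {entity}
    queue = deque([entity])
    found = []
    while queue and len(found) < k:
        current = queue.popleft()
        if current in anchor_set:
            found.append(current)
        for neighbor in sorted(neighbors[current]):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return found


anchors = select_anchors_by_degree(num_anchors=2)
tokens = {
    entity: closest_anchors(entity, anchors, k=2) for entity in range(num_entities)
}
```

Because breadth-first search visits nodes in order of non-decreasing hop distance, each entity's anchors come out sorted from nearest to farthest. An entity that is itself an anchor receives itself as its first token in this sketch.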

Since some tokenizations are expensive to compute, we offer a mechanism to use precomputed tokenizations via PrecomputedPoolTokenizer. To support loading from different formats, select a loader that subclasses PrecomputedTokenizerLoader. To precompute anchor-based tokenizations, you can use the command:

pykeen tokenize

Pass the --help flag to see its usage.

Classes

AnchorSearcher()

A method for finding the closest anchors.

ScipySparseAnchorSearcher([max_iter])

Find closest anchors using scipy.sparse.

SparseBFSSearcher([max_iter, device])

Find closest anchors using torch_sparse on a GPU.

CSGraphAnchorSearcher()

Find closest anchors using scipy.sparse.csgraph.

PersonalizedPageRankAnchorSearcher([...])

Select closest anchors as the nodes with the largest personalized page rank.

AnchorSelection([num_anchors])

Anchor entity selection strategy.

SingleSelection([num_anchors])

Single-step selection.

DegreeAnchorSelection([num_anchors])

Select entities according to their (undirected) degree.

MixtureAnchorSelection(selections[, ratios, ...])

A weighted mixture of different anchor selection strategies.

PageRankAnchorSelection([num_anchors])

Select entities according to their page rank.

RandomAnchorSelection([num_anchors, random_seed])

Random node selection.

Tokenizer()

A base class for tokenizers for NodePiece representations.

RelationTokenizer()

Tokenize entities by representing them as a bag of relations.

AnchorTokenizer([selection, ...])

Tokenize entities by representing them as a bag of anchor entities.

MetisAnchorTokenizer([num_partitions, device])

An anchor tokenizer, which first partitions the graph using METIS.

PrecomputedPoolTokenizer(*[, path, url, ...])

A tokenizer using externally precomputed tokenization.

PrecomputedTokenizerLoader()

A loader for precomputed tokenization.

GalkinPrecomputedTokenizerLoader()

A loader for pickle files provided by Galkin et al.

TorchPrecomputedTokenizerLoader()

A loader via torch.load.

TokenizationRepresentation(assignment[, ...])

A module holding the result of tokenization.

NodePieceRepresentation(*, triples_factory)

Basic implementation of node piece decomposition [galkin2021].

HashDiversityInfo(...)

A ratio information object.

Class Inheritance Diagram

Inheritance relations shown in the diagram (bases -> class -> subclasses):

ABC, ExtraReprMixin -> AnchorSearcher -> CSGraphAnchorSearcher, PersonalizedPageRankAnchorSearcher, ScipySparseAnchorSearcher, SparseBFSSearcher

ABC, ExtraReprMixin -> AnchorSelection -> MixtureAnchorSelection, SingleSelection

ABC, AnchorSelection -> SingleSelection -> DegreeAnchorSelection, PageRankAnchorSelection, RandomAnchorSelection

Tokenizer -> AnchorTokenizer, PrecomputedPoolTokenizer, RelationTokenizer

AnchorTokenizer -> MetisAnchorTokenizer

ABC -> PrecomputedTokenizerLoader -> GalkinPrecomputedTokenizerLoader, TorchPrecomputedTokenizerLoader

ABC, ExtraReprMixin, Module -> Representation -> CombinedRepresentation, TokenizationRepresentation

CombinedRepresentation -> NodePieceRepresentation

HashDiversityInfo (no base classes shown)