Negative Sampling¶
For entities \(\mathcal{E}\) and relations \(\mathcal{R}\), the set of all possible triples \(\mathcal{T}\) is constructed through their cartesian product \(\mathcal{T} = \mathcal{E} \times \mathcal{R} \times \mathcal{E}\). A given knowledge graph \(\mathcal{K}\) is a subset of all possible triples \(\mathcal{K} \subseteq \mathcal{T}\).
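As a minimal sketch of these definitions, the full triple set \(\mathcal{T}\) can be enumerated with a cartesian product (the entity and relation names below are hypothetical, chosen only for illustration):

```python
from itertools import product

# Toy entity and relation sets (hypothetical names, for illustration only)
entities = {"alice", "bob", "carol"}
relations = {"knows", "likes"}

# The set of all possible triples T = E x R x E
all_triples = set(product(entities, relations, entities))
assert len(all_triples) == len(entities) ** 2 * len(relations)  # 3 * 3 * 2 = 18

# A knowledge graph K is a subset of T
kg = {("alice", "knows", "bob"), ("bob", "likes", "carol")}
assert kg <= all_triples
```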
Construction of Knowledge Graphs¶
When constructing a knowledge graph \(\mathcal{K}_{\text{closed}}\) under the closed world assumption, the labels of the remaining triples \((h,r,t) \in \mathcal{T} \setminus \mathcal{K}_{\text{closed}}\) are defined as negative. When constructing a knowledge graph \(\mathcal{K}_{\text{open}}\) under the open world assumption, the labels of the remaining triples \((h,r,t) \in \mathcal{T} \setminus \mathcal{K}_{\text{open}}\) are unknown.
Because most knowledge graphs are generated under the open world assumption, negative sampling techniques must be employed during the training of knowledge graph embedding models to avoid overgeneralization.
Corruption¶
Negative sampling techniques often generate negative triples by corrupting a known positive triple \((h,r,t) \in \mathcal{K}\), replacing either \(h\), \(r\), or \(t\) with one of the following operations:
Corrupt heads 
\(\mathcal{H}(h, r, t) = \{(h', r, t) \mid h' \in \mathcal{E} \land h' \neq h\}\) 
Corrupt relations 
\(\mathcal{R}(h, r, t) = \{(h, r', t) \mid r' \in \mathcal{R} \land r' \neq r\}\) 
Corrupt tails 
\(\mathcal{T}(h, r, t) = \{(h, r, t') \mid t' \in \mathcal{E} \land t' \neq t\}\) 
Typically, the corrupt relations operation \(\mathcal{R}(h, r, t)\) is omitted because the evaluation of knowledge graph embedding models on the link prediction task only considers the quality of head prediction and tail prediction, but not relation prediction. Therefore, the set of candidate negative triples \(\mathcal{N}(h, r, t)\) for a given known positive triple \((h,r,t) \in \mathcal{K}\) is given by:
\(\mathcal{N}(h, r, t) = \mathcal{H}(h, r, t) \cup \mathcal{T}(h, r, t)\)
Generally, the set of potential negative triples \(\mathcal{N}\) over all positive triples \((h,r,t) \in \mathcal{K}\) is defined as:
\(\mathcal{N} = \bigcup_{(h,r,t) \in \mathcal{K}} \mathcal{N}(h, r, t)\)
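The head and tail corruption operations can be sketched in plain Python as follows (these helper functions are illustrative only and are not part of PyKEEN's API):

```python
# Sketch of the head and tail corruption operations (illustrative helper
# functions, not part of PyKEEN's API)
def corrupt_heads(h, r, t, entities):
    """H(h, r, t): replace the head with every other entity."""
    return {(h2, r, t) for h2 in entities if h2 != h}

def corrupt_tails(h, r, t, entities):
    """T(h, r, t): replace the tail with every other entity."""
    return {(h, r, t2) for t2 in entities if t2 != t}

entities = {"a", "b", "c", "d"}
candidates = corrupt_heads("a", "r1", "b", entities) | corrupt_tails("a", "r1", "b", entities)
# each operation yields |E| - 1 candidates, and the two sets are disjoint here
assert len(candidates) == 2 * (len(entities) - 1)
```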
Uniform Negative Sampling¶
The default negative sampler pykeen.sampling.BasicNegativeSampler
generates corrupted triples from a known positive triple \((h,r,t) \in \mathcal{K}\) by applying, uniformly at random, either the corrupt heads operation
or the corrupt tails operation. The default negative sampler is automatically used in the following code:
from pykeen.pipeline import pipeline

results = pipeline(
    dataset='YAGO310',
    model='PairRE',
    training_loop='sLCWA',
)
It can be set explicitly with:
from pykeen.pipeline import pipeline

results = pipeline(
    dataset='YAGO310',
    model='PairRE',
    training_loop='sLCWA',
    negative_sampler='basic',
)
In general, the behavior of the negative sampler can be modified when using pykeen.pipeline.pipeline()
by passing the negative_sampler_kwargs
argument. In order to explicitly specify which of the head, relation, and
tail corruption methods are used, the corruption_scheme
argument can be used. For example, to use all three,
the collection ('h', 'r', 't')
can be passed as in the following:
from pykeen.pipeline import pipeline

results = pipeline(
    dataset='YAGO310',
    model='PairRE',
    training_loop='sLCWA',
    negative_sampler='basic',
    negative_sampler_kwargs=dict(
        corruption_scheme=('h', 'r', 't'),
    ),
)
Bernoulli Negative Sampling¶
The Bernoulli negative sampler pykeen.sampling.BernoulliNegativeSampler
generates corrupted triples from a known positive triple \((h,r,t) \in \mathcal{K}\) similarly to the uniform negative sampler, but it precomputes
a probability \(p_r\) for each relation \(r\) to weight whether head corruption is used with probability \(p_r\)
or tail corruption is used with probability \(1 - p_r\).
from pykeen.pipeline import pipeline

results = pipeline(
    dataset='YAGO310',
    model='PairRE',
    training_loop='sLCWA',
    negative_sampler='bernoulli',
)
Classes¶
pykeen.sampling.NegativeSampler
A negative sampler. 
pykeen.sampling.BasicNegativeSampler
A basic negative sampler. 
pykeen.sampling.BernoulliNegativeSampler
An implementation of the Bernoulli negative sampling approach proposed by [wang2014]. 
Class Inheritance Diagram¶
Filtering¶
Because the corruption operations (see Corruption) are applied independently of the other triples in \(\mathcal{K}\), the resulting candidate corrupt triples can overlap with known positive triples. Consider the following properties of a relation \(r\) and their implications:
One-to-many 
Example pair of triples: \((h,r,t_1), (h,r,t_2) \in \mathcal{K}\) 
Implication: \((h,r,t_2) \in \mathcal{T}(h,r,t_1)\) and \((h,r,t_1) \in \mathcal{T}(h,r,t_2)\) 
Multiple 
Example pair of triples: \((h,r_1,t), (h,r_2,t) \in \mathcal{K}\) 
Implication: \((h,r_2,t) \in \mathcal{R}(h,r_1,t)\) and \((h,r_1,t) \in \mathcal{R}(h,r_2,t)\) 
Many-to-one 
Example pair of triples: \((h_1,r,t), (h_2,r,t) \in \mathcal{K}\) 
Implication: \((h_2,r,t) \in \mathcal{H}(h_1,r,t)\) and \((h_1,r,t) \in \mathcal{H}(h_2,r,t)\) 
If no relations in \(\mathcal{K}\) satisfy any of the relevant properties for the corruption schema chosen in negative sampling, then there is guaranteed to be no overlap between \(\mathcal{N}\) and \(\mathcal{K}\), i.e., \(\mathcal{N} \cap \mathcal{K} = \emptyset\). However, this scenario is very unlikely for real-world knowledge graphs.
The known positive triples that appear in \(\mathcal{N}\) are known false negatives. Since we know that these are incorrect (negative) training examples, we might want to exclude them to reduce the training noise.
Warning
It should be taken into account that a corrupted triple that is not part of the knowledge graph can still represent a true fact. These “unknown” false negatives cannot be removed a priori in the filtered setting. The methodology relies on the number of unknown false negatives being low enough that learning can still take place.
However, in practice, \(|\mathcal{N}| \gg |\mathcal{K}|\), so the likelihood of generating a false negative is rather low. Therefore, the additional filter step is often omitted to lower computational cost. This general observation might not hold for all entities; e.g., for a hub entity which is connected to many other entities, there may be a considerable number of false negatives without filtering.
Identifying False Negatives During Training¶
By default, PyKEEN does not filter false negatives from \(\mathcal{N}\) during training. To enable filtering of
negative examples during training, the filtered
keyword can be given to negative_sampler_kwargs
like in:
from pykeen.pipeline import pipeline

results = pipeline(
    dataset='YAGO310',
    model='PairRE',
    training_loop='sLCWA',
    negative_sampler='basic',
    negative_sampler_kwargs=dict(
        filtered=True,
    ),
)
PyKEEN implements several algorithms for filtering with different properties that can be chosen using the
filterer
keyword argument in negative_sampler_kwargs
. By default, a fast and approximate algorithm is used,
pykeen.sampling.filtering.BloomFilterer
, which is based on
Bloom filters. The Bloom filterer also has a configurable desired error
rate, which can be lowered further at the cost of increased memory and computation.
from pykeen.pipeline import pipeline

results = pipeline(
    dataset='YAGO310',
    model='PairRE',
    training_loop='sLCWA',
    negative_sampler='basic',
    negative_sampler_kwargs=dict(
        filtered=True,
        filterer='bloom',
        filterer_kwargs=dict(
            error_rate=0.0001,
        ),
    ),
)
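For intuition, a Bloom filter answers set-membership queries with no false negatives but a tunable false-positive rate, which is why this filterer is fast yet approximate. The following toy sketch illustrates the idea (the class and its structure here are hypothetical, not PyKEEN's vectorized implementation):

```python
import hashlib

class TinyBloomFilter:
    """Minimal Bloom filter sketch (illustrative only; PyKEEN's
    BloomFilterer is a vectorized implementation with a tunable error rate)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.bits = [False] * num_bits
        self.num_hashes = num_hashes

    def _positions(self, item):
        # derive num_hashes bit positions from salted hashes of the item
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % len(self.bits)

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def __contains__(self, item):
        # no false negatives: an added item is always reported present;
        # unrelated items may rarely collide (false positives)
        return all(self.bits[pos] for pos in self._positions(item))

bf = TinyBloomFilter()
bf.add(("alice", "knows", "bob"))
assert ("alice", "knows", "bob") in bf  # always found once added
```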
If you want a guarantee that all known false negatives are filtered, you can use a slower implementation based
on Python’s built-in sets, pykeen.sampling.filtering.PythonSetFilterer
. It can be activated with:
from pykeen.pipeline import pipeline

results = pipeline(
    dataset='YAGO310',
    model='PairRE',
    training_loop='sLCWA',
    negative_sampler='basic',
    negative_sampler_kwargs=dict(
        filtered=True,
        filterer='pythonset',
    ),
)
Identifying False Negatives During Evaluation¶
In contrast to training, PyKEEN does filter false negatives from \(\mathcal{N}\) during evaluation by default.
To disable the “filtered setting” during evaluation, the filtered
keyword can be given to evaluator_kwargs
like in:
from pykeen.pipeline import pipeline

results = pipeline(
    dataset='YAGO310',
    model='PairRE',
    evaluator_kwargs=dict(
        filtered=False,
    ),
)
Filtering during evaluation is implemented differently than in negative sampling:
First, there is no choice between an exact or approximate algorithm via a
pykeen.sampling.filtering.Filterer
. Instead, evaluation filtering modifies the
scores in place rather than selecting only the non-filtered entries. The main reason is
that evaluation is always done with 1:n scoring, so we gain some efficiency
by keeping the tensor in its “dense” shape (batch_size, num_entities)
.
Second, filtering during evaluation has to be exact, since it is crucial for reproducing results from the filtered setting. For evaluation, it makes sense to use all available information to obtain evaluation results that are as solid as possible.
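The in-place masking of scores can be sketched in plain Python (a hypothetical stand-in for the actual tensor operation, using entity indices and a score list in place of a dense score tensor row):

```python
# Sketch of evaluation-time filtering: instead of dropping filtered entries,
# the scores of known positive tails (other than the one being ranked) are
# set to -inf in place, keeping the dense (batch_size, num_entities) shape.
# The function name and signature here are hypothetical, not PyKEEN's API.
def filter_scores_inplace(scores, known_tails, true_tail):
    for t in known_tails:
        if t != true_tail:
            scores[t] = float("-inf")
    return scores

scores = [0.2, 0.9, 0.5, 0.7]      # one row of tail scores for entities 0..3
known = {1, 3}                     # tails known to be true for this (h, r)
filter_scores_inplace(scores, known, true_tail=3)
assert scores[1] == float("-inf")  # competing true tail masked out
assert scores[3] == 0.7            # the tail being ranked keeps its score
```

Masking with \(-\infty\) ensures the filtered entities can never outrank the true tail, which is exactly the “filtered setting” of rank-based evaluation.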
Classes¶
pykeen.sampling.filtering.Filterer
An interface for filtering methods for negative triples. 
pykeen.sampling.filtering.BloomFilterer
A filterer for negative triples based on the Bloom filter. 
pykeen.sampling.filtering.PythonSetFilterer
A filterer using Python sets for filtering. 