BernoulliNegativeSampler
class BernoulliNegativeSampler(triples_factory, num_negs_per_pos=None, filtered=False)
Bases: pykeen.sampling.negative_sampler.NegativeSampler
An implementation of the Bernoulli negative sampling approach proposed by [wang2014].
The probability of corrupting the head \(h\) or tail \(t\) in a triple \((h,r,t) \in \mathcal{K}\) is determined by global properties of the relation \(r\):
- \(r\) is one-to-many (e.g. motherOf): a higher probability is assigned to replace \(h\).
- \(r\) is many-to-one (e.g. bornIn): a higher probability is assigned to replace \(t\).
More precisely, for each relation \(r \in \mathcal{R}\), the average number of tails per head (tph) and heads per tail (hpt) are first computed. Then, the head corruption probability \(p_r\) is defined as \(p_r = \frac{tph}{tph + hpt}\). The tail corruption probability is defined as \(1 - p_r = \frac{hpt}{tph + hpt}\).
For each triple \((h,r,t) \in \mathcal{K}\), the head is corrupted with probability \(p_r\) and the tail is corrupted with probability \(1 - p_r\).
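The computation above can be illustrated with a small, library-independent sketch. It is not PyKEEN's implementation: the helper names `head_corruption_probabilities` and `corrupt`, the toy triples, and the choice to draw replacement entities uniformly are assumptions made for illustration only.

```python
import random
from collections import defaultdict


def head_corruption_probabilities(triples):
    """Return {relation: p_r} with p_r = tph / (tph + hpt)."""
    tails_by_head = defaultdict(set)  # (relation, head) -> distinct tails
    heads_by_tail = defaultdict(set)  # (relation, tail) -> distinct heads
    for h, r, t in triples:
        tails_by_head[(r, h)].add(t)
        heads_by_tail[(r, t)].add(h)

    # Collect, per relation, the number of tails of each head and vice versa.
    tail_counts = defaultdict(list)
    for (r, _h), tails in tails_by_head.items():
        tail_counts[r].append(len(tails))
    head_counts = defaultdict(list)
    for (r, _t), heads in heads_by_tail.items():
        head_counts[r].append(len(heads))

    probs = {}
    for r in tail_counts:
        tph = sum(tail_counts[r]) / len(tail_counts[r])  # average tails per head
        hpt = sum(head_counts[r]) / len(head_counts[r])  # average heads per tail
        probs[r] = tph / (tph + hpt)
    return probs


def corrupt(triple, entities, probs, rng):
    """Corrupt the head with probability p_r, otherwise corrupt the tail."""
    h, r, t = triple
    if rng.random() < probs[r]:
        return (rng.choice(entities), r, t)
    return (h, r, rng.choice(entities))


# Hypothetical toy data: motherOf is one-to-many, bornIn is many-to-one.
triples = [
    ("alice", "motherOf", "bob"),
    ("alice", "motherOf", "carol"),
    ("alice", "motherOf", "dave"),
    ("bob", "bornIn", "paris"),
    ("carol", "bornIn", "paris"),
    ("dave", "bornIn", "paris"),
]
entities = sorted({e for h, _r, t in triples for e in (h, t)})
probs = head_corruption_probabilities(triples)
print(probs["motherOf"])  # 0.75: tph = 3, hpt = 1
print(probs["bornIn"])    # 0.25: tph = 1, hpt = 3

negative = corrupt(triples[0], entities, probs, random.Random(0))
```

For the one-to-many relation the head is replaced three times out of four on average, which matches the qualitative rule stated above. Note that a uniformly drawn replacement may coincide with the original entity or produce another known positive triple.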
If filtered is set to True, all proposed corrupted triples that also exist as actual positive triples \((h,r,t) \in \mathcal{K}\) will be removed.
Initialize the negative sampler with the given entities.
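The effect of filtering can be sketched as a set-membership check. This page does not describe how PyKEEN performs the filtering internally, so the function `filter_negatives` and the data below are hypothetical and only show the intended outcome.

```python
def filter_negatives(candidates, positives):
    """Drop every proposed negative that is actually a known positive triple."""
    known = set(positives)
    return [triple for triple in candidates if triple not in known]


# Hypothetical positives and proposed corruptions.
positives = [
    ("alice", "motherOf", "bob"),
    ("alice", "motherOf", "carol"),
]
candidates = [
    ("alice", "motherOf", "carol"),  # false negative: exists in the positives
    ("alice", "motherOf", "paris"),  # true negative: kept
]
kept = filter_negatives(candidates, positives)
print(kept)  # [('alice', 'motherOf', 'paris')]
```

Because filtered proposals are removed, a filtered batch can contain fewer negatives than were proposed unless replacements are drawn.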
- Parameters
  - triples_factory (TriplesFactory) – The factory holding the triples to sample from.
  - num_negs_per_pos (Optional[int]) – Number of negative samples to make per positive triple. Defaults to 1.
  - filtered (bool) – Whether proposed corrupted triples that are in the training data should be filtered. Defaults to False. See the explanation in filter_negative_triples() for why this is a reasonable default.
Attributes Summary
hpo_default: The default strategy for optimizing the negative sampler’s hyper-parameters.
Methods Summary
sample(positive_batch): Sample a negative batch based on the Bernoulli approach.
Attributes Documentation
hpo_default: ClassVar[Mapping[str, Mapping[str, Any]]] = {'num_negs_per_pos': {'high': 100, 'low': 1, 'q': 10, 'type': <class 'int'>}}
The default strategy for optimizing the negative sampler’s hyper-parameters.
Methods Documentation