BernoulliNegativeSampler

class BernoulliNegativeSampler(triples_factory, num_negs_per_pos=None, filtered=False)

Bases: pykeen.sampling.negative_sampler.NegativeSampler

An implementation of the Bernoulli negative sampling approach proposed by [wang2014].

The probability of corrupting the head \(h\) or tail \(t\) in a triple \((h,r,t) \in \mathcal{K}\) is determined by global properties of the relation \(r\):

If \(r\) is one-to-many (e.g. motherOf), a higher probability is assigned to replacing \(h\).
If \(r\) is many-to-one (e.g. bornIn), a higher probability is assigned to replacing \(t\).

More precisely, for each relation \(r \in \mathcal{R}\), the average number of tails per head (tph) and heads per tail (hpt) are first computed.

Then, the head corruption probability \(p_r\) is defined as \(p_r = \frac{tph}{tph + hpt}\). The tail corruption probability is defined as \(1 - p_r = \frac{hpt}{tph + hpt}\).
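As an illustration of the formula above, the following sketch computes tph, hpt, and \(p_r\) for a small hand-made set of triples using only the standard library. The function name head_corruption_probabilities and the toy triples are hypothetical and are not part of PyKEEN; this is not the library's implementation, only a minimal reading of the definition.

```python
from collections import defaultdict


def head_corruption_probabilities(triples):
    """Return {relation: p_r} with p_r = tph / (tph + hpt) (illustrative helper)."""
    tails_by_head = defaultdict(set)  # (r, h) -> distinct tails seen with that head
    heads_by_tail = defaultdict(set)  # (r, t) -> distinct heads seen with that tail
    for h, r, t in triples:
        tails_by_head[r, h].add(t)
        heads_by_tail[r, t].add(h)

    def average_set_size(groups):
        # r -> [sum of set sizes, number of sets]
        totals = defaultdict(lambda: [0, 0])
        for (r, _), members in groups.items():
            totals[r][0] += len(members)
            totals[r][1] += 1
        return {r: size_sum / count for r, (size_sum, count) in totals.items()}

    tph = average_set_size(tails_by_head)  # average tails per head
    hpt = average_set_size(heads_by_tail)  # average heads per tail
    return {r: tph[r] / (tph[r] + hpt[r]) for r in tph}


# Toy data (made up for this example).
triples = [
    ("anna", "motherOf", "ben"),
    ("anna", "motherOf", "cara"),
    ("anna", "motherOf", "dan"),
    ("ben", "bornIn", "oslo"),
    ("cara", "bornIn", "oslo"),
    ("dan", "bornIn", "oslo"),
]
p = head_corruption_probabilities(triples)
```

In this toy data motherOf has tph = 3 and hpt = 1, so its head is corrupted with probability 0.75, while bornIn has tph = 1 and hpt = 3, giving 0.25.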

For each triple \((h,r,t) \in \mathcal{K}\), the head is corrupted with probability \(p_r\) and the tail is corrupted with probability \(1 - p_r\).
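The per-triple decision can be sketched as follows. The helper corrupt, the p_head mapping, and the choice to draw the replacement entity uniformly at random are assumptions made for this example; the text above does not specify how the replacement is drawn, and the sketch does not reflect PyKEEN's tensor-based implementation.

```python
import random


def corrupt(triple, p_head, entities, rng):
    """Corrupt the head with probability p_head[r], otherwise the tail (illustrative)."""
    h, r, t = triple
    # Assumption: the replacement is drawn uniformly from all entities,
    # so it may occasionally equal the entity it replaces.
    replacement = rng.choice(entities)
    if rng.random() < p_head[r]:
        return (replacement, r, t)  # head corrupted
    return (h, r, replacement)  # tail corrupted


rng = random.Random(0)
entities = ["anna", "ben", "cara", "dan", "oslo"]
positive = ("anna", "motherOf", "ben")

# With p_r = 1.0 the head is always the corrupted side; with 0.0, the tail.
always_head = [corrupt(positive, {"motherOf": 1.0}, entities, rng) for _ in range(20)]
always_tail = [corrupt(positive, {"motherOf": 0.0}, entities, rng) for _ in range(20)]
```

Passing an explicit random.Random instance keeps the sketch reproducible without touching the global random state.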

If filtered is set to True, all proposed corrupted triples that also exist as actual positive triples \((h,r,t) \in \mathcal{K}\) will be removed.
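Conceptually, filtering amounts to a set-membership check against the known positive triples. The sketch below uses a hypothetical helper filter_negatives on plain tuples; it is meant only to illustrate the idea and is not how PyKEEN implements filtering on tensors.

```python
def filter_negatives(candidates, positives):
    """Drop proposed negatives that are actually known positive triples (illustrative)."""
    known = set(positives)
    return [triple for triple in candidates if triple not in known]


# Toy data (made up for this example).
positives = [
    ("anna", "motherOf", "ben"),
    ("anna", "motherOf", "cara"),
]
candidates = [
    ("anna", "motherOf", "cara"),  # a false negative: it is a known positive
    ("dan", "motherOf", "ben"),
]
kept = filter_negatives(candidates, positives)
```

Note that after filtering, fewer negatives than requested may remain for a given positive triple.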

Initialize the negative sampler with the given entities.

Parameters

triples_factory (TriplesFactory) – The factory holding the triples to sample from
num_negs_per_pos (Optional[int]) – Number of negative samples to make per positive triple. Defaults to 1.
filtered (bool) – Whether proposed corrupted triples that are in the training data should be filtered. Defaults to False. See explanation in filter_negative_triples() for why this is a reasonable default.

Attributes Summary

hpo_default

The default strategy for optimizing the negative sampler’s hyper-parameters

Methods Summary

sample(positive_batch)

Sample a negative batch based on the Bernoulli approach.

Attributes Documentation

hpo_default: ClassVar[Mapping[str, Mapping[str, Any]]] = {'num_negs_per_pos': {'high': 100, 'low': 1, 'q': 10, 'type': <class 'int'>}}

The default strategy for optimizing the negative sampler’s hyper-parameters

Methods Documentation

sample(positive_batch)

Sample a negative batch based on the Bernoulli approach.

Return type: Tuple[LongTensor, Optional[Tensor]]