KG2E¶

class KG2E(triples_factory, embedding_dim=50, loss=None, preferred_device=None, random_seed=None, dist_similarity=None, c_min=0.05, c_max=5.0, regularizer=None, entity_initializer=<function uniform_>, entity_constrainer=<function clamp_norm>, entity_constrainer_kwargs=None, relation_initializer=<function uniform_>, relation_constrainer=<function clamp_norm>, relation_constrainer_kwargs=None)[source]¶

Bases: pykeen.models.base.EntityRelationEmbeddingModel
An implementation of KG2E from [he2015].
KG2E aims to explicitly model the (un)certainty of entities and relations (e.g., as influenced by the number of triples observed for them). To this end, entities and relations are represented by probability distributions, in particular by multivariate Gaussian distributions \(\mathcal{N}_i(\mu_i,\Sigma_i)\), where the mean \(\mu_i \in \mathbb{R}^d\) denotes the position in the vector space and the diagonal variance \(\Sigma_i \in \mathbb{R}^{d \times d}\) models the uncertainty. Inspired by the pykeen.models.TransE model, relations are modeled as transformations from head to tail entities: \(\mathcal{H} - \mathcal{T} \approx \mathcal{R}\), where \(\mathcal{H} \sim \mathcal{N}_h(\mu_h,\Sigma_h)\), \(\mathcal{T} \sim \mathcal{N}_t(\mu_t,\Sigma_t)\), \(\mathcal{R} \sim \mathcal{P}_r = \mathcal{N}_r(\mu_r,\Sigma_r)\), and \(\mathcal{H} - \mathcal{T} \sim \mathcal{P}_e = \mathcal{N}_{h-t}(\mu_h - \mu_t,\Sigma_h + \Sigma_t)\) (since head and tail entities are considered independent with respect to the relations). The interaction model measures the similarity between \(\mathcal{P}_e\) and \(\mathcal{P}_r\) by means of the Kullback-Leibler divergence (KG2E.kullback_leibler_similarity()):

\[f(h,r,t) = \mathcal{D_{KL}}(\mathcal{P}_e, \mathcal{P}_r)\]

Besides the asymmetric KL divergence, the authors propose a symmetric variant based on the expected likelihood (KG2E.expected_likelihood()):

\[f(h,r,t) = \mathcal{D_{EL}}(\mathcal{P}_e, \mathcal{P}_r)\]

Initialize KG2E.
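Since head and tail are treated as independent, the difference distribution \(\mathcal{P}_e\) has the closed form given above. A minimal NumPy sketch of this step (illustrative only; the helper name is hypothetical, and diagonal covariances are stored as vectors, not in PyKEEN's internal representation):

```python
import numpy as np


def difference_gaussian(mu_h, sigma_h, mu_t, sigma_t):
    """Combine head and tail Gaussians into the (h - t) Gaussian.

    Since head and tail are treated as independent,
    H - T ~ N(mu_h - mu_t, Sigma_h + Sigma_t).
    Diagonal covariances are stored as vectors of shape (d,).
    """
    return mu_h - mu_t, sigma_h + sigma_t


# toy 4-dimensional embeddings
mu_h, sigma_h = np.array([1.0, 0.0, 2.0, -1.0]), np.full(4, 0.5)
mu_t, sigma_t = np.array([0.5, 0.0, 1.0, -1.0]), np.full(4, 0.25)

# parameters of P_e = N(mu_h - mu_t, Sigma_h + Sigma_t)
mu_e, sigma_e = difference_gaussian(mu_h, sigma_h, mu_t, sigma_t)
```

The resulting \((\mu_e, \Sigma_e)\) pair is what the similarity functions below compare against the relation's \((\mu_r, \Sigma_r)\).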
- Parameters
Attributes Summary

constrainer_default_kwargs
    The default settings for the entity constrainer

hpo_default
    The default strategy for optimizing the model’s hyper-parameters
Methods Summary

expected_likelihood(mu_e, mu_r, sigma_e, sigma_r)
    Compute the similarity based on expected likelihood.

kullback_leibler_similarity(mu_e, mu_r, …)
    Compute the similarity based on KL divergence.

post_parameter_update()
    Has to be called after each parameter update.

score_h(rt_batch)
    Forward pass using left side (head) prediction.

score_hrt(hrt_batch)
    Forward pass.

score_t(hr_batch)
    Forward pass using right side (tail) prediction.
Attributes Documentation
constrainer_default_kwargs = {'dim': -1, 'maxnorm': 1.0, 'p': 2}¶

    The default settings for the entity constrainer
hpo_default: ClassVar[Mapping[str, Any]] = {'c_max': {'high': 10.0, 'low': 1.0, 'type': <class 'float'>}, 'c_min': {'high': 0.1, 'low': 0.01, 'scale': 'log', 'type': <class 'float'>}, 'embedding_dim': {'high': 256, 'low': 16, 'q': 16, 'type': <class 'int'>}}¶

    The default strategy for optimizing the model’s hyper-parameters
Methods Documentation
static expected_likelihood(mu_e, mu_r, sigma_e, sigma_r, epsilon=1e-10)[source]¶

    Compute the similarity based on expected likelihood.
\[D((\mu_e, \Sigma_e), (\mu_r, \Sigma_r)) = \frac{1}{2} \left( (\mu_e - \mu_r)^T(\Sigma_e + \Sigma_r)^{-1}(\mu_e - \mu_r) + \log \det (\Sigma_e + \Sigma_r) + d \log (2 \pi) \right) = \frac{1}{2} \left( \mu^T\Sigma^{-1}\mu + \log \det \Sigma + d \log (2 \pi) \right)\]

where \(\mu = \mu_e - \mu_r\) and \(\Sigma = \Sigma_e + \Sigma_r\).

- Parameters
mu_e (FloatTensor) – torch.Tensor, shape: (s_1, …, s_k, d). The mean of the first Gaussian.

mu_r (FloatTensor) – torch.Tensor, shape: (s_1, …, s_k, d). The mean of the second Gaussian.

sigma_e (FloatTensor) – torch.Tensor, shape: (s_1, …, s_k, d). The diagonal covariance matrix of the first Gaussian.

sigma_r (FloatTensor) – torch.Tensor, shape: (s_1, …, s_k, d). The diagonal covariance matrix of the second Gaussian.

epsilon (float) – float (default=1e-10). Small constant used to avoid numerical issues when dividing.
- Return type
FloatTensor
- Returns
torch.Tensor, shape: (s_1, …, s_k) The similarity.
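Because the covariances are diagonal, the matrix inverse and determinant in the formula reduce to elementwise operations. The following NumPy sketch mirrors the formula under that assumption (an illustration, not PyKEEN's torch implementation):

```python
import numpy as np


def expected_likelihood(mu_e, mu_r, sigma_e, sigma_r, epsilon=1e-10):
    """Negative log expected likelihood of two Gaussians with diagonal covariance.

    mu_*: means, shape (..., d); sigma_*: diagonal covariances as vectors, shape (..., d).
    Implements
        0.5 * (mu^T Sigma^{-1} mu + log det Sigma + d * log(2*pi))
    with mu = mu_e - mu_r and Sigma = sigma_e + sigma_r.
    """
    mu = mu_e - mu_r
    sigma = sigma_e + sigma_r
    d = mu.shape[-1]
    return 0.5 * (
        np.sum(mu**2 / (sigma + epsilon), axis=-1)  # mu^T Sigma^{-1} mu
        + np.sum(np.log(sigma + epsilon), axis=-1)  # log det Sigma
        + d * np.log(2 * np.pi)                     # normalization term
    )
```

For a single 1-dimensional pair with \(\mu = 1\) and \(\Sigma = 2\), this evaluates to \(\frac{1}{2}(\frac{1}{2} + \log 2 + \log 2\pi)\), as expected from the formula.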
static kullback_leibler_similarity(mu_e, mu_r, sigma_e, sigma_r, epsilon=1e-10)[source]¶

    Compute the similarity based on KL divergence.
This is done between two Gaussian distributions given by mean mu_* and diagonal covariance matrix sigma_*.
\[D((\mu_e, \Sigma_e), (\mu_r, \Sigma_r)) = \frac{1}{2} \left( tr(\Sigma_r^{-1}\Sigma_e) + (\mu_r - \mu_e)^T\Sigma_r^{-1}(\mu_r - \mu_e) - \log \frac{\det(\Sigma_e)}{\det(\Sigma_r)} - d \right)\]

- Note: The sign of the function has been flipped as opposed to the description in the paper, as the Kullback-Leibler divergence is large if the distributions are dissimilar.
- Parameters
mu_e (FloatTensor) – torch.Tensor, shape: (s_1, …, s_k, d). The mean of the first Gaussian.

mu_r (FloatTensor) – torch.Tensor, shape: (s_1, …, s_k, d). The mean of the second Gaussian.

sigma_e (FloatTensor) – torch.Tensor, shape: (s_1, …, s_k, d). The diagonal covariance matrix of the first Gaussian.

sigma_r (FloatTensor) – torch.Tensor, shape: (s_1, …, s_k, d). The diagonal covariance matrix of the second Gaussian.

epsilon (float) – float (default=1e-10). Small constant used to avoid numerical issues when dividing.
- Return type
FloatTensor
- Returns
torch.Tensor, shape: (s_1, …, s_k) The similarity.
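For diagonal covariances the KL divergence likewise reduces to elementwise operations. A NumPy sketch of the divergence itself (illustrative only; per the note above, the similarity used by the model is the negated divergence):

```python
import numpy as np


def kl_divergence(mu_e, mu_r, sigma_e, sigma_r, epsilon=1e-10):
    """KL(N_e || N_r) for Gaussians with diagonal covariances stored as vectors.

    Implements
        0.5 * ( tr(Sigma_r^{-1} Sigma_e)
              + (mu_r - mu_e)^T Sigma_r^{-1} (mu_r - mu_e)
              - log (det Sigma_e / det Sigma_r)
              - d )
    """
    inv_sigma_r = 1.0 / (sigma_r + epsilon)
    mu = mu_r - mu_e
    d = mu.shape[-1]
    return 0.5 * (
        np.sum(inv_sigma_r * sigma_e, axis=-1)  # trace term
        + np.sum(mu**2 * inv_sigma_r, axis=-1)  # Mahalanobis term
        - np.sum(np.log(sigma_e + epsilon) - np.log(sigma_r + epsilon), axis=-1)
        - d
    )
```

As a sanity check, the divergence of a distribution from itself is zero, so the negated similarity attains its maximum for identical \(\mathcal{P}_e\) and \(\mathcal{P}_r\).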
score_h(rt_batch)[source]¶

    Forward pass using left side (head) prediction.
This method calculates the score for all possible heads for each (relation, tail) pair.
- Parameters
rt_batch (LongTensor) – shape: (batch_size, 2), dtype: long. The indices of (relation, tail) pairs.

- Return type
FloatTensor
- Returns
shape: (batch_size, num_entities), dtype: float For each r-t pair, the scores for all possible heads.
score_hrt(hrt_batch)[source]¶

    Forward pass.
This method takes head, relation and tail of each triple and calculates the corresponding score.
- Parameters
hrt_batch (LongTensor) – shape: (batch_size, 3), dtype: long. The indices of (head, relation, tail) triples.

- Raises
NotImplementedError – If the method was not implemented for this class.
- Return type
FloatTensor
- Returns
shape: (batch_size, 1), dtype: float The score for each triple.
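Putting the pieces together, a batched triple scorer along these lines could look as follows. This is a sketch with hypothetical array-based embedding tables, not PyKEEN's internals (which operate on torch tensors); the epsilon stabilizer is omitted for brevity:

```python
import numpy as np


def score_hrt(hrt_batch, ent_mu, ent_sigma, rel_mu, rel_sigma):
    """Score each (h, r, t) triple as the negated KL divergence between
    the (h - t) Gaussian and the relation Gaussian.

    hrt_batch: int array, shape (batch_size, 3) of (head, relation, tail) indices.
    ent_mu/ent_sigma: per-entity means and diagonal covariances, shape (num_entities, d).
    rel_mu/rel_sigma: per-relation means and diagonal covariances, shape (num_relations, d).
    """
    h, r, t = hrt_batch[:, 0], hrt_batch[:, 1], hrt_batch[:, 2]
    # P_e = N(mu_h - mu_t, Sigma_h + Sigma_t)
    mu_e = ent_mu[h] - ent_mu[t]
    sigma_e = ent_sigma[h] + ent_sigma[t]
    # KL(P_e || P_r) for diagonal covariances
    mu = rel_mu[r] - mu_e
    inv = 1.0 / rel_sigma[r]
    d = mu.shape[-1]
    kl = 0.5 * (
        np.sum(inv * sigma_e, axis=-1)
        + np.sum(mu**2 * inv, axis=-1)
        - np.sum(np.log(sigma_e) - np.log(rel_sigma[r]), axis=-1)
        - d
    )
    return -kl[:, None]  # shape: (batch_size, 1)
```

When the (h - t) distribution coincides with the relation distribution, the divergence is zero and the triple receives the maximal score.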
score_t(hr_batch)[source]¶

    Forward pass using right side (tail) prediction.
This method calculates the score for all possible tails for each (head, relation) pair.
- Parameters
hr_batch (LongTensor) – shape: (batch_size, 2), dtype: long. The indices of (head, relation) pairs.

- Return type
FloatTensor
- Returns
shape: (batch_size, num_entities), dtype: float For each h-r pair, the scores for all possible tails.