Statistical Reasoning


1. Definition

  • Statistical reasoning tries to find suitable statistical models to fit the observed samples and predicts the expected probability that the inferred knowledge holds.

  • knowledge graph embedding based reasoning
  • inductive rule learning based reasoning
  • multi-hop reasoning


Tasks:
  • Predicting the missing link:
    • Given e1 and e2, predict the relation r.
  • Predicting the missing entity:
    • Given e1 (or e2) and relation r, predict the missing entity e2 (or e1).
  • Fact prediction:
    • Given a triple, predict whether it is true or false.

2. Embedding: Meaning of a Word

  • What is the meaning of a word?
  • By ontologies? By Knowledge Graph?
  • But ontologies and KGs are hard to construct and often incomplete; they can never exhaustively cover all meanings
  • How to encode the meaning of a word?

3. One-hot Representation

  • Vocabulary: (cat, mat, on, sat, the)
    • cat: 10000 mat: 01000 on: 00100 sat: 00010 the: 00001
  • “The cat sat on the mat”


  • Disadvantage: too sparse (the dimensionality grows with the vocabulary size)


  • One-hot representation:
    • Foundation of the Bag-of-Words model
    • Cannot measure semantic relatedness between words, since all vectors are mutually orthogonal (see the sketch below)

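A minimal sketch (my own Python illustration, not from the original notes) of the toy vocabulary above: every pair of distinct one-hot vectors is orthogonal, which is exactly why semantic relatedness cannot be measured.

```python
# One-hot vectors for the toy vocabulary (cat, mat, on, sat, the).
import numpy as np

vocab = ["cat", "mat", "on", "sat", "the"]
one_hot = {w: np.eye(len(vocab))[i] for i, w in enumerate(vocab)}

print(one_hot["cat"])                   # [1. 0. 0. 0. 0.]
# Any two distinct words have dot product 0, so they look "unrelated":
print(one_hot["cat"] @ one_hot["mat"])  # 0.0
```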

4. Distributional Representation

  • When a word w appears in a text, its context is the set of words that appear nearby (within a fixed-size window); that is, a word is represented by the words surrounding it

  • Use many contexts of w to build up a representation of w


  • The contexts are used to build a dense vector for the word
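The counting step can be sketched as follows (illustrative Python; the window size of 2 is an assumption):

```python
# Count, for each word, how often every other word appears within a
# fixed-size window around it; the counts form a distributional vector.
from collections import Counter, defaultdict

sentence = "the cat sat on the mat".split()
window = 2  # assumed window size
cooc = defaultdict(Counter)

for i, center in enumerate(sentence):
    for j in range(max(0, i - window), min(len(sentence), i + window + 1)):
        if j != i:
            cooc[center][sentence[j]] += 1

print(cooc["sat"])  # Counter({'the': 2, 'cat': 1, 'on': 1})
```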

5. Word Vectors

  • We will build a dense vector for each word, chosen so that it is similar to vectors of words that appear in similar contexts.


  • Note: word vectors are sometimes called word embeddings. They are a distributed representation.
Similarity between two word vectors is typically measured with cosine similarity: $\text{sim}(u, v) = \frac{u \cdot v}{\|u\| \, \|v\|}$
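A minimal sketch of cosine similarity over dense vectors; the 3-d embeddings here are made-up toy values, not trained ones:

```python
# Cosine similarity: a standard way to compare dense word vectors.
import numpy as np

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

cat = np.array([0.9, 0.1, 0.3])   # toy values, not trained embeddings
dog = np.array([0.8, 0.2, 0.25])
car = np.array([0.1, 0.9, 0.7])

print(cosine(cat, dog))  # ~0.99: "cat" and "dog" share similar contexts
print(cosine(cat, car))  # ~0.36: "cat" and "car" do not
```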

6. Advantage of Distributed Representation

  1. Deals with the data sparsity problem in NLP
  2. Enables knowledge transfer across domains and across objects
  3. Provides a unified representation for multi-task learning


6.1 Representation Learning

  • What is representation learning?
    • Objects are represented as dense, real-valued, low-dimensional vectors


6.2 Different ways of KG Representation


  • Tensor representation: higher degrees of freedom and can capture implicit knowledge, but hard to scale and hard to interpret

6.3 Knowledge Graph Embedding: Application

  • Entity Prediction
    • (卧虎藏龙 (Crouching Tiger, Hidden Dragon), Has-director, ?)
    • Answer: (卧虎藏龙, Has-director, Ang Lee)


  • Relation Prediction


  • Recommendation System


7. TransE: Take Relation as Translation

  • For a fact (head, relation, tail), take the relation as a translation operator from the head to the tail.


  • An entity is translated by the relation to another entity

TransE

  • For each triple (h, r, t), h is translated to t by r, i.e., h + r ≈ t.


  • TransE energy (score) function: $f(h, r, t) = \|h + r - t\|$
  • If the triple is true, the translated distance between (h + r) and t is shorter.

  • L1 (Manhattan) distance: $d(h + r, t) = \sum_i |h_i + r_i - t_i|$

  • L2 (Euclidean) distance: $d(h + r, t) = \sqrt{\sum_i (h_i + r_i - t_i)^2}$
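Putting the two distances into code (a minimal sketch with toy 2-d embeddings):

```python
# TransE score: the distance between (h + r) and t; lower = more plausible.
import numpy as np

def transe_score(h, r, t, norm=1):
    """norm=1 gives the L1 (Manhattan) distance, norm=2 the L2 (Euclidean)."""
    return np.linalg.norm(h + r - t, ord=norm)

h = np.array([0.2, 0.5])   # toy embeddings, not trained values
r = np.array([0.3, -0.1])
t = np.array([0.5, 0.4])

print(transe_score(h, r, t, norm=1))  # 0.0: h + r equals t exactly
print(transe_score(h, r, t, norm=2))  # 0.0 as well for this toy triple
```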

TransE

  • Given a set of true triples together with some false (corrupted) triples:
  • How can TransE distinguish the true triples from the false ones?


  • Minimize the distance between (h + l) and t for true triples.
  • Maximize the distance between (h' + l) and a randomly sampled tail t' (a negative example).
    • i.e., minimize the translated distance for positive triples and maximize it for negative ones (see the loss sketch below)

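A minimal sketch of this margin-based ranking objective (my own NumPy illustration): a positive triple is penalized only when it is not at least `margin` closer than its corrupted counterpart.

```python
# Hinge loss over one (positive, negative) pair: [margin + d_pos - d_neg]+
import numpy as np

def margin_loss(h, r, t, h_neg, t_neg, margin=1.0):
    d_pos = np.linalg.norm(h + r - t, ord=1)          # distance of the true triple
    d_neg = np.linalg.norm(h_neg + r - t_neg, ord=1)  # distance of the corrupted triple
    return max(0.0, margin + d_pos - d_neg)

h, r, t = np.array([0.2, 0.5]), np.array([0.3, -0.1]), np.array([0.5, 0.4])
h_neg, t_neg = np.array([0.9, 0.0]), np.array([0.0, 0.9])
print(margin_loss(h, r, t, h_neg, t_neg))  # 0.0: the negative is far enough away
```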

  • Tbatch is the set of (positive triple, negative triple) pairs used in each mini-batch.
  1. Input: training set $S=\{(h, \ell, t)\}$, entity set $E$, relation set $L$, margin $\lambda$, embedding dimension $k$.
  2. Initialize the entity and relation embeddings;
  3. Normalize the entity and relation embeddings;
    For each entity e (suppose there are M entities in the entity set E)

  4. Negative sampling: for each positive triple in the batch, corrupt its head or tail to build a negative triple (see the sketch below)

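A minimal sketch of uniform negative sampling (the entity names are illustrative, not from a real dataset); a full implementation would also filter out corrupted triples that happen to be true:

```python
# Corrupt a true triple by replacing its head or tail with a random entity.
import random

def corrupt(triple, entities):
    h, r, t = triple
    if random.random() < 0.5:
        return (random.choice(entities), r, t)  # replace the head
    return (h, r, random.choice(entities))      # replace the tail

print(corrupt(("WALL-E", "_has_genre", "Animation"), ["Titanic", "Comedy", "Up"]))
```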

  • Evaluation protocol:

Metrics: for each test triple, replace the missing entity with every candidate entity, compute the translated distance for each, and rank the candidates.

  • Link Prediction
    ( WALL-E , _has_genre , ? )

  • Mean Rank (MR): the mean of the predicted ranks of the correct entities.
    • e.g., Entity 1: rank 50; Entity 2: rank 100; MR = (50 + 100) / 2 = 75

  • Hits@10: the proportion of correct entities ranked in the top 10 (see the sketch below).
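Both metrics fall out of the list of predicted ranks; this snippet reproduces the MR example above:

```python
# Mean Rank and Hits@10 from a list of predicted ranks of correct entities.
ranks = [50, 100]

mean_rank = sum(ranks) / len(ranks)
hits_at_10 = sum(1 for rank in ranks if rank <= 10) / len(ranks)

print(mean_rank)   # 75.0, matching the example above
print(hits_at_10)  # 0.0: neither correct entity made the top 10
```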

8. Question

We have two types of relations in a KG, for example:

  • Symmetric Relation:

    • e.g., (stu1, classmate, stu2), (stu2, classmate, stu1)
  • Composition Relation:

    • e.g., (B, husband_of, A),(A, mother_of, C),(B, father_of, C)

Which of these relations can be modeled by TransE? Why?

  • TransE cannot model symmetric relations (see the derivation below)

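The reason is a one-line consequence of the translation constraint:

$$h + r = t \quad\text{and}\quad t + r = h \;\Rightarrow\; 2r = 0 \;\Rightarrow\; r = 0 \ \text{and}\ h = t,$$

so every pair linked by a symmetric relation would collapse onto the same embedding.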

  • TransE can model composition relations when $r_3 = r_1 + r_2$ (see below)

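Chaining two translations is itself a translation by the sum of the relation vectors:

$$h_1 + r_1 = h_2 \quad\text{and}\quad h_2 + r_2 = h_3 \;\Rightarrow\; h_1 + (r_1 + r_2) = h_3,$$

which matches a direct translation by $r_3$ exactly when $r_3 = r_1 + r_2$.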

  • Can TransE model 1-to-N relations?
    • e.g., (qiguilin, teacher_of, stu1), (qiguilin, teacher_of, stu2),
      (qiguilin, teacher_of, stu3), (qiguilin, teacher_of, stu4)…
    • No: since TransE requires $t = h + r$, all of stu1, …, stu4 would be forced to share the same embedding

Issue of TransE

  • TransE is too simple to handle complex relations
    • 1-to-N, N-to-1, and N-to-N relations cannot be modeled faithfully


9. Variants of TransE: TransH

For each relation, define a hyperplane $W_r$ and a relation vector $d_r$. Then project the head entity vector $h$ and the tail entity vector $t$ onto the hyperplane $W_r$, and perform the translation on the hyperplane.

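For reference, the standard TransH formulation (Wang et al., 2014), where $w_r$ is the unit normal vector of the relation's hyperplane:

$$h_\perp = h - w_r^\top h \, w_r, \qquad t_\perp = t - w_r^\top t \, w_r, \qquad f_r(h, t) = \|h_\perp + d_r - t_\perp\|_2^2, \quad \|w_r\|_2 = 1.$$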

For example, suppose (h, r, t) and (h'', r, t) both hold.

In TransE, h and h'' must overlap (collapse to the same vector). In TransH, only the projections of h and h'' onto the hyperplane coincide at h⊥, so the original entity vectors can remain distinct.


10. Variants of TransE: TransR

  • Both the TransE and TransH models assume that entities and relations are vectors in the same semantic space.


  • TransR instead assumes that each relation has its own semantic space
  • For example, Chairman Mao and Obama are close in the "president" space but far apart in the "poet" space

TransR proposes:

  • Build entity and relation embeddings in separate entity and relation spaces;

  • Then project entities from the entity space into the corresponding relation space and build translations between the projected entities.

TransR:

  • Mapping entity embeddings into different semantic spaces


  • Entities are first projected into the relation space as $h_r = h M_r$ and $t_r = t M_r$; the score (energy) function then takes the same form as TransE: $f_r(h, t) = \|h_r + r - t_r\|_2^2$
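A minimal sketch of the whole TransR scoring pipeline (random toy values standing in for trained embeddings):

```python
# TransR: project entities into a relation-specific space with M_r,
# then score with a TransE-style translation in that space.
import numpy as np

rng = np.random.default_rng(0)
d_e, d_r = 4, 3                    # entity / relation space dimensions
h, t = rng.normal(size=d_e), rng.normal(size=d_e)
r = rng.normal(size=d_r)
M_r = rng.normal(size=(d_e, d_r))  # one projection matrix per relation

h_r, t_r = h @ M_r, t @ M_r        # map entities into the relation space
score = np.linalg.norm(h_r + r - t_r, ord=2) ** 2
print(score)                       # lower = more plausible triple
```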

11. Summary

  • Statistical reasoning uses statistical models to fit the samples and predicts the expected probabilities of the inferred knowledge.

  • Knowledge graph embedding based reasoning actually performs entity prediction and relation prediction with vector calculations.

  • Translation-based models are now widely used KG embedding models for KG completion and other applications due to their good performance and succinctness.
