Knowledge Graph Querying


1. RDF Query Language: SPARQL

  • W3C Stack
  • SPARQL is an RDF query language, that is, a semantic query language for databases, able to retrieve and manipulate data stored in RDF format.

image-20211208102903230

  • SPARQL stands for

    • (originally) Simple Protocol and RDF Query Language.
    • (now) SPARQL Protocol and RDF Query Language.
  • How to get information from RDF graphs by SPARQL?

    • Pattern matching
      • Pattern: describe subgraphs of the queried RDF graph
      • Matching: match the pattern to the subgraphs to contribute an answer
      • Building blocks: graph patterns (i.e. RDF graphs with variables)

image-20211208103325844

image-20211208103415086

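  • The pattern subgraph is matched against the knowledge graph, and every term of the knowledge graph that sits at the central (variable) position is retrieved.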
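
The pattern-matching idea above can be sketched in a few lines of Python. This is only an illustration, not how a SPARQL engine is implemented: the graph, the `ex:` names, and the helper `match_triple` are all made up for the example.

```python
# A toy RDF graph as a set of (subject, predicate, object) tuples.
# All names below are hypothetical and exist only for this sketch.
GRAPH = {
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:bob", "foaf:name", "Bob"),
}


def match_triple(pattern, graph):
    """Return one binding dict per triple of `graph` that fits `pattern`."""
    solutions = []
    for triple in graph:
        binding = {}
        for p_term, t_term in zip(pattern, triple):
            if p_term.startswith("?"):
                # A variable must map to the same term everywhere it occurs.
                if binding.setdefault(p_term, t_term) != t_term:
                    break
            elif p_term != t_term:
                # A constant in the pattern must equal the term in the data.
                break
        else:
            solutions.append(binding)
    return solutions


# Pattern "?x foaf:name ?name": a triple with two variables.
names = match_triple(("?x", "foaf:name", "?name"), GRAPH)
print(sorted(b["?name"] for b in names))
```

Each solution is a mapping from variables to RDF terms, which is what a SELECT query reports row by row.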

1.1 SPARQL Example

image-20211208104153837

1.2 Components of SPARQL Queries

image-20211208104425927

  • Prologue

    • Prefix definitions for compact URIs;
    • Attention (difference w.r.t. Turtle): No period (“.”) character as separator.
  • Query Form

    • SELECT, DESCRIBE, CONSTRUCT or ASK
  • Dataset specification

    • FROM / FROM NAMED
    • Specifies the RDF dataset to be queried
  • Query pattern

    • WHERE clause specifies the graph pattern to be matched
    • It states which statements have to be matched
  • Solution modifiers

    • Order: put the solutions in order;
    • Projection: choose certain variables;
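      • Return only the specified variables
    • Distinct: ensure solutions in the sequence are unique;
      • Guarantees that the returned results contain no duplicates
    • Reduced: permit elimination of some non-unique solutions;
      • Duplicates may be removed, but their removal is not guaranteed
    • Offset: control where the solutions start from in the overall sequence of solutions;
      • Start returning from the n-th solution
    • Limit: restrict the number of solutions.
      • Caps the number of returned solutions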

2. SPARQL

  • SPARQL Syntax (RDF term syntax)
    • Syntax for IRI
    • Syntax for literals
    • Syntax for variables
    • Syntax for blank nodes
    • Graph Patterns for Query Pattern
    • Triple Pattern
    • Different Graph Patterns

2.1 SPARQL Syntax: RDF Term Syntax

  • Syntax for IRI
    • IRIs are a generalization of URIs and are fully compatible with URIs and URLs.
    • The following fragments are some of the different ways to write the same IRI:

image-20211208105032370

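  • Once BASE is defined, every relative IRI in the query is resolved against that single base; PREFIX is different, because several prefixes with different namespaces can be declared.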

  • The general syntax for literals

    • A string (enclosed in either double quotes, “…”, or single quotes, ‘…’);
    • With either an optional language tag (introduced by @) or an optional datatype IRI or prefixed name (introduced by ^^).

image-20211208105224660

  • Strings may be written with either single or double quotes.

  • Syntax for query variables

    • Query variables in SPARQL queries have global scope
      • Use of a given variable name anywhere in a query identifies the same variable.
    • Variables are prefixed by either “?” or “$”;
      • The “?” or “$” is not part of the variable name.

image-20211208105526982

  • Find all triples whose predicate is name and return their objects.

  • Syntax for blank nodes

    • In a query, a blank node simply stands for a variable
    • Blank nodes in graph patterns act as non-distinguished variables, not as references to specific blank nodes in the data being queried.
    • Blank nodes are indicated by either the label form, such as “_:abc”, or the abbreviated form “[]”.
      • A blank node that is used in only one place in the query syntax can be indicated with [].
    • The same blank node label cannot be used in two different basic graph patterns in the same query.

image-20211208105746800

  • Triple Patterns
    • are basic units of graph patterns;
    • are written as a whitespace-separated list of a subject, predicate and object;
    • There are abbreviated ways of writing some common triple patterns;
    • The following examples express the same query:

image-20211208110026967

Triple Patterns: Predicate-Object Lists

  • Triple patterns with a common subject can be written so that the subject is only written once and is used for more than one triple pattern by employing the “;” notation.
  • The triple patterns share one subject.

image-20211208110053408

Exercise

  • Please rewrite the following triple pattern by using the same subject only once.
?a int:number 123789 .
?a int:pair 34567 .
?a int:id 666777 .
  • Answer
?a int:number 123789 ;
   int:pair 34567 ;
   int:id 666777 .
  • Triple Patterns: Object Lists
    • If triple patterns share both subject and predicate, the objects may be separated by “,”.

image-20211208110551322

Exercise

  • Please rewrite the following triple pattern by using the same subject and predicate only once.
?a string:name 'Bob' .
?a string:name 'Bobby' .
?a string:name 'Boob' .
?a string:name 'Bob_' .
  • Answer
?a string:name 'Bob', 'Bobby', 'Boob', 'Bob_' .
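
To make the two abbreviations concrete, here is a small Python sketch that expands them back into full triple patterns. The helper `expand` and its input shape are assumptions made for this illustration; it does not parse SPARQL text.

```python
def expand(subject, predicate_objects):
    """Expand a ';' / ',' abbreviation into full triple patterns.

    `predicate_objects` is a list of (predicate, [objects]) pairs.
    Each pair is one ';'-separated part, and the objects inside a pair
    form one ','-separated object list.
    """
    return [
        (subject, predicate, obj)
        for predicate, objects in predicate_objects
        for obj in objects
    ]


# ?a int:number 123789 ; int:pair 34567 ; int:id 666777 .
shared_subject = expand(
    "?a",
    [("int:number", ["123789"]), ("int:pair", ["34567"]), ("int:id", ["666777"])],
)

# ?a string:name 'Bob', 'Bobby' .
shared_predicate = expand("?a", [("string:name", ["'Bob'", "'Bobby'"])])

for triple in shared_subject + shared_predicate:
    print(*triple, ".")
```

Both abbreviations are pure syntax: after expansion the query contains exactly the same triple patterns as the long form.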

2.2 Graph Patterns

  • SPARQL is based around graph pattern matching.
  • Different types of graph patterns for the query pattern (WHERE clause):
    • Basic Graph Patterns, where a set of triple patterns must match;
    • Group Graph Pattern, where a set of graph patterns must all match;
    • Optional Graph patterns, where additional patterns may extend the solution;
    • Alternative (Union) Graph Pattern, where two or more possible patterns are tried;

2.2.1 Basic graph patterns

Triple patterns are similar to RDF triples, but any component can be a
query variable.

?x foaf:name ?name .
  • Matching a triple pattern to a graph: bindings between variables and
    RDF terms.

  • Matching of basic graph patterns:

    • A Pattern Solution of Graph Pattern GP on graph G is any substitution S such that S(GP) is a subgraph of G.
      • Simple queries
      • Multiple matches
      • Matching RDF literals
      • Blank node labels in query results
  • Simple queries

image-20211208111145159

  • Note that the result must include the column names (the selected variables).

Exercise
Given the dataset and the SPARQL query as follows, please write the query results.

  • Dataset
# Graph: http://example/addresses
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<http://example/president25> foaf:givenName "Bill" .
<http://example/president25> foaf:familyName "McKinley" .
<http://example/president27> foaf:givenName "Bill" .
<http://example/president27> foaf:familyName "Taft" .
<http://example/president42> foaf:givenName "Bill" .
<http://example/president42> foaf:familyName "Clinton" .
  • SPARQL query
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?family
WHERE { ?x foaf:givenName ?name .
        ?x foaf:familyName ?family . }
  • Answer
name      family
"Bill"    "McKinley"
"Bill"    "Taft"
"Bill"    "Clinton"
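
The exercise can be checked with a minimal Python sketch of basic graph pattern matching: a solution is a substitution that maps every triple pattern onto a triple of the graph. The shortened subject names and the helpers `unify` and `match_bgp` are assumptions of this sketch, not part of SPARQL.

```python
GIVEN, FAMILY = "foaf:givenName", "foaf:familyName"

# The exercise data, with subject IRIs shortened for readability.
GRAPH = {
    ("president25", GIVEN, "Bill"), ("president25", FAMILY, "McKinley"),
    ("president27", GIVEN, "Bill"), ("president27", FAMILY, "Taft"),
    ("president42", GIVEN, "Bill"), ("president42", FAMILY, "Clinton"),
}


def unify(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`, or return None."""
    extended = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if p_term.startswith("?"):
            # Reuse an existing binding, or bind the variable now.
            if extended.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return extended


def match_bgp(patterns, graph, binding=None):
    """Yield every binding that maps all `patterns` onto triples of `graph`."""
    binding = {} if binding is None else binding
    if not patterns:
        yield binding
        return
    for triple in graph:
        extended = unify(patterns[0], triple, binding)
        if extended is not None:
            yield from match_bgp(patterns[1:], graph, extended)


bgp = [("?x", GIVEN, "?name"), ("?x", FAMILY, "?family")]
rows = sorted((b["?name"], b["?family"]) for b in match_bgp(bgp, GRAPH))
for name, family in rows:
    print(name, family)
```

Because both patterns share `?x`, a given name and a family name are only paired when they belong to the same subject.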
  • Multiple Matches

image-20211208112118772

Exercise
  • Given the dataset and the SPARQL query as follows, please write the query results.
  • Dataset
# Graph: http://example/addresses
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<http://example/president25> foaf:givenName "Bill" .
<http://example/president25> foaf:familyName "McKinley" .
<http://example/president27> foaf:givenName "Bob" .
<http://example/president27> foaf:id 159486 .
<http://example/president42> foaf:givenName "Marry" .
<http://example/president42> foaf:familyName "Clinton" .
  • SPARQL query
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?family
WHERE { ?x foaf:givenName ?name .
        ?x foaf:familyName ?family . }
  • Answer
name      family
"Bill"    "McKinley"
"Marry"   "Clinton"
  • Matching RDF literals
@prefix dt: <http://example.org/datatype#> .
@prefix ns: <http://example.org/ns#> .
@prefix : <http://example.org/ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
:x ns:p "cat"@en .
:y ns:p "42"^^xsd:integer .
:z ns:p "abc"^^dt:specialDatatype .
  • The first query finds no results: "cat" without the @en language tag is a different literal from "cat"@en.

image-20211208112234236
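
Why a plain "cat" does not match "cat"@en can be sketched in Python by modelling a literal as a (lexical form, language tag, datatype) triple and comparing all three parts. The `Literal` class and `subjects_with` helper are hypothetical, and modelling a plain literal with no datatype at all is a simplification made for this sketch.

```python
from typing import NamedTuple, Optional


class Literal(NamedTuple):
    """Simplified RDF literal: two literals match only if all fields agree."""
    lexical: str
    lang: Optional[str] = None
    datatype: Optional[str] = None


# The three triples of the data above, keyed by subject (predicate is ns:p).
DATA = {
    ":x": Literal("cat", lang="en"),
    ":y": Literal("42", datatype="xsd:integer"),
    ":z": Literal("abc", datatype="dt:specialDatatype"),
}


def subjects_with(value):
    """Subjects whose ns:p object is exactly the literal `value`."""
    return sorted(s for s, o in DATA.items() if o == value)


print(subjects_with(Literal("cat")))                         # plain "cat"
print(subjects_with(Literal("cat", lang="en")))              # "cat"@en
print(subjects_with(Literal("42", datatype="xsd:integer")))  # 42
```

The first lookup comes back empty because the language tag is part of the term; the other two find `:x` and `:y` respectively.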

  • Blank node labels in query results

    • There need not be any relation between a label in the form of a blank node in the result set and a blank node in the data graph with the same label.

image-20211208113210921

2.2.2 Group Graph Patterns

  • In a SPARQL query string, a group graph pattern is delimited with braces: {}.

image-20211208113243218

  • Besides triple patterns, group graph pattern can contain constraints
    • Syntax: Keyword FILTER followed by a filter expression

image-20211208113552720

name family
“Bill” “McKinley”

2.2.3 Optional Graph Patterns

  • If the optional part does not match, it creates no bindings but does not eliminate the solution.
  • Optional patterns may result in unbound variables

image-20211208163455567

  • Exercise
    • Given the dataset and the SPARQL query as follows, please write the query results.

image-20211208163522593

  • Answer
name      family
"Bill"    "McKinley"
"Bob"     (unbound)
"Marry"   (unbound)
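
The behaviour of OPTIONAL can be sketched as a left join over solutions. The data below is an assumption chosen to reproduce the answer above (the exercise dataset itself is only available as a figure), and `compatible` and `left_join` are hypothetical helper names.

```python
def compatible(a, b):
    """Two solutions are compatible if they agree on all shared variables."""
    return all(a[var] == b[var] for var in a.keys() & b.keys())


def left_join(solutions, optional_solutions):
    """Extend each solution with optional bindings, or keep it unchanged."""
    result = []
    for sol in solutions:
        extensions = [opt for opt in optional_solutions if compatible(sol, opt)]
        if extensions:
            result.extend({**sol, **opt} for opt in extensions)
        else:
            # No optional match: the solution survives with ?family unbound.
            result.append(sol)
    return result


# Assumed solutions of the required part: ?x foaf:givenName ?name
required = [
    {"x": "president25", "name": "Bill"},
    {"x": "president27", "name": "Bob"},
    {"x": "president42", "name": "Marry"},
]
# Assumed solutions of the optional part: ?x foaf:familyName ?family
optional = [{"x": "president25", "family": "McKinley"}]

rows = left_join(required, optional)
for row in rows:
    print(row["name"], row.get("family", "(unbound)"))
```

A plain join would have dropped "Bob" and "Marry"; the left join keeps them and simply leaves `family` out of their solutions.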
  • Constraints in Optional Pattern Matching

image-20211208163647154

  • Multiple Optional Graph Patterns

image-20211208163807370

2.2.4 Alternative (Union) Graph Patterns

  • Combine graph patterns so that one of several alternative graph patterns may match.

image-20211208163833063

Exercise
  • Please write the SPARQL query on “list all volcanos located in Italy or Norway” given the following data.
dbpedia:Mount_Etna rdf:type umbel-sc:Volcano ;
    rdfs:label "Etna" ;
    p:location dbpedia:Italy .
dbpedia:Mount_Baker rdf:type umbel-sc:Volcano ;
    p:location dbpedia:United_States .
dbpedia:Beerenberg rdf:type umbel-sc:Volcano ;
    rdfs:label "Beerenberg"@en ;
    p:location dbpedia:Norway .
  • Answer
SELECT ?volcano
WHERE {
  ?volcano rdf:type umbel-sc:Volcano .
  { ?volcano p:location dbpedia:Italy . }
  UNION
  { ?volcano p:location dbpedia:Norway . }
}
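
As a rough check of the exercise, the following Python sketch evaluates "volcano located in Italy or Norway" over the data, assuming the `dbpedia:` prefix for all three subjects. UNION is modelled as concatenating the solutions of both alternatives; the helper `subjects` is hypothetical.

```python
TYPE, LOCATION = "rdf:type", "p:location"

# The type and location triples of the exercise data (labels omitted).
GRAPH = {
    ("dbpedia:Mount_Etna", TYPE, "umbel-sc:Volcano"),
    ("dbpedia:Mount_Etna", LOCATION, "dbpedia:Italy"),
    ("dbpedia:Mount_Baker", TYPE, "umbel-sc:Volcano"),
    ("dbpedia:Mount_Baker", LOCATION, "dbpedia:United_States"),
    ("dbpedia:Beerenberg", TYPE, "umbel-sc:Volcano"),
    ("dbpedia:Beerenberg", LOCATION, "dbpedia:Norway"),
}


def subjects(predicate, obj):
    """All subjects ?s such that (?s, predicate, obj) is in the graph."""
    return [s for s, p, o in GRAPH if p == predicate and o == obj]


volcanoes = set(subjects(TYPE, "umbel-sc:Volcano"))

# UNION: the solutions of the first alternative plus those of the second.
located = subjects(LOCATION, "dbpedia:Italy") + subjects(LOCATION, "dbpedia:Norway")

answer = sorted(v for v in located if v in volcanoes)
print(answer)
```

Mount Baker is a volcano but matches neither alternative, so it is not part of the result.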

2.3 Dataset specification

  • SPARQL queries are executed over an RDF dataset:

    • One default graph and
    • Zero or more named graphs (identified by an IRI).
  • Evaluation of patterns w.r.t. the active graph (initially the default graph),
    i.e., the graph used for matching graph patterns;

  • GRAPH clause is used for making a named graph the active graph.

image-20211215095900388

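  • With a GRAPH clause, the pattern is matched against the given named graph; without it, the pattern is matched against the default graph.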

image-20211215095921101

image-20211215100228264

2.4 Query Forms

  • SELECT
    • Result: sequence of solutions (i.e., sets of variable bindings);
    • Selected variables separated by space (not by comma!);
    • Asterisk character (“*”) selects all variables in the pattern.

image-20211215100406729

Exercise

  • Given the dataset and the SPARQL query as follows, please write the query results.
  • Dataset
# Default graph (stored at http://example.org/dft.ttl)
@prefix dc: <http://purl.org/dc/elements/1.1/> .
<http://example.org/bob> dc:publisher "Bob Hacker" .
<http://example.org/alice> dc:publisher "Alice Hacker" .

# Named graph: http://example.org/bob
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
_:a foaf:name "Bob" .
_:a foaf:mbox <mailto:bob@oldcorp.example.org> .

# Named graph: http://example.org/alice
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
_:a foaf:name "Alice" .
_:a foaf:mbox <mailto:alice@work.example.org> .
  • SPARQL query
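    • Lecture note: FROM NAMED can be omitted because GRAPH ?g is given. Strictly speaking, a dataset described with FROM alone contains no named graphs under the SPARQL 1.1 specification, so GRAPH ?g would then have nothing to match; keeping the FROM NAMED lines is the safe choice.
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?who ?g ?mbox
FROM <http://example.org/dft.ttl>
FROM NAMED <http://example.org/alice>
FROM NAMED <http://example.org/bob>
WHERE
{
  ?g dc:publisher ?who .
  GRAPH ?g { ?x foaf:mbox ?mbox }
}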
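
How the default graph and the named graphs interact in this query can be sketched in Python. The dictionary layout is an assumption of this sketch; the triples are the ones from the exercise dataset.

```python
# Default graph: who published which named graph.
DEFAULT_GRAPH = {
    ("http://example.org/bob", "dc:publisher", "Bob Hacker"),
    ("http://example.org/alice", "dc:publisher", "Alice Hacker"),
}

# Named graphs, keyed by their IRI.
NAMED_GRAPHS = {
    "http://example.org/bob": {
        ("_:a", "foaf:name", "Bob"),
        ("_:a", "foaf:mbox", "mailto:bob@oldcorp.example.org"),
    },
    "http://example.org/alice": {
        ("_:a", "foaf:name", "Alice"),
        ("_:a", "foaf:mbox", "mailto:alice@work.example.org"),
    },
}

rows = []
# ?g dc:publisher ?who  is matched against the default graph.
for g, predicate, who in DEFAULT_GRAPH:
    if predicate != "dc:publisher":
        continue
    # GRAPH ?g { ?x foaf:mbox ?mbox }  is matched inside the named graph ?g.
    for _x, inner_predicate, mbox in NAMED_GRAPHS.get(g, ()):
        if inner_predicate == "foaf:mbox":
            rows.append((who, g, mbox))

for row in sorted(rows):
    print(*row)
```

The variable `?g` links the two parts: it is bound by the default graph and then selects which named graph becomes the active graph.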

2.5 DESCRIBE

  • Result: an RDF graph (i.e., all RDF triples) that describes the resources found;
  • The DESCRIBE clause can take IRIs to identify the resources.
  • The resources to be described can also be taken from the bindings to a query variable in a result set.

image-20211215101912520

image-20211215102306783

2.6 CONSTRUCT

  • Result: an RDF graph constructed from a template;

  • Template: a graph pattern with the variables from the query pattern.

  • The variables in the template are replaced by their bindings; everything else in the template stays unchanged.

image-20211215102537976
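
Template instantiation can be sketched in Python as follows. The template and the two solutions are assumptions inferred from the result listed in this section (the query itself is only available as a figure), and `construct` is a hypothetical helper.

```python
# Assumed CONSTRUCT template: two triple patterns sharing ?v.
TEMPLATE = [
    ("?v", "rdfs:label", "?label"),
    ("?v", "rdf:type", "myTypes:VolcanosOutsideTheUS"),
]

# Assumed solutions of the WHERE clause.
SOLUTIONS = [
    {"?v": "dbpedia:Mount_Etna", "?label": '"Etna"'},
    {"?v": "dbpedia:Beerenberg", "?label": '"Beerenberg"@en'},
]


def construct(template, solutions):
    """Instantiate the template once per solution and merge the triples."""
    graph = set()
    for solution in solutions:
        for pattern in template:
            triple = tuple(solution.get(term, term) for term in pattern)
            # Triples that still contain an unbound variable are skipped.
            if not any(term.startswith("?") for term in triple):
                graph.add(triple)
    return graph


result = construct(TEMPLATE, SOLUTIONS)
for triple in sorted(result):
    print(*triple, ".")
```

The constant `myTypes:VolcanosOutsideTheUS` appears in the output once per solution, while the variables are replaced by that solution's bindings.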

  • dbpedia:Mount_Etna rdfs:label "Etna";
      rdf:type myTypes:VolcanosOutsideTheUS.

  • dbpedia:Beerenberg rdfs:label "Beerenberg"@en;
      rdf:type myTypes:VolcanosOutsideTheUS.

2.7 ASK

  • Check whether there is at least one result;
  • Result: true or false.

image-20211215104717764

2.8 DELETE/INSERT

  • INSERT
    • Insert the new RDF triples into the existing RDF graph.

image-20211215104928526

  • DELETE
    • Delete some triples in the RDF graph.

image-20211215104955005

  • DELETE/INSERT
    • Remove or add triples from/to the Graph Store based on bindings for a query pattern specified in a where clause.
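  • WITH names the graph that the update operates on.

  • The WHERE pattern is evaluated first; the deletions and insertions are then carried out with the bindings it produced.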

image-20211215105255654

Exercise

  • Given the dataset and the SPARQL query as follows, please write the query results.

Q1

@prefix org: <http://example.com/ns#> .
_:a org:employeeName "Alice" .
_:a org:employeeId 12345 .
_:b org:employeeName "Bob" .
_:b org:employeeId 67890 .

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX org: <http://example.com/ns#>
DELETE { ?person ?p ?o . }
WHERE { ?person org:employeeId ?id .
        FILTER (?id > 50000)
        ?person ?p ?o . }
  • After the deletion:
_:a org:employeeName "Alice" .
_:a org:employeeId 12345 .
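
The two phases of Q1 (evaluate WHERE, then delete) can be sketched in Python on the exercise data. Representing the graph as a set of tuples is an assumption of this sketch.

```python
graph = {
    ("_:a", "org:employeeName", "Alice"),
    ("_:a", "org:employeeId", 12345),
    ("_:b", "org:employeeName", "Bob"),
    ("_:b", "org:employeeId", 67890),
}

# WHERE, part 1: ?person org:employeeId ?id . FILTER (?id > 50000)
persons = {s for s, p, o in graph if p == "org:employeeId" and o > 50000}

# WHERE, part 2: ?person ?p ?o .  binds every triple of those persons.
to_delete = {triple for triple in graph if triple[0] in persons}

# DELETE { ?person ?p ?o . } is applied only after WHERE has been evaluated.
graph -= to_delete

for s, p, o in sorted(graph, key=lambda t: t[:2]):
    print(s, p, o)
```

Only `_:b` passes the filter, so both of Bob's triples are removed and Alice's two triples remain.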

Q2

@prefix org: <http://example.com/ns#> .
# Graph: http://person
_:a org:employeeName "Alice" .
_:a org:employeeId 12345 .
_:b org:employeeName "Bob" .
_:b org:employeeId 67890 .
# Graph: http://person2
_:c org:employeeId 13579 .

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX org: <http://example.com/ns#>
INSERT { GRAPH <http://person2> { ?person ?p ?o . } }
WHERE { GRAPH <http://person>
        { ?person org:employeeId ?id .
          FILTER (?id < 50000)
          ?person ?p ?o . } }
  • After the insertion:
@prefix org: <http://example.com/ns#> .
# Graph: http://person
_:a org:employeeName "Alice" .
_:a org:employeeId 12345 .
_:b org:employeeName "Bob" .
_:b org:employeeId 67890 .
# Graph: http://person2
_:c org:employeeId 13579 .
_:a org:employeeName "Alice" .
_:a org:employeeId 12345 .

2.9 CLEAR

  • Remove all the triples in the specified graph(s) in the Graph Store.

image-20211215110528044

2.10 MOVE

  • Move all data from an input graph into a destination graph.

image-20211215110550501

2.11 Solution Modifiers

  • Only for SELECT queries;

  • Modify the result set as a whole (not single solutions);

  • Keywords: DISTINCT, ORDER BY, LIMIT, OFFSET.

DISTINCT

  • Remove duplicates from the result set.

image-20211215110728620

image-20211215110734271

image-20211215110739530

ORDER BY

  • Order the results.

image-20211215110814813

  • ASC for ascending (default) and DESC (e.g., DESC(?name)) for descending.

image-20211215110909379

LIMIT

  • Limits the number of results:
    • Only 5 results are returned

image-20211215111028802

OFFSET

  • Position (index) of the first reported result:

image-20211215111113085

  • Order of the result should be predictable (combine with ORDER BY)
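
A minimal Python sketch of how the modifiers act on the whole solution sequence, applied in the order ORDER BY, DISTINCT, OFFSET, LIMIT. The names in the data and the function `apply_modifiers` are made up for illustration.

```python
# Hypothetical solution sequence with one duplicate.
solutions = [{"name": n} for n in ["Carol", "Alice", "Bob", "Alice", "Dave"]]


def apply_modifiers(solutions, order_key, distinct=False, offset=0, limit=None):
    """Apply ORDER BY, DISTINCT, OFFSET and LIMIT, in that order."""
    rows = sorted(solutions, key=lambda s: s[order_key])  # ORDER BY (ascending)
    if distinct:                                          # DISTINCT
        unique = []
        for row in rows:
            if row not in unique:
                unique.append(row)
        rows = unique
    rows = rows[offset:]                                  # OFFSET
    return rows if limit is None else rows[:limit]        # LIMIT


page = apply_modifiers(solutions, "name", distinct=True, offset=1, limit=2)
print([row["name"] for row in page])
```

Without the sort, the slice taken by OFFSET and LIMIT would depend on whatever order the solutions happened to arrive in, which is why paging is combined with ORDER BY.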

BINDINGS

image-20211215111325636

image-20211215111347379

image-20211215111357307

VALUES

  • Adds data to the query directly.
    • This adds a restriction on the solutions

image-20211215111516484

AGGREGATES

  • allows for the grouping of solutions and

  • the computation of values over the groups.

image-20211215111700791

  • GROUP BY groups the solutions (e.g., students who attend the same lecture);

  • COUNT is an aggregate function that counts the solutions within a group (e.g., the number of students in the lecture);

  • HAVING filters aggregated values.
  • Question: Please use natural language to explain this SPARQL query!
    • Find the lectures that are attended by more than 5 students.
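
The interplay of GROUP BY, COUNT and HAVING can be sketched in Python. The attendance data and the `ex:` lecture names are invented for this illustration; the threshold of 5 is taken from the explanation above.

```python
from collections import Counter

# Hypothetical solutions of "?student ex:attends ?lecture".
ATTENDS = [(f"ex:student{i}", "ex:KnowledgeGraphs") for i in range(1, 7)]
ATTENDS += [("ex:student1", "ex:Databases"), ("ex:student2", "ex:Databases")]

# GROUP BY ?lecture with COUNT(?student): one count per lecture.
counts = Counter(lecture for _student, lecture in ATTENDS)

# HAVING (COUNT(?student) > 5): filter on the aggregated value.
popular = {lecture: n for lecture, n in counts.items() if n > 5}

print(popular)
```

HAVING differs from FILTER in that it sees the aggregated value of each group rather than the individual solutions.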

NEGATION:

image-20211215111838420

image-20211215111848348

  • Question: Please use natural language to explain the above SPARQL queries!
Author: Smurf
Link: http://example.com/2021/08/15/knowledge%20engineering/13.%20Knowledge%20Graph%20Querying/
License: This article is licensed under CC BY-NC-SA 3.0 CN