1.1 Def:
A markup language for documents containing structured information.
A markup language used for data exchange.
1.2 Comparison:
1.2.1 XML:
- Extensible set of tags (tags can be user-defined)
- Content oriented (data is kept separate from formatting)
- Standard data infrastructure (errors are not tolerated)
- Allows multiple output forms
1.2.2 HTML:
- Fixed set of tags (tags cannot be user-defined)
- Presentation oriented (data and formatting are intermixed)
- No data validation capabilities (errors are tolerated when rendering)
- Single presentation (one output form)
1.3 XML Syntax
- Empty elements can be abbreviated: an element with no content can be written as a single self-closing tag.
- The outermost element is called the root element (there is only one).
Example:
<!-- version, encoding -->
1.4 XML Attributes:
1.4.1 Example 1:
<City ZIP="210000">Nanjing</City>
1.4.2 Example 2:
<author>
is equivalent to
<author email="gqi@seu.edu.cn">
1.5 Rules:
Authoring guidelines:
- All elements must have an end tag.
- All elements must be cleanly nested (overlapping elements are not allowed).
- All attribute values must be enclosed in quotation marks.
- Each document must have a unique first element, the root node.
- Names are case-sensitive.
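The guidelines above can be checked mechanically, because a conforming XML parser rejects any document that breaks them. A minimal sketch using Python's standard `xml.etree.ElementTree`; the sample documents and the helper name `is_well_formed` are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the XML parser accepts `text` as well-formed."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# One sample per guideline (all sample documents are hypothetical).
samples = {
    "ok": '<book><title lang="en">XML</title></book>',
    "missing end tag": "<book><title>XML</book>",
    "overlapping elements": "<b><i>text</b></i>",
    "unquoted attribute": "<book id=1></book>",
    "two root elements": "<book></book><book></book>",
    "case mismatch": "<Book></book>",
}

results = {name: is_well_formed(text) for name, text in samples.items()}
for name, ok in results.items():
    print(f"{name}: {'well-formed' if ok else 'rejected'}")
```

Only the first sample is accepted; each of the others violates exactly one guideline.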
Exercise:
<book>
1.6 Combining XML with HTML
<文章>
1.7 XML Namespaces:
- Introduced to resolve the ambiguity that arises when names from different vocabularies collide
<h:table xmlns:h="http://www.w3.org/TR/html4/">
- Defining the default namespace:
<table xmlns="http://www.w3.org/TR/html4/">
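To see that the prefixed form and the default-namespace form name the same element, the two declarations can be parsed and compared. A small sketch with Python's standard `xml.etree.ElementTree`, which reports every tag as `{namespace}local-name`; the child element `tr` is added only for illustration.

```python
import xml.etree.ElementTree as ET

HTML_NS = "http://www.w3.org/TR/html4/"

# Same namespace, once bound to the prefix "h" and once as the default.
prefixed = ET.fromstring(f'<h:table xmlns:h="{HTML_NS}"><h:tr/></h:table>')
defaulted = ET.fromstring(f'<table xmlns="{HTML_NS}"><tr/></table>')
# No namespace at all: a different name, even though it looks the same.
unqualified = ET.fromstring("<table><tr/></table>")

print(prefixed.tag)
print(defaulted.tag)
print(unqualified.tag)
print(prefixed.tag == defaulted.tag, prefixed.tag == unqualified.tag)
```

The prefix itself carries no meaning; only the namespace URI it is bound to matters. Child elements inherit the default namespace, so `tr` is qualified in both of the first two documents.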
1.8 URI format:
1.9 XML Schema:
Because XML is so flexible, a schema is needed to define the rules a document must follow, which makes data exchange easier.
2 RDF
2.1 Def:
Annotates the metadata of web resources so that data can be exchanged in machine-readable form.
The data model of Semantic Technologies and of the Semantic Web
2.2 URI
- To avoid ambiguous names, RDF also uses URIs to identify resources
2.3 QName
2.3.1 Def: used in RDF as shorthand for long URIs (IRIs)
- Either the QName form or the full URI form can be used
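The relation between a QName and the IRI it abbreviates is plain string concatenation. A minimal sketch in Python, assuming a prefix table with `db:` as used in these notes and a hypothetical `ex:` prefix; the helper names `expand_qname` and `shorten_iri` are made up for illustration.

```python
PREFIXES = {
    "db": "http://dbpedia.org/resource/",
    "ex": "http://example.org/terms/",  # hypothetical prefix binding
}


def expand_qname(qname, prefixes):
    """Replace the prefix of `prefix:local` with its namespace IRI."""
    prefix, sep, local = qname.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"unknown prefix in {qname!r}")
    return prefixes[prefix] + local


def shorten_iri(iri, prefixes):
    """Return a QName if some namespace matches, else the IRI in angle brackets."""
    for prefix, namespace in prefixes.items():
        if iri.startswith(namespace):
            return f"{prefix}:{iri[len(namespace):]}"
    return f"<{iri}>"


full = expand_qname("db:Boston", PREFIXES)
print(full)
print(shorten_iri(full, PREFIXES))
print(shorten_iri("http://unknown.example/x", PREFIXES))
```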
2.4 RDF Triple (Statement):
- Note that the subject (S), predicate (P), and object (O) can each be a resource
2.4.1 Resources
Def: IRIs, similar in spirit to namespaces
2.4.2 Literals
Def: something like a value, written outside the angle brackets
- data values;
- encoded as strings;
- interpreted by datatypes;
- treated the same as strings without datatypes, called plain literal;
- A plain literal may have a language tag;
- Datatypes are not defined by RDF, but usually from XML Schema.
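- In short: literals are either typed or untyped. The datatype of a typed literal usually comes from a namespace (typically XML Schema); an untyped literal is called a plain literal. A plain literal may carry a language tag that states its language, or it may carry none, but note that these two kinds of literal are different.
<!-- Typed literals: -->
- Is the literal "德国" equal to "德国"@zh?
- Answer: No. A plain literal without a language tag and the same string with a language tag are different literals.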
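One way to make the answer concrete is to model a literal as a lexical form plus an optional datatype and an optional language tag, and to compare all three parts. A minimal sketch in Python; the class name `Literal` and its field names are made up for illustration and do not come from any RDF library.

```python
from dataclasses import dataclass
from typing import Optional

XSD = "http://www.w3.org/2001/XMLSchema#"


@dataclass(frozen=True)
class Literal:
    lexical: str                    # the string as written
    datatype: Optional[str] = None  # datatype IRI, usually from XML Schema
    lang: Optional[str] = None      # language tag, only for plain literals


plain = Literal("德国")
tagged = Literal("德国", lang="zh")
number = Literal("42", datatype=XSD + "integer")

# Equality compares lexical form, datatype and language tag together.
print(plain == tagged)
print(plain == Literal("德国"))
print(number == Literal("42"))
```

Two literals with the same characters are equal only when their datatype and language tag also agree, so the first and third comparisons are false and the second is true.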
2.4.3 Blank node
Def: an unnamed resource or complex node (covered later); put simply, an empty node in the graph that marks a position whose identity is left unspecified
- Representation of blank nodes is syntax-dependent:
- underscore + colon + ID (Turtle syntax): _:xyz, _:bn
2.5 RDF Syntax
2.5.1 Turtle
- Triples are listed as S P O (easy to read)
- IRIs are written inside angle brackets <>; these are the resources
- Triples end with a full stop .
- Whitespace is ignored
- Written with full IRIs:
<http://dbpedia.org/resource/Massachusets> <http://example.org/terms/captial> <http://dbpedia.org/resource/Boston> .
- Written with QNames:
@prefix db: <http://dbpedia.org/resource/> .
Abbreviation rules:
- Triples with the same subject can be grouped using a semicolon ';'.
- Triples with the same subject and predicate can be grouped using a comma ','.
@prefix db: <http://dbpedia.org/resource/> .
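The two grouping rules can be sketched as a tiny serializer: collect the objects per subject and predicate, then join predicates with ';' and objects with ','. This is an illustration in Python, not a full Turtle writer; the terms `ex:capital`, `ex:borders` and the listed states are hypothetical examples, and the terms are assumed to be valid QNames already.

```python
from collections import defaultdict


def to_turtle(triples):
    """Serialise (s, p, o) QName triples, grouping with ';' and ','."""
    grouped = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        grouped[s][p].append(o)

    blocks = []
    for s, predicates in grouped.items():
        # Same subject and predicate: objects separated by a comma.
        parts = [f"{p} {', '.join(objects)}" for p, objects in predicates.items()]
        # Same subject: predicate groups separated by a semicolon.
        blocks.append(f"{s} " + " ;\n    ".join(parts) + " .")
    return "\n".join(blocks)


triples = [
    ("db:Massachusetts", "ex:capital", "db:Boston"),
    ("db:Massachusetts", "ex:borders", "db:Vermont"),
    ("db:Massachusetts", "ex:borders", "db:Connecticut"),
]
print(to_turtle(triples))
```

The three input triples collapse into one statement with a single subject, two predicates, and two objects for the second predicate.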
2.5.2 RDF/XML:
Def: RDF was originally designed on the basis of XML (the data exchange format of the Web)
- A lot of tools and libraries support XML
- Namespaces are used for disambiguating tags
- Tags belonging to the RDF language come with a fixed namespace, usually abbreviated "rdf"
Put another way: first declare that this part is RDF, together with the namespaces it uses; then describe the content as subject, predicate, object.
- An rdf:Description element contains the description of a resource, which is identified by rdf:about
- ex:publishedBy also shows that resources are commonly used as predicates
<rdf:Description rdf:about="http://semantic-web-book.org/uri"> <!-- subject -->
  <ex:title>Foundations of Semantic Web Technologies</ex:title>
  <!-- predicate and object (literal) -->
  <ex:publishedBy>
    <!-- a second predicate on the same subject -->
    <rdf:Description rdf:about="http://crcpress.com/uri">
      <!-- object of the outer triple and subject of the inner one (nested structure) -->
      <ex:name>CRC Press</ex:name>
      <!-- object (literal) -->
    </rdf:Description>
  </ex:publishedBy>
</rdf:Description>
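Reading the nested description back as triples shows how the inner rdf:Description is the object of one triple and the subject of another. A minimal sketch with Python's standard `xml.etree.ElementTree`; it handles only this nested rdf:Description pattern, not RDF/XML in general, and the namespace bound to `ex:` is an assumption because the notes do not show the enclosing rdf:RDF element.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX_NS = "http://example.org/"  # hypothetical namespace for the ex: prefix

DOC = f"""
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:ex="{EX_NS}">
  <rdf:Description rdf:about="http://semantic-web-book.org/uri">
    <ex:title>Foundations of Semantic Web Technologies</ex:title>
    <ex:publishedBy>
      <rdf:Description rdf:about="http://crcpress.com/uri">
        <ex:name>CRC Press</ex:name>
      </rdf:Description>
    </ex:publishedBy>
  </rdf:Description>
</rdf:RDF>
"""


def iri_of(tag):
    """Turn ElementTree's '{namespace}local' tag into a plain IRI."""
    namespace, _, local = tag[1:].partition("}")
    return namespace + local


def describe(node, triples):
    """Collect triples from one rdf:Description and return its subject IRI."""
    subject = node.get(f"{{{RDF_NS}}}about")
    for prop in node:
        nested = prop.find(f"{{{RDF_NS}}}Description")
        if nested is not None:
            obj = describe(nested, triples)  # object is another resource
        else:
            obj = (prop.text or "").strip()  # object is a literal
        triples.append((subject, iri_of(prop.tag), obj))
    return subject


triples = []
root = ET.fromstring(DOC.strip())
for description in root.findall(f"{{{RDF_NS}}}Description"):
    describe(description, triples)

for triple in triples:
    print(triple)
```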
2.6 Representing n-ary relations in RDF
- Use an intermediate node
- Use a blank node
<rdf:Description rdf:about="http://example.org/Chutney">
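The intermediate-node idea can be sketched as follows: one triple links the subject to a blank node, and the blank node then carries one triple per participant of the relation. A minimal illustration in Python; the property names `ex:hasIngredient`, `ex:ingredient`, `ex:amount`, the values, and the helper `nary` are hypothetical, since the rest of the example is not shown in these notes.

```python
def nary(subject, link, node, roles):
    """Attach several role/value pairs to `subject` through one intermediate node."""
    triples = [(subject, link, node)]
    triples += [(node, role, value) for role, value in roles.items()]
    return triples


triples = nary(
    "ex:Chutney",
    "ex:hasIngredient",
    "_:b1",  # blank node standing for "this ingredient in this amount"
    {"ex:ingredient": "ex:greenMango", "ex:amount": '"1 lb"'},
)
for s, p, o in triples:
    print(s, p, o, ".")
```

A single triple could not say both which ingredient and how much; the blank node groups the two values without needing a name of its own.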
2.7 RDF vs XML
- IRIs solve the problem of term meaning (no more name clashes).
- The triple-based data model describes relations or properties among terms.
Triples are good and easy to use, but cannot cover all kinds of knowledge! Semantic Web Knowledge Graph
2.8 Exercise
@prefix sw: <http://www.semanticweb.org/ontology-9/> .
3 RDFS
3.1 Def
- Provides a vocabulary for RDF data and helps define an RDF schema
- Allows for specifying schema knowledge, e.g.:
  - Mothers are female
  - Only persons write books
- Is a part of the W3C Recommendation
- RDFS defines a set of abstract class terms for RDF so that RDF is used consistently
- Why not use XML Schema?
  - Because XML Schema has no semantics
  - Because the things it refers to cannot reach beyond the document
3.2 RDFS: Class and Instance
Given a triple:
ex:SemanticWeb rdf:type ex:Textbook .
Instance and class names cannot be distinguished syntactically with IRIs.
However, RDF alone cannot state explicitly that this is a class relationship.
RDFS helps explicitly state that a resource denotes a class:
- rdfs:Class is the "class of all classes".
3.3 RDFS: Class Hierarchy (Taxonomy)
3.3.1 rdfs:subClassOf is also reflexive:
- ex:Textbook rdfs:subClassOf ex:Textbook .
- ex:Book rdfs:subClassOf ex:Book .
3.3.2 rdfs:subClassOf can derive class equivalence:
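Both properties can be sketched with a small closure computation: every class is a subclass of itself, subclass links chain, and two classes that are subclasses of each other have the same members. A minimal illustration in Python; `ex:Publication`, `ex:Car` and `ex:Automobile` are hypothetical class names, and the naive loop is meant for reading, not for large graphs.

```python
def subclass_closure(pairs):
    """Reflexive-transitive closure of a set of (sub, super) pairs."""
    classes = {c for pair in pairs for c in pair}
    closure = set(pairs) | {(c, c) for c in classes}  # reflexivity
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:  # transitivity
                    closure.add((a, d))
                    changed = True
    return closure


def equivalent_classes(closure):
    """Pairs of distinct classes that are subclasses of each other."""
    return {(a, b) for a, b in closure if a != b and (b, a) in closure}


asserted = {
    ("ex:Textbook", "ex:Book"),
    ("ex:Book", "ex:Publication"),
    ("ex:Car", "ex:Automobile"),
    ("ex:Automobile", "ex:Car"),
}
closure = subclass_closure(asserted)
print(("ex:Textbook", "ex:Textbook") in closure)
print(("ex:Textbook", "ex:Publication") in closure)
print(sorted(equivalent_classes(closure)))
```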
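3.4 RDFS abbreviations
- Roughly: an rdfs:Class element can stand in for both rdf:Description and rdf:type, because the element name itself already states that the resource being described is a class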
3.5 RDFS: Property and Property Hierarchy
- Supports simple reasoning
3.6 RDFS: Property Restrictions
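- A predicate has a domain and a range: the class its subjects belong to and the class its objects belong to
- The ranges of a predicate can be combined by intersection and union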
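Domain and range statements let a reasoner assign types to the subject and object of any triple that uses the predicate. A minimal sketch in Python, assuming the hypothetical names `ex:writes`, `ex:Person`, `ex:Book`, `ex:Publication`, `ex:pascal` and `ex:SemanticWeb`; it covers only the RDFS reading in which every declared range class applies to the object at once.

```python
def infer_types(triples, domain, range_):
    """Derive rdf:type triples from rdfs:domain and rdfs:range declarations."""
    inferred = set()
    for s, p, o in triples:
        for cls in domain.get(p, ()):
            inferred.add((s, "rdf:type", cls))  # subject falls in the domain
        for cls in range_.get(p, ()):
            inferred.add((o, "rdf:type", cls))  # object falls in every range
    return inferred


domain = {"ex:writes": ["ex:Person"]}
range_ = {"ex:writes": ["ex:Book", "ex:Publication"]}
facts = [("ex:pascal", "ex:writes", "ex:SemanticWeb")]

for triple in sorted(infer_types(facts, domain, range_)):
    print(triple)
```

This is how a statement such as "only persons write books" acts in RDFS: it does not reject data, it adds the conclusion that whoever writes something is a person.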
3.7 RDFS: Reification
- A blank node is used to represent a complex relationship
- Represent the following sentence graphically by means of the blank node:
Wikipedia said that Tolkien wrote Lord of the Rings.
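As a rough guide to what the graph contains, the sentence can be written out with the RDF reification vocabulary (rdf:Statement, rdf:subject, rdf:predicate, rdf:object): a blank node stands for the statement, and Wikipedia is linked to that node. A sketch in Python; the names `ex:Wikipedia`, `ex:said`, `ex:Tolkien`, `ex:wrote` and `ex:LordOfTheRings` are assumptions chosen for illustration.

```python
def reify(statement_node, triple):
    """Describe `triple` as a resource using the RDF reification vocabulary."""
    s, p, o = triple
    return [
        (statement_node, "rdf:type", "rdf:Statement"),
        (statement_node, "rdf:subject", s),
        (statement_node, "rdf:predicate", p),
        (statement_node, "rdf:object", o),
    ]


# The blank node _:stmt stands for "Tolkien wrote Lord of the Rings".
triples = reify("_:stmt", ("ex:Tolkien", "ex:wrote", "ex:LordOfTheRings"))
# Wikipedia talks about the statement, it does not assert it directly.
triples.append(("ex:Wikipedia", "ex:said", "_:stmt"))

for s, p, o in triples:
    print(s, p, o, ".")
```

Note that the triple ex:Tolkien ex:wrote ex:LordOfTheRings itself is not part of the result; only the description of it is.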
3.8 Example: Reasoning with RDFS
- Given:
ex:happilyMarriedWith rdfs:subPropertyOf ex:isMarriedTo .
- It can be inferred that:
ex:pascal ex:isMarriedTo ex:lisa .
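The inference step behind this example is the subproperty rule: whatever holds for a property also holds for its super-properties. A minimal sketch in Python; the fact `ex:pascal ex:happilyMarriedWith ex:lisa` is assumed here as the starting point, since the full list of given triples is not shown in these notes.

```python
def apply_subproperty(triples, sub_property_of):
    """Return the new triples entailed by rdfs:subPropertyOf declarations."""
    known = set(triples)
    changed = True
    while changed:  # repeat so that chains of subproperties are followed
        changed = False
        for s, p, o in list(known):
            for parent in sub_property_of.get(p, ()):
                if (s, parent, o) not in known:
                    known.add((s, parent, o))
                    changed = True
    return known - set(triples)


sub_property_of = {"ex:happilyMarriedWith": ["ex:isMarriedTo"]}
facts = {("ex:pascal", "ex:happilyMarriedWith", "ex:lisa")}  # assumed fact

print(apply_subproperty(facts, sub_property_of))
```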
Exercise:
What can be inferred from the following triples using RDFS semantics?
ex:Postgraduate_student rdfs:subClassOf ex:Student
ex:John rdf:type ex:Professor