These pages are deprecated; please go to the pages of our new team Edelweiss.

Semantic web tutorial: RDF, RDFS and SPARQL using CORESE

Updated 12/12/2006

For remarks or questions on this tutorial contact Fabien Gandon.

This semantic web tutorial gives a quick tour of RDF, RDFS, SPARQL and Rules. It was designed as a hands-on-keyboard introduction to the basics of the RDF model, RDFS semantics for lightweight ontologies, the SPARQL query language for RDF graph bases and production rules for knowledge factorisation in semantic web annotation bases. This tutorial uses four files:
A previous version of this tutorial (Corese V2.1) is available.

Rapid reminder of the basics

RDF

RDF is a triple model where every assertion is decomposed into three parts: (subject, predicate, object), for instance (tutorial.php, author, "Fabien"). The subject is a URI identifying a resource. The predicate is a binary relation identified by a URI. The object is either a URI identifying a resource or a literal value. Each triple can be seen as a labelled arc; joining these arcs yields a graph that describes URI-identified resources and their relations. The serialization of RDF in its XML syntax is not unique, i.e. the same RDF graph may be represented in different XML forms. For instance, the following examples are equivalent:

example 1:
<rdf:Description rdf:about="http://www-sop.inria.fr/acacia/soft/corese/tutorial.php">
  <author>
    <rdf:Description rdf:about="urn://inria.fr/~fgandon">
      <firstname>Fabien</firstname>
    </rdf:Description>
  </author>
  <subject>Web</subject>
</rdf:Description>
example 2:
<rdf:Description rdf:about="http://www-sop.inria.fr/acacia/soft/corese/tutorial.php">
  <author rdf:resource="urn://inria.fr/~fgandon" />
  <subject>Web</subject>
</rdf:Description>
<rdf:Description rdf:about="urn://inria.fr/~fgandon">
  <firstname>Fabien</firstname>
</rdf:Description>

example 3:
<rdf:Description rdf:about="http://www-sop.inria.fr/acacia/soft/corese/tutorial.php" subject="Web">
  <author rdf:resource="urn://inria.fr/~fgandon" />
</rdf:Description>
<rdf:Description rdf:about="urn://inria.fr/~fgandon" firstname="Fabien" />

In these descriptions some nodes may not have a URI; they are called blank nodes. In the following example the author is a blank node.
<rdf:Description rdf:about="http://www-sop.inria.fr/acacia/soft/corese/tutorial.php">
  <author>
    <rdf:Description>
      <firstname>Fabien</firstname>
    </rdf:Description>
  </author>
</rdf:Description>
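
The triple model described above can be sketched in a few lines of Java. This is only an illustration, not how Corese stores graphs: the class `TripleDemo`, its methods and the `_:b0` label for the blank node are all assumptions made for the example, and literals are simply written with surrounding quotes to tell them apart from URIs.

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal in-memory graph: a list of (subject, predicate, object) triples. */
class TripleDemo {

    /** One assertion. Blank nodes are written "_:label" by convention in this sketch. */
    record Triple(String subject, String predicate, String object) {}

    /** Returns the objects of every triple matching the given subject and predicate. */
    static List<String> objectsOf(List<Triple> graph, String subject, String predicate) {
        List<String> result = new ArrayList<>();
        for (Triple t : graph) {
            if (t.subject().equals(subject) && t.predicate().equals(predicate)) {
                result.add(t.object());
            }
        }
        return result;
    }

    /** The blank-node example: tutorial.php --author--> _:b0 --firstname--> "Fabien". */
    static List<Triple> exampleGraph() {
        String page = "http://www-sop.inria.fr/acacia/soft/corese/tutorial.php";
        return List.of(
            new Triple(page, "author", "_:b0"),
            new Triple("_:b0", "firstname", "\"Fabien\""));
    }

    public static void main(String[] args) {
        List<Triple> graph = exampleGraph();
        String page = "http://www-sop.inria.fr/acacia/soft/corese/tutorial.php";
        // Follow the two arcs: page -> author node -> first name.
        for (String author : objectsOf(graph, page, "author")) {
            System.out.println(author + " firstname " + objectsOf(graph, author, "firstname"));
        }
    }
}
```

Whatever XML form is chosen for the serialization, the list of triples obtained is the same; only the blank node label is arbitrary.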
XML Schema datatypes may be applied to literal values; here is an example with dates:
<rdf:Description rdf:about="urn://inria.fr/~fgandon">
  <birthdate rdf:datatype="http://www.w3.org/2001/XMLSchema#date">1975-07-31</birthdate>
</rdf:Description>

RDFS

RDFS is a set of primitives to describe lightweight ontologies in RDF (it uses the RDF model and syntax) and for RDF (the ontologies are used to type resources and relations). RDFS allows us:
Here is an example declaring a class #Man as a subclass of #Person and #Male, and a property #hasMother as a subproperty of #hasParent that is used between instances of the class #Human (its domain) and instances of the class #Female (its range).
<rdf:RDF xml:base="http://www.inria.fr/2006/12/05/humans.rdfs"
  xmlns:rdf ="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns="http://www.w3.org/2000/01/rdf-schema#">
  <Class rdf:ID="Man">
    <subClassOf rdf:resource="#Person"/>
    <subClassOf rdf:resource="#Male"/>
    <label xml:lang="en">man</label>
    <comment xml:lang="en">an adult male person</comment>
  </Class>
  <rdf:Property rdf:ID="hasMother">
    <subPropertyOf rdf:resource="#hasParent"/>
    <range rdf:resource="#Female"/>
    <domain rdf:resource="#Human"/>
    <label xml:lang="en">has for mother</label>
    <comment xml:lang="en">to have for parent a female.</comment>
  </rdf:Property>
</rdf:RDF>
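
To make the meaning of these declarations concrete, here is a minimal Java sketch of four RDFS entailment rules (subClassOf, subPropertyOf, domain and range) applied until nothing new can be derived. It is not the Corese implementation: the class `RdfsSketch`, the shortened names and the two instance facts (a resource Lucas typed as Man whose mother is Laura) are assumptions chosen for illustration.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class RdfsSketch {
    record Triple(String s, String p, String o) {}

    static final String TYPE = "rdf:type";
    static final String SUBCLASS = "rdfs:subClassOf";
    static final String SUBPROP = "rdfs:subPropertyOf";
    static final String DOMAIN = "rdfs:domain";
    static final String RANGE = "rdfs:range";

    /** Applies the four rules repeatedly until no new triple appears. */
    static Set<Triple> closure(List<Triple> input) {
        Set<Triple> graph = new LinkedHashSet<>(input);
        boolean changed = true;
        while (changed) {
            Set<Triple> inferred = new LinkedHashSet<>();
            for (Triple a : graph) {
                for (Triple b : graph) {
                    // x type C, C subClassOf D  =>  x type D
                    if (a.p().equals(TYPE) && b.p().equals(SUBCLASS) && a.o().equals(b.s()))
                        inferred.add(new Triple(a.s(), TYPE, b.o()));
                    // x p y, p subPropertyOf q  =>  x q y
                    if (b.p().equals(SUBPROP) && a.p().equals(b.s()))
                        inferred.add(new Triple(a.s(), b.o(), a.o()));
                    // x p y, p domain C  =>  x type C
                    if (b.p().equals(DOMAIN) && a.p().equals(b.s()))
                        inferred.add(new Triple(a.s(), TYPE, b.o()));
                    // x p y, p range C  =>  y type C
                    if (b.p().equals(RANGE) && a.p().equals(b.s()))
                        inferred.add(new Triple(a.o(), TYPE, b.o()));
                }
            }
            changed = graph.addAll(inferred);
        }
        return graph;
    }

    /** The schema above plus two illustrative instance facts. */
    static List<Triple> example() {
        return List.of(
            new Triple("Man", SUBCLASS, "Person"),
            new Triple("Man", SUBCLASS, "Male"),
            new Triple("hasMother", SUBPROP, "hasParent"),
            new Triple("hasMother", DOMAIN, "Human"),
            new Triple("hasMother", RANGE, "Female"),
            new Triple("Lucas", TYPE, "Man"),
            new Triple("Lucas", "hasMother", "Laura"));
    }

    public static void main(String[] args) {
        List<Triple> input = example();
        // Print only the triples that were derived, not the asserted ones.
        for (Triple t : closure(input)) {
            if (!input.contains(t)) System.out.println(t.s() + " " + t.p() + " " + t.o());
        }
    }
}
```

The nested loop is quadratic and only suitable for a handful of triples; it is meant to show what is derived, not how an engine would derive it efficiently.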
Then using this ontology (with the namespace http://www.inria.fr/2006/12/05/humans.rdfs) one could declare that #Lucas is a #Man and that his mother is #Laura. Here are three examples of possible serializations:
<rdf:RDF xmlns:rdf ="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns="http://www.inria.fr/2006/12/05/humans.rdfs#"
  xml:base="http://www.inria.fr/2006/12/05/humans.rdfs-instances" >
  <rdf:Description rdf:ID="Lucas">
    <rdf:type rdf:resource="http://www.inria.fr/2006/12/05/humans.rdfs#Man"/>
    <hasMother rdf:resource="#Laura"/>
  </rdf:Description>
</rdf:RDF>
or
<rdf:Description rdf:ID="Lucas">
  <hasMother rdf:resource="#Laura"/>
</rdf:Description>
<Man rdf:about="#Lucas" />
or
<Man rdf:ID="Lucas">
  <hasMother rdf:resource="#Laura"/>
</Man>

The semantics of RDFS would then allow us to infer that #Lucas is a #Person and a #Male (subClassOf), that #Lucas is a #Human (domain of #hasMother), that #Laura is a #Female (range of #hasMother), and also that (#Lucas, #hasParent, #Laura).

SPARQL

Finally, SPARQL is a query language for RDF, i.e. a language to query triple stores. In its most usual form it uses the SELECT and WHERE clauses. The WHERE clause uses a triple syntax with question marks to prefix variables, e.g. to retrieve instances of persons one could use the pattern ?x rdf:type humans:Person
The SPARQL output is a graph serialized in RDF/XML or a set of bindings serialized in XML. Here is an example of a SPARQL query. It retrieves the students of a university whose web site is http://www.mit.edu or http://www.stanford.edu. It requests their name and, optionally, their first name if available. These students must be older than 21. The results are sorted by name and only the first twenty are requested.
PREFIX tutor: <http://inria.fr/2006/tutorial.rdfs#>
SELECT ?student ?name ?firstname
WHERE {
  ?student tutor:isStudentOf ?univ .
  {
    {
      ?univ tutor:siteweb "http://www.mit.edu" .
    }
    UNION
    {
      ?univ tutor:siteweb "http://www.stanford.edu" .
    }
  }
  ?student tutor:name ?name .
  OPTIONAL { ?student tutor:firstname ?firstname . }
  ?student tutor:age ?age .
  FILTER ( ?age > 21 )
}
ORDER BY ?name
LIMIT 20
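
To see what each clause of this query contributes, the same selection can be mimicked over plain Java objects. This is a rough analogy rather than a SPARQL evaluator: the `Student` record, the sample data and the assumption that each student has exactly one university, one name and one age are all made up for the illustration.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.stream.Collectors;

class StudentQuerySketch {
    /** Hypothetical flattened view of one student; the first name may be missing. */
    record Student(String name, Optional<String> firstname, int age, String univSite) {}

    static List<String> run(List<Student> students) {
        Set<String> sites = Set.of("http://www.mit.edu", "http://www.stanford.edu");
        return students.stream()
            .filter(s -> sites.contains(s.univSite()))    // UNION of the two site patterns
            .filter(s -> s.age() > 21)                     // FILTER ( ?age > 21 )
            .sorted(Comparator.comparing(Student::name))   // ORDER BY ?name
            .limit(20)                                     // LIMIT 20
            // OPTIONAL: the row is kept even when ?firstname is unbound
            .map(s -> s.name() + " " + s.firstname().orElse("(unbound)"))
            .collect(Collectors.toList());
    }

    static List<Student> sample() {
        return List.of(
            new Student("Smith", Optional.of("Ann"), 25, "http://www.mit.edu"),
            new Student("Adams", Optional.empty(), 30, "http://www.stanford.edu"),
            new Student("Young", Optional.of("Bob"), 19, "http://www.mit.edu"),   // too young
            new Student("Baker", Optional.of("Cy"), 40, "http://www.inria.fr"));  // other site
    }

    public static void main(String[] args) {
        run(sample()).forEach(System.out::println);
        // prints:
        // Adams (unbound)
        // Smith Ann
    }
}
```

Note how the student without a first name still appears in the result: that is the difference between an OPTIONAL pattern and a mandatory one.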
When used in the SELECT clause, the keyword DISTINCT prevents redundancy in the results. In the FILTER we can use comparators (<, >, =, <=, >=, !=), tests (isURI(), isBLANK(), isLITERAL(), BOUND()), regular expressions, characteristics of values (LANG(), DATATYPE(), STR()), casting operators, etc.

Ontologies in RDFS/XML

Use your favourite text editor to open the file human.rdfs containing a small ontology coded in RDFS and using the XML syntax of RDF.
Question 1: looking at the top of the file, can you tell what namespace is associated with this ontology? What mechanism is used to declare the namespace?
Question 2: have a look at the XML structure of the file and find the different uses of the XML tagging system (open and close tags, standalone tags).
Question 3: study the use of the following tags Class, Property, label, comment, range, domain, subClassOf, subPropertyOf and attributes ID, resource. In which namespaces are they defined?
Question 4: according to the signatures (range and domain) of the properties age and hasBrother, what are the types of the resources that can be linked by them?
Question 5: look at the beginning of the file and draw the subgraph of the hierarchy containing the classes Animal, Man and Woman.

Annotations in RDF/XML

Use your favourite text editor to open the file human.rdf containing a small set of annotations describing people and using the previous ontology.
Question 1: what is the namespace associated with the instances created by the annotations of this file?
Question 2: what is the namespace of the ontology used by these annotations and how is it associated with the tags of this XML file?
Question 3: looking at the file, what do we know about John?
Question 4: propose as many ways as possible to declare that a person (e.g. "Stephen") exists.

Queries using SPARQL

In this part we use the standalone version of the search engine Corese distributed as one executable ".jar" file.
To run this file you need Java 1.4.2 or above on your machine. Depending on the configuration of your operating system, double-clicking on the file might be enough to start the simplified interface. Otherwise open a shell window, move to the directory where the .jar file is and use the command java -jar <NAME OF THE FILE.jar> to start the application. You should obtain one window with two tabs. This is a simplified interface used for testing, debugging and teaching. The "Loading and messages" tab allows you to load ontologies in RDFS/XML, annotations in RDF/XML and rules in an XML/SPARQL-like format, and to check errors while loading these files. The "Queries and bindings" tab allows you to write SPARQL queries and visualize the results.

Using the first tab, load the ontology humans.rdfs and then load the annotations human.rdf. Normally you shouldn't see any error message and you can move to the "Output" tab and start writing queries.

Question 1: what does the following query retrieve? Look at the first answers.
SELECT ?x ?t
WHERE
{
  ?x rdf:type ?t
}
Question 2: adapt the previous query to retrieve all the classes.
Question 3: write a new query to extract all the subsumption links between classes (subClassOf).
Question 4: what does the following query retrieve? Translate this query into plain English.
PREFIX humans: <http://www.inria.fr/2006/12/05/humans.rdfs#>
SELECT *
WHERE
{
  ?x humans:hasSpouse ?y
}
Question 5: constrain the previous query to build two queries to retrieve only males and their spouses.
Question 6: what does the following query retrieve? Translate this query into plain English.
PREFIX humans: <http://www.inria.fr/2006/12/05/humans.rdfs#>
SELECT ?x ?y group ?y count ?x
WHERE
{
  ?x humans:hasFriend ?y
}
Question 7: retrieve resources of which we know at least one parent.
Question 8: retrieve persons with their age if it is known.
Question 9: identify adults in the base (hint: in France you are an adult when you are over 18).
Question 10: modifying the query in question 9, ask if Mark is an adult. Mark's URI is http://www.inria.fr/2006/12/05/humans.rdfs-instances#Mark
Question 11: look for all the Lecturers and request their types. How come they have several types?
Question 12: retrieve all the instances that are both Male and Person. Explain results such as Jack and Pierre. Why were they retrieved?
Question 13: retrieve all the instances of Lecturers or Researchers.
Question 14: query the base to get all the researchers and then all the non-researchers.
Question 15: ask for instances of the relation hasAncestor. How come you have results while this property is not used in the annotations (check the content of the annotations in human.rdf)?
Question 16: find the different meanings of the word "size" (hint: it is a label).
Question 17: find synonyms of the word "person".
Question 18: ask for the translation of "shoe size" in French.
Question 19: ask everything about Laura and use the ontology to get the English labels of the properties.
Question 20: use the describe clause to obtain a description of Laura.
Question 21: construct all the triples asserting Man instances using known men and known male persons.

Running rules

Use your favourite editor to open the file human.rul. For better results, reuse the initial version of human.rdf, i.e. as it was before the modifications of the previous sections. The principle of these rules is very simple: for each answer found for the query in the condition part (cos:if), the conclusion part (cos:then) is asserted.
<cos:rule>
  <cos:if>
    ... a condition ...
  </cos:if>
  <cos:then>
    ... a conclusion ...
  </cos:then>
</cos:rule>
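
One way to read such a rule: every match of the condition yields one new assertion built from the conclusion. The Java sketch below illustrates this with a deliberately tiny model, where a condition is a single predicate name and a conclusion is a function of the matched triple. The class `RuleSketch`, the predicate `hasSibling` and the choice to repeat until nothing new is added are assumptions for the example, not a description of the Corese rule engine.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

class RuleSketch {
    record Triple(String s, String p, String o) {}

    /** Condition: a triple with this predicate. Conclusion: a triple built from the match. */
    record Rule(String ifPredicate, Function<Triple, Triple> then) {}

    /** Applies every rule to every fact, adding conclusions until the base stops growing. */
    static Set<Triple> run(List<Triple> facts, List<Rule> rules) {
        Set<Triple> base = new LinkedHashSet<>(facts);
        boolean changed = true;
        while (changed) {
            Set<Triple> conclusions = new LinkedHashSet<>();
            for (Rule r : rules) {
                for (Triple t : base) {
                    if (t.p().equals(r.ifPredicate())) {
                        conclusions.add(r.then().apply(t));
                    }
                }
            }
            changed = base.addAll(conclusions);
        }
        return base;
    }

    /** Hypothetical rule: if ?x hasSibling ?y then ?y hasSibling ?x. */
    static Rule symmetricSibling() {
        return new Rule("hasSibling", t -> new Triple(t.o(), "hasSibling", t.s()));
    }

    public static void main(String[] args) {
        List<Triple> facts = List.of(new Triple("Ann", "hasSibling", "Bob"));
        for (Triple t : run(facts, List.of(symmetricSibling()))) {
            System.out.println(t.s() + " " + t.p() + " " + t.o());
        }
    }
}
```

Because the base is a set, a conclusion that is already known adds nothing, which is what makes the loop stop even for a symmetric rule that could otherwise fire forever.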
Question 1: what does the unique rule contained in human.rul do?
Question 2: before running the rules, write and run a query to retrieve the instances of Man and check the result.
Question 3: adapting the previous rule, add a new rule to define the class Woman. Check the result before and after applying the rule.
Question 4: propose a rule defining the symmetry of the property hasSpouse and check the result with a query before and after running the rule.
Question 5: can you think of a way to declare that hasChild and hasParent are inverse properties? Check the result.
Question 6: propose a rule defining the transitivity of the property hasAncestor and check the result with a query before and after running the rule.
Question 7: propose a rule defining the property hasFather from the property hasParent and check the result with a query before and after running the rule.
Question 8: declare a new class Adult in the ontology and define it with a rule (hint: in France you are an adult when you are over 18). Check the result with a query.

API to query Corese

Create an instance of the Corese search engine

Without arguments
Corese corese = new Corese();
To load an ontology, use corese.RDFSLoad(filename); where filename is the path of the RDFS file you want to load.

With arguments
Corese corese = new Corese("corese.properties", path);
The first argument, "corese.properties", is the name of the Corese configuration file; it should be located in the same place as the other data (rdf files, rdfs files and rul files). More information about the configuration file is available further on in this document.
Example of path with Windows: String path = "D:/toto/corese/data";
Example of path with Linux: String path = "/0/user/toto/corese/data";

Execute the query

It is possible to execute the query and get syntactic or semantic errors either as exceptions or as messages:

Query with errors as exceptions
try {
  RDFResult res = corese.querySPARQLWithEx(query);
} catch (JavaccParseException e) {
  System.out.println(e.getMessage());
} catch (TokenMgrError e) {
  System.out.println(e.getMessage());
} catch (CompileException e) {
  System.out.println(e.getMessage());
}
In this example, query is a string which represents the query asked to Corese; corese is an instance of Corese.

Query with errors as messages
RDFResult res = corese.querySPARQL(query);
Here again, query is the query string and corese is an instance of Corese. Errors are then reported as messages of the following form:
<cos:error><![CDATA[Undefined Property : http://example/of/ontology#propertyNotDefined]]></cos:error>

Validate the query
It is possible to validate the query before executing it.
try {
  boolean validated = corese.validateSPARQLWithEx(query);
  System.out.println("...query validated");
} catch (JavaccParseException e) {
  System.out.println(e.getMessage());
} catch (TokenMgrError e) {
  System.out.println(e.getMessage());
} catch (CompileException e) {
  System.out.println(e.getMessage());
}
How to get the query validated
In this example, query is a string which represents the query asked to Corese; corese is an instance of Corese. Here are the exceptions/errors that one can get:
Handle Results

With a query result, the most common and practical thing to do is simply to print it. Results are printed as requested by the DISPLAY statement (in XML by default, except for CONSTRUCT and DESCRIBE, where results are printed in RDF).
RDFResult res = corese.query(queryString);
System.out.println(res);
But it is also possible to access the details and handle results in another way.
RDFResult res = corese.querySPARQL(queryString);
// Names of the variables listed in the SELECT clause.
String[] var = res.getSelectVar();
// Each element of the enumeration is one result (requires java.util.Enumeration).
for (Enumeration en = res.getValues(); en.hasMoreElements();) {
  CoreseGraph cg = (CoreseGraph) en.nextElement();
  for (String varidx : var) {
    // Values bound to this variable in the current result.
    CoreseConcept[] value = cg.getValue(varidx);
    if (value.length != 0)
      System.out.println(varidx + " = " + value[0].getLabel());
    else
      System.out.println(varidx + " = No Value");
  }
}
Note: value is an array of CoreseConcept, because it is possible to have several CoreseConcept in a result when results are grouped. The method getLabel() returns the label of the concept; it can be a URI, a number, a String... To know if there are results, it is possible to use the function getSuccess()
RDFResult res = corese.query(...);
if (res.getSuccess()) {
  // do something because there is at least one result
}
Corese configuration file: corese.properties

Corese can be configured with the file corese.properties. Some of the main properties are explained here:
Note: Corese can run without corese.properties; properties then take their default values.