From Knoesis wiki
Revision as of 03:26, 15 February 2016 by Nishita (Talk | contribs) (Important Queries)

KnowledgeWiki: a semantic platform for creating, integrating, and curating knowledge graphs. Federated Semantic Services Platform for Open Materials Science and Engineering: building a semantic infrastructure to create domain models for the materials science domain, provide semantic search over multi-model data sources, and enable the exchange of materials science data via Linked Data.


Graduate Students: Nishita Jaykumar, PavanKalyan Yallamelli, Vinh Nguyen, Sarasi Lalithsena
External Collaborators: Clare Paul AFRL/RX PI
Faculty: Amit P. Sheth (Advisor), Krishnaprasad Thirunarayan
Past Members: Kalpa Gunaratna, Swapnil Soni, Siva Kumar Cheekula and Mary Panahiazar


The White House’s Materials Genome Initiative (MGI) seeks to substantially improve the process of new material discovery and development, and to shorten the time to deployment. Two of the core components of this initiative - new and sophisticated computer modeling technologies and next-generation experimental tools - received initial federal research support through 2012. The third major component is developing solutions for broader access to scientific data about materials, to help achieve the goal of faster development of new materials at lower cost.

Our approach recognizes the need for easy access to large amounts of highly distributed and heterogeneous data – including unstructured data (scientific literature or publications), semi-structured data, and structured data. We recognize the need to support a variety of data as well as resources that provide data through APIs and Web services. We recognize the need for tools that can easily exchange data. We also recognize the need for integrated provenance (i.e., data lineage) to support data quality and relevance, and for access control so that organizations can share information when desired and yet keep valuable intellectual property confidential. To address these requirements, we will use recent advances in the Semantic Web (standards, search and query processing techniques and tools, Web of Data or Linked Open Data) and semantic services computing, along with integral support for provenance and access control. In a complementary effort during the first year, the development of domain models and knowledge bases (ontologies, taxonomies, and vocabularies) will be carried out with support from AFRL’s Materials and Manufacturing Directorate.

This three-year project will undertake three broad classes of tasks. The first relates to creating a semantic infrastructure, including the ability to create semantic metadata for a variety of data types using domain models and knowledge bases. The second relates to semantic search over all varieties of data, including resources with service-based access. The third relates to developing a novel semantic data exchange scheme for materials science (termed Linked Open Materials Data) based on an open data approach.


We identified Semantic MediaWiki as the ideal tool for the task at hand. We determined that 11 templates and 28 properties are necessary for capturing the data in materials science.

This section covers the overall architecture of the system, the development of a new extension based on singleton templates, the representation of provenance metadata for triples, and an algorithm for identifying the singleton templates for any given RDF dataset.

Overall Architecture

This section explains how data is collected via the existing semantic forms, how the singleton property template is integrated into SMW, and how the new data representation works: representing each entity, creating triples, and representing each entity as a wiki page.

The system works in three phases: data collection, data representation, and data management. Together they collect the data, represent it using singleton templates, and create the triples and pages in the wiki.

Singleton Template Extension

In MediaWiki, templates are the simplest way to include reusable markup. The singleton template extension was implemented to represent the provenance information of RDF triples. It was incorporated into the existing extension.
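To make the idea of attaching provenance to a triple more concrete, here is a minimal Python sketch of the singleton property pattern that the queries on this page rely on (rdf:singletonPropertyOf plus meta-properties). It is not the extension's code: the function name, the `#`-plus-UUID naming of the singleton property, and the example identifiers are assumptions made for illustration only.

```python
import uuid

# Prefixed name as it appears in the queries on this page.
SINGLETON_PROPERTY_OF = "rdf:singletonPropertyOf"


def to_singleton_triples(subject, prop, obj, provenance):
    """Rewrite one triple so that provenance can be attached to it.

    `provenance` maps meta-property names to values (hypothetical input shape).
    Returns a list of (subject, predicate, object) tuples.
    """
    # Mint a property that is unique to this single statement (assumed naming).
    singleton = f"{prop}#{uuid.uuid4()}"
    triples = [
        (subject, singleton, obj),                 # the statement itself
        (singleton, SINGLETON_PROPERTY_OF, prop),  # link back to the generic property
    ]
    # Provenance is asserted about the singleton property, not about the subject.
    for meta_prop, value in provenance.items():
        triples.append((singleton, meta_prop, value))
    return triples


# Made-up identifiers, for illustration only.
for triple in to_singleton_triples(
    "ex:EntityA", "ex:relatedTo", "ex:EntityB",
    {"dcterms:source": "ex:SomeSource", "dcterms:creator": "ex:SomeCurator"},
):
    print(triple)
```

Because each statement gets its own property, any number of provenance values can be attached without changing the shape of the original triple.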

Data Model

In this section we discuss the details of the data model and the vocabulary modelling. The following tables list the templates and the properties.


No: Template Name Resource Link
1 Definition Text
2 Definitions on Other Websites
3 Name, Abbreviations, Symbols, Synonyms, and Units
4 Image
5 Video
6 Sound
7 Equation
8 Code Snippet
9 Source Code
10 Related Information



No: Property Name Resource Link
1 skos:definition
2 dcterms:source
3 mv:sourceType
4 mv:sourceURL
5 dcterms:license
6 dcterms:creator
7 mv:sourceType
8 rdfs:isDefinedBy
9 rdfs:comment
10 rdfs:label
11 vaem:abbreviation
12 mv:symbol
13 qudt:unit
14 schema:image
15 schema:video
16 mo:recording_of
17 xhv:math
18 mv:codeSnippet
19 schema:programmingLanguage
20 rdfs:comment
21 schema:programmingLanguage
22 rdfs:seeAlso
23 dcterms:references
24 dcterms:bibliographicCitation
25 dcterms:identifier
26 mv:synonym

Property Template Approach

  • This section describes the steps taken to automatically create wiki pages for the entities in the YAGO dataset:
  1. Identify the list of regular properties.
  2. Identify the list of generic properties.
  3. Create one page per property:
    1. For each property, count the datatypes of its values (using a GROUP BY query).
    2. If it has only one datatype, map that datatype to the wikidata datatype (create the "has type : type" annotation).
    3. Otherwise, create an empty property.
    4. If the object is a URI, the datatype is Page.
  4. Create the list of regular templates:
    1. The name of each template is taken from the name of the property.
    2. Generate the regular template tag.
  5. Create the list of singleton templates:
    1. The name of each template is taken from the generic property.
    2. Add a UUID property for each existing value of the singleton property.
    3. Generate the meta-template tag/code for each template.
  6. For properties created from YAGO, add another property that captures the data of the original property (a link from the new property to the original property in the YAGO dataset).
  7. Configure Virtuoso.
  8. Analyze the statistics of the concepts.
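Step 3 above (choosing a datatype for each property page) can be sketched in Python as follows. This is only an illustration under stated assumptions: the input shape (triples tagged with the kind of their object), the function name, and the XSD-to-wiki mapping table are all hypothetical, since the page does not list the mapping actually used. The sketch also reads the rules as "a property whose objects are all URIs gets type Page", which is one possible interpretation of the sub-steps.

```python
from collections import defaultdict

# Assumed mapping from XSD datatypes to Semantic MediaWiki datatypes.
XSD_TO_WIKI = {
    "xsd:string": "Text",
    "xsd:integer": "Number",
    "xsd:decimal": "Number",
    "xsd:date": "Date",
    "xsd:boolean": "Boolean",
    "xsd:anyURI": "URL",
}


def property_types(triples):
    """Map each property to a wiki datatype, or None for an "empty" property.

    Each item is (subject, property, object, kind), where kind is "uri" for
    resource objects or the XSD datatype name for literal objects.
    """
    kinds = defaultdict(set)
    for _subject, prop, _obj, kind in triples:
        kinds[prop].add(kind)  # same information a GROUP BY query would return

    result = {}
    for prop, seen in kinds.items():
        if len(seen) != 1:
            result[prop] = None  # mixed datatypes: create the property untyped
            continue
        (kind,) = seen
        if kind == "uri":
            result[prop] = "Page"
        else:
            result[prop] = XSD_TO_WIKI.get(kind)  # None if the type is unmapped
    return result
```

A property that maps to None here corresponds to the "empty property" case in the list, where no type annotation is generated.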

Important Queries

  • Get all the regular properties (triples with regular properties):
prefix rdf: <>
SELECT distinct ?p
WHERE {
  ?s ?p ?o .
  FILTER (NOT EXISTS {?s rdf:singletonPropertyOf ?x . })
  FILTER (NOT EXISTS {?p rdf:singletonPropertyOf ?y . })
  FILTER (NOT EXISTS {?o rdf:singletonPropertyOf ?z . })
}
  • What are the singleton properties?
prefix rdf: <>
SELECT distinct ?x ?s
WHERE {
  ?s rdf:singletonPropertyOf ?x .
}
  • What are the generic properties?
prefix rdf: <>
SELECT distinct ?x
WHERE {
  ?s ?p ?o .
  ?p rdf:singletonPropertyOf ?x .
}
  • What are all the meta-properties of the singleton properties?
prefix rdf: <>
prefix rdfs: <>
prefix exp: <>
SELECT distinct ?x ?p
WHERE {
  ?s ?p ?o .
  ?s rdf:singletonPropertyOf ?x .
  FILTER (?p != rdf:singletonPropertyOf )
}
  • Get all triples in the graph whose object is a literal
prefix rdf: <>
prefix rdfs: <>
select ?s ?p ?o
where {
  ?s ?p ?o .
  FILTER isLiteral(?o)
}
  • Get all properties for each entity
select ?s ?p ?o
where {
  ?s ?p ?o .
}
  • Get all concepts in the graph
prefix owl: <>
select ?a
where {
  ?a a owl:Thing .
}
group by ?a
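
As a cross-check on what the first four queries above select, here is a small Python sketch that applies the same logic to an in-memory list of triples. It is not how the queries were run (the steps above mention Virtuoso); the function name and the sample data are hypothetical, and the sketch assumes each singleton property is linked to exactly one generic property.

```python
SPO = "rdf:singletonPropertyOf"


def classify(triples):
    """Classify properties in a list of (subject, predicate, object) tuples."""
    # Singleton properties: subjects of rdf:singletonPropertyOf statements.
    generic_of = {s: o for s, p, o in triples if p == SPO}
    singletons = set(generic_of)

    # Generic properties: targets of singletons that are used as predicates.
    generics = {generic_of[p] for _s, p, _o in triples if p in singletons}

    # Regular properties: predicates of triples in which neither the subject,
    # the predicate, nor the object is a singleton property.
    regular = {p for s, p, o in triples if not ({s, p, o} & singletons)}

    # Meta-properties: (generic property, predicate) pairs for statements made
    # about a singleton property, excluding rdf:singletonPropertyOf itself.
    meta = {
        (generic_of[s], p)
        for s, p, _o in triples
        if s in singletons and p != SPO
    }
    return {
        "singleton": singletons,
        "generic": generics,
        "regular": regular,
        "meta": meta,
    }


# Made-up sample data, for illustration only.
sample = [
    ("ex:A", "ex:p#1", "ex:B"),
    ("ex:p#1", SPO, "ex:p"),
    ("ex:p#1", "dcterms:source", "ex:S"),
    ("ex:A", "rdfs:label", "A"),
]
print(classify(sample))
```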

For more information please visit YagoDataset