MaterialWays

From Knoesis wiki
Revision as of 15:54, 23 November 2015 by Sarasi (Talk | contribs) (Kno.e.sis Semantic Tools)


Introduction

Several foundational elements required to achieve Sir Tim Berners-Lee’s vision for a semantic web are in place and available to the materials community. The semantic web, sometimes referred to as the web of data, focuses on ontologies and on linking data for machine-to-machine interchange (implemented via RDF and OWL). Linkage between multiple datasets, files and their respective metadata can be established in an ad hoc fashion without having to adhere to specific database table structures. Linked data without context, however, is of limited value, so a semantic web for materials requires common vocabularies. An example of a common vocabulary is the Dublin Core (DC) ontology, a set of widely accepted metadata terms used to describe a resource (e.g. a document).

The development and publishing of vocabularies using RDFS/OWL is one of the initial steps required to link relevant materials information across disparate (federated) sources. The development of common vocabularies could be jump-started via crowdsourcing and curated by materials subject matter experts (SMEs). Additionally, collaborative efforts with professional societies and other organizations (e.g. ASTM terminology standards, CEN, ASM, TMS, etc.) could be used to accelerate vocabulary/ontology development. Over time, multiple vocabularies would likely winnow down to key sets of generally accepted terms and mappings between terms having the same meaning. Taxonomies, a form of ontology, can express simple relationships in the materials domain.

More sophisticated relationships between materials processing, structure and properties can be expressed using complex ontologies. These ontologies need to be developed and implemented using World Wide Web Consortium (W3C) recommendations like RDF/OWL or widely accepted semantic technology standards such as time.owl and DC. As the above elements are established on a larger scale, various forms of materials informatics could be developed to greatly expand the materials data and design space for materials scientists and engineers.

Success requires innovative approaches during the development of agents to query linked materials data, applications to mash-up and integrate data, and reasoning/inferencing engines specifically tailored to the materials domain. Machine learning and other innovative “data hungry” approaches to extract knowledge could be developed and applied for materials design.

Project Description

Federated Semantic Services Platform for Open Materials Science and Engineering: This three-year project will undertake three broad classes of tasks. The first relates to creating semantic infrastructure, including the ability to create semantic metadata for a variety of data types utilizing domain models and knowledge bases. The second relates to semantic search for all varieties of data, including resources with service-based access. The third relates to the development of a novel semantic data exchange scheme for materials science (termed Linked Open Materials Data) using an open-data-based approach.

KDDM: Materials Database Knowledge Discovery and Data Mining: The Knoesis Center, in collaboration with AFRL/RX, is applying knowledge and technology in informatics to the materials domain, thus introducing the materials and process community to better data management practices. A data exchange system that allows researchers to index, search, and compare data will help shorten the transition cycle in materials science, which usually takes 5 to 10 years. This multi-disciplinary project seeks to span informatics and materials science to fill this gap.

Complementary Activities Undertaken by Others

European Committee for Standardization (CEN)

  • A Guide to the Development and Use of Standards-compliant Data Formats for Engineering Materials Test Data, CEN, 17 March 2010 - The engineering community invests significantly in generating materials test data of high inherent value. Very often the data sets are richly structured and amenable to reuse. The materials community has, however, largely failed to address the issue of data capture and preservation. Although technologies for the automated capture and preservation of test data exist, on the rare occasions that data are conserved, they are invariably inaccessible to the wider materials community. This inevitably acts as an obstacle to the research process, and hinders business activities in the engineering sector. In recognition of these issues, CEN (the European Committee for Standardization) sponsored the 12-month ELSSI-EMD Workshop to develop schemas and ontologies derived from procedural materials testing standards.
  • Workshop Agreement Sep 2010
  • Business Plan for the CEN Workshop on Standards for Electronic Reporting in the Engineering Sector (WS SERES), 22 March 2012 - The proposed new CEN Workshop SERES builds on the success of CEN Workshop ELSSI-EMD (Economics and Logistics of Standards-compliant Schemas and ontologies for Engineering Materials Data), acting on the key recommendations reported in CWA 16200:2010, and leveraging procedural standards in the engineering materials sector to encourage stakeholder engagement and buy-in. The period of performance for the workshop is from March 2012 through November 2013 (21 months).
  • The SERES draft CEN Workshop Agreement (CWA) 'ICT Standards in support of an eReporting framework for the Engineering Materials Sector' and its Annexes are under public review from 23 December 2014 until 23 February 2014.
    • Draft Report - CEN Workshop Agreement — CEN/WS SERES — ICT Standards in Support of an eReporting Framework for the Engineering Materials Sector
    • Annexes - CEN Workshop Agreement — CEN/WS SERES — ICT Standards in Support of an eReporting Framework for the Engineering Materials Sector
    • Draft Ontologies - CEN Workshop Agreement — CEN/WS SERES — ICT Standards in Support of an eReporting Framework for the Engineering Materials Sector

NIST

ISO

  • ISO/Technical Committee (TC) 184 – Automation Systems and Integration
  • TC 184 - Business Plan 20 Sep 2004.
    • Subcommittee 4 (SC4) - Industrial Data. While standards curated by SC4 focus on “products,” many of the standards are applicable to the materials domain as well.
      • ISO 10303 (1999/2010) - Industrial automation systems and integration - Product data representation and exchange. Informally known as STEP, which stands for “Standard for the Exchange of Product model data”. Consists of several hundred parts. STEP is the largest standard within ISO. ISO 10303 application areas include CAD, CAM, Product Data Management, Process Planning... NIST involvement appears to have begun around 1999, and NIST was active in “Future STEP” in 2010.
    • SC5 - Interoperability, integration, and architectures for enterprise systems and automation
      • ISO 10303-45:2008 - Integrated generic resource: Material and other engineering properties
      • ISO 10303-45:2008 2nd Edition - Integrated generic resource: Material and other engineering properties. Specifies the integrated generic resources to describe the values of the material and other engineering properties of products, as well as the conditions in which these property values are valid. ISO 10303-45:2008 also includes the resource constructs for describing the composition of products. Property and composition values can be identified either as a numerical value or as a mathematical function. Numeric values and values as mathematical functions can be further characterized as to their type, precision and uncertainty. Many of the following elements are needed for ICMSE:
        • specification of material properties
        • association of material properties to a product
        • qualitative and quantitative conditions for the validity of material properties, and material properties for the faces of products
      • ISO 10303-235:2009 - Application protocol: Engineering properties for product design and verification. ISO 10303-235:2009 specifies the use of the integrated resources necessary for the scope and information requirements for the representation of engineering property data that are used for product design and product validation.
      • The following are within the scope of ISO 10303-235:2009 and many of these characteristics are needed for ICMSE data:
        • descriptions and definitions of the manufactured product, the sample of the product and the testable version of the sample;
        • description of the composition and substance of the product;
        • description of the processes used in the measurement;
        • descriptions of the data values produced by the measurement, with the specification of the conditions in which the data is valid;
        • references to standards and other documents wherein sampling, measurement and other details of testing and measurement processes can be specified or described;
        • descriptions and qualifications of the personnel and/or organizations responsible for the measurement;
        • specification of the requirements, conditions and tolerances to be satisfied in the measurement and a description of the outcome;
        • descriptions of the locations of the measurement process and the effectivity of the results;
        • descriptions of the approval that establishes the validity of the measurements and the use of the properties for product design and design validation.
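
The ideas listed above (a property value that is either a number or a function, qualified by uncertainty and by the conditions under which it is valid) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name, field names and condition keys are hypothetical and are not taken from ISO 10303-45 or ISO 10303-235, and any numbers used with it are placeholders rather than measured data.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple, Union

# A property value is either a plain number or a function of the conditions.
PropertyValue = Union[float, Callable[[Dict[str, float]], float]]


@dataclass
class MaterialProperty:
    """Hypothetical record; names are illustrative, not taken from ISO 10303."""
    name: str
    value: PropertyValue
    unit: str
    uncertainty: Optional[float] = None  # expressed in the same unit as the value
    # condition name -> (low, high) range in which the value is considered valid
    valid_ranges: Dict[str, Tuple[float, float]] = field(default_factory=dict)

    def is_valid_for(self, conditions: Dict[str, float]) -> bool:
        """True when every constrained condition is supplied and inside its range."""
        for key, (low, high) in self.valid_ranges.items():
            if key not in conditions or not (low <= conditions[key] <= high):
                return False
        return True

    def evaluate(self, conditions: Dict[str, float]) -> float:
        """Return the value under the given conditions, or raise if out of range."""
        if not self.is_valid_for(conditions):
            raise ValueError(f"{self.name} is not valid for {conditions}")
        return self.value(conditions) if callable(self.value) else self.value
```

A numeric property would be created as `MaterialProperty("strength", 500.0, "MPa", 10.0, {"temperature_C": (20.0, 25.0)})`, and a functional one by passing a callable as `value`. Keeping the validity ranges next to the value means a consumer cannot read the number without also seeing where it applies.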

Use Cases

Discovery

While the primary objective of linked data is to enable machine-to-machine interoperability, it’s generally considered a best practice to expose the same data for human consumption. It’s easy to envision tools that allow a seasoned materials expert, as well as a student, to “follow their nose” through rich sets of metadata and data describing products and processes within the materials domain. As they fluidly traverse the network, combing through a combination of structured and unstructured data and information, they are likely to gain awareness and understanding more readily than through the retrieval and perusal of documents returned via keyword searches. An additional advantage is the ability to capture structured data along the way and save it for deeper analysis at a later date. Google’s Knowledge Graph, Bing’s Snapshot and Facebook’s Graph Search all exemplify steps in this direction.

Mining, Mash-ups, Analysis and Analytics

(This section needs a description of how the semantic web can be used to discover data relevant to a specific query.) The ability to gather and mash up structured data from queries eases entry into many types of analysis and analytics. For example, a structural integrity engineer who wants to determine the effect of humidity on the rate of fatigue crack growth in various aluminum alloys could first use a SPARQL query to retrieve the relevant data and metadata: the aluminum designation (e.g. 7075), heat treatment (e.g. T6), product form (e.g. plate), crack growth rate, stress intensity, and test humidity. The search could even be expanded to include the test date and location in order to establish gross categories for humidity levels where humidity wasn’t overtly controlled, monitored or reported. For example, crack growth tests performed in Dayton during winter months have a higher probability of low humidity levels than those performed during the summer months. Without linked data and a common vocabulary, the time needed to discover and transform data into a useful form could prove to be too much for many scientists and engineers. Easing the aggregation of data for further analysis or analytics will likely have a large positive impact on the materials community.
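
As a rough illustration of the query described above, the following Python sketch builds a SPARQL query string for a given alloy and heat treatment, and assigns a coarse humidity label from the test month when humidity was not reported. Everything specific in it is an assumption: the `mv:` namespace and property names are hypothetical stand-ins for whatever the shared vocabulary eventually defines, and the seasonal rule simply encodes the Dayton example for a Northern Hemisphere site.

```python
from string import Template

# Hypothetical namespace and property names; a real vocabulary may differ.
QUERY_TEMPLATE = Template("""\
PREFIX mv: <http://example.org/matvocab/>
SELECT ?alloy ?temper ?form ?dadn ?deltaK ?humidity ?testDate ?location
WHERE {
  ?test mv:alloyDesignation ?alloy ;
        mv:heatTreatment    ?temper ;
        mv:productForm      ?form ;
        mv:crackGrowthRate  ?dadn ;
        mv:stressIntensity  ?deltaK .
  OPTIONAL { ?test mv:testHumidity ?humidity }
  OPTIONAL { ?test mv:testDate ?testDate }
  OPTIONAL { ?test mv:testLocation ?location }
  FILTER (?alloy = "$alloy" && ?temper = "$temper")
}
""")


def build_query(alloy: str, temper: str) -> str:
    """Fill in the alloy designation and heat treatment, e.g. "7075" and "T6"."""
    # Restrict inputs to alphanumerics so they cannot break out of the literal.
    if not (alloy.isalnum() and temper.isalnum()):
        raise ValueError("alloy and temper must be alphanumeric")
    return QUERY_TEMPLATE.substitute(alloy=alloy, temper=temper)


def gross_humidity_category(humidity, test_month):
    """Coarse label for a test; test_month is 1-12 (assumes a Dayton-like climate)."""
    if humidity is not None:
        return "reported"
    if test_month in (12, 1, 2):
        return "likely low"
    if test_month in (6, 7, 8):
        return "likely high"
    return "unknown"
```

Humidity, date and location are wrapped in `OPTIONAL` so that tests lacking those fields are still returned; those are exactly the rows for which `gross_humidity_category` would be applied afterwards.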

Design and Optimization of Known Materials

At the core of materials design is the desire to enhance a material’s property (e.g. strength) to meet a performance criterion (e.g. minimum required strength) through a causal understanding of how the properties are influenced by the composition (e.g. carbon fiber, polyimide matrix), processing (e.g. autoclave) and processing’s influence on structural morphology (e.g. porosity).

Generally, materials design relies heavily on a sequence of computation and experiment. Models are developed, typically applied via software, and used to predict the response of a material (e.g. strength). Experiments are undertaken to validate models or explore the response of materials when accurate models aren't available. Sometimes, results from experiments are used to create empirical models of the material's response.

In any case, model input parameters are related to model output parameters. Can the relationships be captured using binary triplets and assertions associated with each triplet element? Many of the relationships can indeed be captured. However, can they then be expressed using OWL in such a way as to be amenable to inference and reasoning? If the answer is yes, then it may be possible for scientists and engineers (S&E) to find new relationships between parameters that will ultimately yield higher materials performance or improve some other parameter assessed during design.
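
To make "amenable to inference" concrete, here is a minimal Python sketch of one very simple kind of reasoning: treating a relationship as transitive and deriving the implied links. It assumes a hypothetical `influences` predicate and toy facts based on the examples above. In OWL the same effect would be obtained by declaring the property an `owl:TransitiveProperty` and letting a reasoner do the work; this hand-written loop is only a stand-in for that.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def infer_transitive(triples: Set[Triple], predicate: str) -> Set[Triple]:
    """Return the triples plus everything implied by treating `predicate` as transitive."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        # Snapshot the current links so the set can be extended while looping.
        links = [(s, o) for s, p, o in inferred if p == predicate]
        for s1, o1 in links:
            for s2, o2 in links:
                if o1 == s2 and (s1, predicate, o2) not in inferred:
                    inferred.add((s1, predicate, o2))
                    changed = True
    return inferred


# Toy facts; the predicate and entity names are illustrative only.
facts = {
    ("autoclave cure", "influences", "porosity"),
    ("porosity", "influences", "strength"),
}
derived = infer_transitive(facts, "influences") - facts
```

Here `derived` contains the single new statement that the autoclave cure influences strength, a relationship that was never asserted directly.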

Development of a Shared Materials Vocabulary

Guidance for Creating Vocabularies

Units of Measure

At some point we'll want to include "units" for the terms in the vocabulary. A couple of sources of information:

As OntoML does not cover units and quantities to the extent that is required within eCl@ss XML 1.0, an additional format named unitsML is included.

Vocabulary Sources

Current Draft Vocabulary via SPARQL Endpoint

Namespace: [ http://knoesis.org/matvocab/ ]

Moving to this namespace: [ http://matvocab.org/vocabulary/ ]
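
While the namespace move is in progress, data that still references the old namespace can be rewritten mechanically. The Python sketch below shows one way to do that. It assumes term local names stay the same across the move, which this page does not confirm, and the term `porosity` used in the usage note is a hypothetical example.

```python
OLD_NS = "http://knoesis.org/matvocab/"
NEW_NS = "http://matvocab.org/vocabulary/"


def migrate_term(uri: str) -> str:
    """Rewrite a term URI from the old namespace to the new one; leave others untouched."""
    if uri.startswith(OLD_NS):
        return NEW_NS + uri[len(OLD_NS):]
    return uri
```

For example, `migrate_term("http://knoesis.org/matvocab/porosity")` returns `"http://matvocab.org/vocabulary/porosity"`, while URIs from other vocabularies pass through unchanged.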

MatVocab Wiki (not currently resolving URL due to URL change)

Examples:

Milestones

Date      Milestone
Sep 2013  Glossary received from ASM
Oct 2013  ASM Handbook 21 Glossary accessible via URL
Oct 2013  MIL-HDBK-17 terms accessible via URL
Apr 2013  MIL-HDBK-5 terms accessible via URL
Apr 2013  Add MatVocab wiki to enable user to search for terms

Development of a Materials Ontology

Existing Materials Ontologies or Schemas

Existing Ontologies

  • MatOnto 'How can we get a copy of this ontology?'
    • Towards an Ontology for Data-driven Discovery of New Materials (.pdf): Materials scientists and nano-technologists are struggling with the challenge of managing the large volumes of multivariate, multidimensional and mixed-media data sets being generated from the experimental, characterisation, testing and post-processing steps associated with their search for new materials. In addition, they need to access large publicly available databases containing: crystallographic structure data; thermodynamic data; phase stability data and ionic conduction data. Materials scientists are demanding data integration tools to enable them to search across these disparate databases and to correlate their experimental data with the public databases, in order to identify new fertile areas for searching. Systematic data integration and analysis tools are required to generate targeted experimental programs that reduce duplication of costly compound preparation, testing and characterisation. This paper presents MatOnto, an extensible ontology, based on the DOLCE upper ontology, that aims to represent structured knowledge about materials, their structure and properties and the processing steps involved in their composition and engineering. The primary aim of MatOnto is to provide a common, extensible model for the exchange, re-use and integration of materials science data and experimentation. (circa 2008 or 2009)
  • MatSEEK
    • MatSEEK: An Ontology-Based Federated Search Interface for Materials Scientists (Feb 2009)
    • The MatSeek system is an ontology-based federated search interface to key materials science databases and analytical tools. By combining Semantic Web and Web 2.0 technologies, MatSeek provides materials scientists with a single Web interface that enables them to search across disparate databases containing crystal-structure data, ionic-conductivity data, and phase stability data; render 3D crystal-structure images; calculate bond lengths and angles; retrieve relevant scholarly references; and identify potential new materials with the structure and properties required to satisfy specific applications. The MatOnto ontology underlying MatSeek enables integration of data across disparate databases, and Web 2.0 technologies enable iterative searching across the databases. The results retrieved from searching the previous database are used as input to the query on the next database. By providing materials scientists with a single, integrated Web interface to the critical materials science databases and analytical tools, MatSeek represents a significant advance toward a full-fledged materials-informatics workbench.
  • MASON (could not be opened with Protégé)
  • Plinius
    • The Plinius ontology of ceramic materials covers the conceptualisation of the chemical composition of materials. The design decisions underlying ontology development at our group are discussed. The ontology of ceramic materials is given as a conceptual construction kit, involving several sets of atomic concepts and construction rules for making complex concepts. One of its implementations, that in Ontolingua, is presented. (source)
  • CEN SERES Workshop
    • Draft Ontologies - CEN Workshop Agreement — CEN/WS SERES — ICT Standards in Support of an eReporting Framework for the Engineering Materials Sector

Related Ontologies

  • Chemical Methods Ontology (CMO) The chemical methods ontology describes methods used to
    • collect data in chemical experiments, such as mass spectrometry and electron microscopy
    • prepare and separate material for further analysis, such as sample ionisation, chromatography, and electrophoresis
    • synthesise materials, such as epitaxy and continuous vapour deposition

It also describes the instruments used in these experiments, such as mass spectrometers and chromatography columns.

Existing Schemas

  • EC MatDB
    • XML Related MatDB Tools for Data Exchange and Interoperability (proceedings): The web-enabled materials properties database MatDB of the European Commission Joint Research Centre (EC-JRC) is a database application for the storage, retrieval and evaluation of experimentally measured materials data coming from European R&D projects. Data exchange and interoperability are important database issues to reduce costs of expensive material tests. Many organizations world-wide are participating in the development of GEN IV reactors. To reduce costs the GEN IV International Forum has agreed to interoperate and exchange data for the screening and qualification of candidate materials. To simplify the complexity of data mapping between differently structured databases, adoption of a standardized XML schema is the favored option. The paper focuses on MatDB XML related tools and items:
      • Upgrade, extension and implementation of the MatDB XML schema within a planned US/EC cooperation;
      • European standardization activities for data exchange, interoperability and the development of standard formats for engineering materials data;
      • MatDB data cite participation.
    • EC MatDB Schema

Reviews and Synopsis

These schemas are at various stages of development, each with its own benefits and limitations. The EC MatDB schema appears to be the most promising.

Approaches for Ontology Development

The P^3-Triplet Approach

The materials domain consists of millions of concepts. Many are fairly static and well understood, while others are evolving at a rapid pace. Intuition suggests that the materials design and development domain’s current state of chaos and intrinsic heterogeneity would benefit from a bit of structure. One conceptual construct that moves in that direction is the Product-Process-Product (P^3) Triplet. Triplets focus a small number of subject matter experts (SMEs) on three naturally related materials domain entities: one or more materials or products that are subjected to a materials or fabrication process to yield a higher-value material or product. For example, epoxy resin and carbon fiber are subjected to a prepreg manufacturing process to yield a roll of prepreg material. Do we know if this construct will yield what the materials domain experts want? Not at this time, and there are other approaches for engineering a domain vocabulary or ontology. That said, P^3-Triplets do have some inherent qualities that are compelling:

  • Each triplet is generally aligned with a subdomain of subject matter expertise. For example, a materials R&D organization of 500 scientists and engineers may only have a handful of processing experts for polymeric matrix composites.
  • Elements of each triplet seem to be generally aligned with protected information (e.g. intellectual property rights) which may ease the implementation of access control. For example, the properties associated with a composite material are generally made available to the commercial community; however, the processing steps used to create the composite may be closely held.
  • They enable a means to link the entire breadth of processes beginning with the extraction of raw material to the final process for a finished product.
  • Specific “important” product or processing activities can be expressed preferentially and made available for use. That is, you don’t have to build the entire skyscraper before someone can work in it.

Triplet Anatomy

Elements of a P^3-Triplet can themselves be considered to be elements of an RDF triple:

  • Subject - the “input” material or product
  • Predicate - the materials or manufacturing process
  • Object - the “output” material or product

Additionally, each triplet element carries any number of relationships or assertions, which generally take the form of an RDF triple. The assertions strive to express the important relationships between various parameters at the schema and instance levels. However, like Star Trek’s tribbles, triplets and their assertions can multiply rapidly. Whether they become a troublesome mess or something that helps reduce chaos in the materials community remains to be seen.
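
The anatomy described above can be sketched as a small Python data structure: each element (input, process, output) carries its own assertions, and the triplet flattens into plain subject-predicate-object triples. This is a hypothetical illustration, not the representation used by OnCET; the class names and the `hasParameter` predicate are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Assertion = Tuple[str, str, str]  # (subject, predicate, object)


@dataclass
class Element:
    """One element of a triplet (a product or a process) plus its assertions."""
    name: str
    assertions: List[Assertion] = field(default_factory=list)

    def assert_fact(self, predicate: str, value: str) -> None:
        """Attach an assertion whose subject is this element."""
        self.assertions.append((self.name, predicate, value))


@dataclass
class P3Triplet:
    inputs: List[Element]   # one or more input materials or products
    process: Element        # the materials or manufacturing process
    output: Element         # the resulting material or product

    def as_triples(self) -> List[Assertion]:
        """Flatten into (input, process, output) triples followed by all assertions."""
        triples = [(i.name, self.process.name, self.output.name) for i in self.inputs]
        for element in [*self.inputs, self.process, self.output]:
            triples.extend(element.assertions)
        return triples


# The prepreg example from the text; the assertion is an illustrative placeholder.
process = Element("prepreg manufacturing")
process.assert_fact("hasParameter", "resin content")
prepreg = P3Triplet(
    inputs=[Element("epoxy resin"), Element("carbon fiber")],
    process=process,
    output=Element("prepreg roll"),
)
```

Calling `prepreg.as_triples()` yields one triple per input material plus the process assertion, which shows how quickly the number of statements grows as elements accumulate assertions.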


Triplet.png

A series of P^3-Triplets for a Polymeric Matrix Composite (PMC). Note the product of a process results in a product that may be used as input for another process; therefore, the triplets overlap.

CompositeTripletSeries.PNG
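
Because the output of one triplet is the input of the next, a series of triplets can be walked from a starting material to the finished product. The Python sketch below shows that traversal under simplifying assumptions: each product feeds at most one downstream process, and the step names after prepreg manufacturing are illustrative and are not read from the figure.

```python
from typing import Dict, List, Tuple

Triplet = Tuple[str, str, str]  # (input product, process, output product)


def trace_chain(triplets: List[Triplet], start: str) -> List[str]:
    """Follow outputs-as-inputs from `start`, returning products and processes in order."""
    # Simplification: one downstream process per product.
    next_step: Dict[str, Tuple[str, str]] = {i: (p, o) for i, p, o in triplets}
    path, seen = [start], {start}
    while path[-1] in next_step:
        process, output = next_step[path[-1]]
        if output in seen:  # guard against cycles in the data
            break
        path.extend([process, output])
        seen.add(output)
    return path


# Illustrative series for a polymeric matrix composite.
pmc_series = [
    ("epoxy resin", "prepreg manufacturing", "prepreg roll"),
    ("prepreg roll", "layup", "laminate preform"),
    ("laminate preform", "autoclave cure", "cured laminate"),
]
chain = trace_chain(pmc_series, "epoxy resin")
```

The resulting `chain` alternates products and processes, starting at the epoxy resin and ending at the cured laminate.
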
  • Ontology Concept Elicitation Tool (OnCET) Development: This tool is being designed and developed to directly elicit ontological concepts and their relationships from the user. The user provides the relationship (predicate) between two parameters or entities (concepts). The source for these relationships can be the user's subject matter expertise, or they can be captured while the user reads a document from the materials and manufacturing domain.
  • Visualize using iExplore: iExplore can be used to visualize the statements that were created using OnCET.

Milestones

Date         Milestone
dd mmm yyyy  OnCET converted to web-based application
6 Jan 2014   OnCET deployed to Knoesis server
17 Feb 2014  OnCET able to export Triplets and assertions (triples) to .csv
27 Feb 2014  iExplore able to use OnCET RDF
3 Apr 2014   Added capability for multiple process inputs


Materials Data

Data Models

  • A significant amount of the data used for materials development and usage is tabular in nature. One approach being explored is the use of W3C's Data Cube vocabulary (June 2013) coupled with NASA's QUDT ontologies. QUDT is being co-developed by NASA Ames and TopQuadrant.
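
As a rough sketch of how one row of a materials table might map onto a Data Cube observation with a QUDT unit, the Python below emits plain triples as tuples. The `qb:` and SDMX attribute namespaces are the ones published with the Data Cube vocabulary. The `EX` namespace, the column names and the row value are hypothetical, and the QUDT unit IRI is an assumption that should be checked against the QUDT release in use.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
QB = "http://purl.org/linked-data/cube#"
SDMX_ATTR = "http://purl.org/linked-data/sdmx/2009/attribute#"
EX = "http://example.org/matdata/"               # hypothetical namespace
UNIT_MPA = "http://qudt.org/vocab/unit/MegaPA"   # assumed unit IRI; verify before use


def row_to_observation(dataset_id: str, row_id: str, row: dict, unit_iri: str):
    """Turn one table row into (subject, predicate, object) triples for a qb:Observation.

    Each column becomes a property in the hypothetical EX namespace; objects are
    IRIs for the structural triples and raw Python values for the column data.
    """
    obs = f"{EX}{dataset_id}/obs/{row_id}"
    triples = [
        (obs, RDF_TYPE, f"{QB}Observation"),
        (obs, f"{QB}dataSet", f"{EX}{dataset_id}"),
        (obs, f"{SDMX_ATTR}unitMeasure", unit_iri),
    ]
    for column, value in row.items():
        triples.append((obs, f"{EX}{column}", value))
    return triples
```

A full Data Cube would also declare a data structure definition that says which columns are dimensions and which are measures; that part is left out here to keep the sketch short.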

Materials Databases

  • European Commission Joint Research Centre
    • EC MatDB
  • National Institute for Aviation Research (NIAR)
  • MatWeb, Your Source for Materials Information: The heart of MatWeb is a searchable database of material data sheets, including property information on thermoplastic and thermoset polymers such as ABS, nylon, polycarbonate, polyester, polyethylene and polypropylene; metals such as aluminum, cobalt, copper, lead, magnesium, nickel, steel, superalloys, titanium and zinc alloys; ceramics; plus semiconductors, fibers, and other engineering materials. There are over 59,000 data sheets in the collection.
    • Subjects Include: Aluminum, Ceramics, Materials Science, Metals, Nylons, Polycarbonate, Polyester, Polymers, Steels, and Titanium
    • Publisher: Automation Creations, Inc.

Useful References

Process Modeling

Data Modeling, Feature Identities, Descriptors and Handles with Philip Sargent Cambridge, UK

Modelling Materials Processing: An overview by Philip Sargent, Cambridge, UK

Materials Information and Conceptual Data Modelling

Misc

Non-structured Materials Science Data Sharing Based on Semantic Annotation

Towards an Ontology for Data-driven Discovery of New Materials

Integrated Computational Materials Engineering (ICME)

Ontology

Related Links

Read the latest news in materials science here: Materials Today

Kno.e.sis Semantic Tools

Contact us

knoesismat@gmail.com

Generally Useful Information

Keyboard Shortcuts

  • <shift><click> on a link will open the linked page in a new window

Software Tools

  • Graphical representations (saved as pdf) of the schemas were created using QXmlEdit