The Second International Workshop on the role of Semantic Web in Provenance Management

(Proposed to be co-located with the 9th International Semantic Web Conference, ISWC-2010)


The growing eScience infrastructure is enabling scientists to generate scientific data on an industrial scale. Similarly, the Web 2.0 paradigm is enabling Web users to create focused applications that combine data from multiple sources, popularly referred to as “mashups”, on an extremely large scale. Managing the various forms of apparently ancillary metadata, in addition to the primary data products of eScience, Web, and business applications, is increasingly being recognized as critical for the correct interpretation of the data. In this proposal we focus specifically on metadata that describes the origins of the data. The term provenance, from the French word “provenir” meaning “to come from”, describes the lineage, or origins, of a data entity. Provenance metadata is required to correctly interpret the results of a process execution, to validate data processing tools, to verify the quality of data, and to associate measures of trust with the data.

The primary objective of this workshop is to explore the role of Semantic Web and its standards in addressing some of the critical challenges facing provenance management, namely:

  • Efficiently capturing and propagating provenance information as data is processed, fragmented and recombined across multiple applications and domains on a Web scale.
  • A common representation model for provenance, underpinned by a formal theory for use by both agents and humans.
  • Interoperability of provenance information generated in distributed environments such as the Web and myGrid.
  • Tools leveraging the Semantic Web for visualization of provenance information.

Relevance and Timeliness

The scale at which data across different domains (biomedical informatics, astronomy, oceanography, and Web mashups) is created and processed mandates the use of automated software tools for both the processing and analysis of provenance metadata in a scalable way. The proof layer in the Semantic Web layer cake, corresponding to provenance information, has been identified as an important component for the implementation of “trust mechanisms” and effective information extraction from the Web.

Several workshops, each addressing different aspects of provenance, have been held, such as Provenance in Databases, Provenance in Scientific Workflows, and IPAW (2010), but none of these workshops has specifically addressed the role of Semantic Web in provenance management. Further, the recent funding (by NSF and NIH, respectively) of large eScience projects such as the Semantic Provenance Capture in Data Ingest Systems (SPCDIS) and the Semantic Problem Solving Environment for T.cruzi makes this workshop timely and relevant. The recently approved IEEE Internet Computing special issue on “Provenance in Web applications in Business, eScience and Social Networking” and the Journal of Web Semantics special issue on provenance and Semantic Web establish the increasing importance of provenance management for computer science researchers.


The workshop anticipates the participation of researchers in academia, industry, and government involved in both provenance management and Semantic Web. Given the focus of this workshop on real-world eScience, Web, and business projects, we expect domain scientists, Web technologists, and researchers in industry who are interested in provenance management to actively participate in this workshop. The workshop also aims to raise awareness among provenance researchers about Semantic Web and, correspondingly, to highlight provenance management as a rich problem domain for Semantic Web researchers.

Workshop Format

Invited Talk

Paper Presentations

The workshop solicits the submission of original research papers dealing with analytical, theoretical, and practical aspects of provenance management using Semantic Web. Topics of interest include, but are not restricted to:

  • Representation models for provenance, provenance ontologies
  • Provenance analysis (reasoning, knowledge discovery, user-defined rules)
  • Annotation of scientific data using provenance ontologies
  • Role of provenance in social networks, social media and Web 2.0 (mashups)
  • Interoperability and propagation of provenance across applications
  • Large scale storage and efficient querying of provenance
  • Provenance infrastructure for eScience, business, and Web applications
  • Role of provenance in scientific data management

Duration of the Workshop

The workshop is scheduled to be a full-day meeting.



  • Amit Sheth
Amit Sheth is an educator, researcher, and entrepreneur. He is the LexisNexis Ohio Eminent Scholar for Advanced Data Management and Analysis and the director of the Kno.e.sis Center at Wright State University. He has some of the best-cited papers (h-index 58) in information integration, workflow management, Semantic Web and semantic web services, and his research interests include semantics-empowered sensor and social computing on the Web. His research has led to two companies and many deployed systems and applications. http://knoesis.org/amit
  • Carole Goble (proposed)
Carole Goble is a full professor in the School of Computer Science at the University of Manchester, UK, where she has co-led the Information Management Group since 1997. She has worked closely with life scientists for many years and is the Director of the myGrid project, the largest UK e-Science pilot project, which has produced the widely used Taverna open source software and is now part of the Open Middleware Infrastructure Institute UK. She is also the co-director of the e-Science North West regional centre. Carole has an international reputation in the Semantic Web, e-Science and Grid communities and has led the application of Semantic Web technologies to both the Grid and e-Science, a fusion dubbed the Semantic Grid. She has produced the first reference architecture for the Semantic Grid (S-OGSA) through the Ontogrid project and chairs the Open Grid Forum Semantic Grid Group, along with David De Roure. http://www.cs.man.ac.uk/~carole/

Organizing Committee/PC Co-Chairs

  • Satya S. Sahoo, Kno.e.sis Center, Wright State University
Satya Sahoo is a researcher and doctoral student at the Kno.e.sis Center, Wright State University. His research interests include semantic provenance, knowledge representation, and information integration in biomedical and sensor domains. He has defined a formal logic-based provenance management framework for scientific data (part of the NIH-funded project Semantic PSE for T.cruzi).
Further details are at: http://knoesis.wright.edu/researchers/satya, Email: satyasahoo@ieee.org
  • Jun Zhao, University of Oxford
Jun Zhao is an EPSRC Postdoctoral Fellow at the Department of Zoology at the University of Oxford.
Further details are at: http://users.ox.ac.uk/~zool0770/

Steering Committee

  • Paolo Missier, University of Manchester, UK
  • Juliana Freire, University of Utah (proposed)

Program Committee (proposed)

  • Aleksander Slominski, IBM Research
  • Bertram Ludäscher, University of California Davis
  • Beth Plale, Indiana University
  • Claudio Silva, University of Utah
  • Francisco Curbera, IBM Research
  • Giorgos Flouris, FORTH-ICS, Greece
  • Ilkay Altintas, San Diego Supercomputer Center, UCSD
  • James Cheney, University of Edinburgh
  • Jun Zhao, Oxford University
  • Kei Cheung, Yale University
  • Krishnaprasad Thirunarayan, Wright State University
  • Luc Moreau, University of Southampton
  • Nirmal Mukhi, IBM Research
  • Olivier Bodenreider, National Library of Medicine, NIH
  • Paulo Pinheiro da Silva, University of Texas at El Paso
  • Peter Fox, Tetherless World Research Constellation, RPI
  • Roger Barga, Microsoft Research
  • Sarah Cohen-Boulakia, Université Paris-Sud
  • Sudha Ram, Arizona State University
  • Val Tannen, University of Pennsylvania
  • Yogesh Simmhan, Microsoft Research

Submissions of Papers

Submissions and reviewing will be handled using the EasyChair reviewing system. Submitted papers will be refereed by at least three members of the Program Committee. Accepted papers will be published as CEUR Workshop Proceedings and also made available to attendees on electronic media (either CD or USB stick).

All submissions should be at most 6 pages long and submitted in PDF format, using the IEEE format (http://www.ieee.org/web/publications/pubservices/confpub/AuthorTools/conferenceTemplates.html).

Please submit your paper using the EasyChair site at: http://www.easychair.org/conferences/?conf=swpm2010

Important Dates

  • Submissions due: to be announced
  • Notification: to be announced
  • Camera ready papers due: to be announced
  • Workshop Date: to be announced