Transforming Big Data into Smart Data: Deriving Value via harnessing Volume, Variety and Velocity using semantics and Semantic Web
Big Data has captured much interest in research and industry, with anticipation of better decisions, efficient organizations, and many new jobs. Much of the emphasis is on technology that handles volume, including storage and computational techniques to support analysis (Hadoop, NoSQL, MapReduce, etc.), and the challenges of the four Vs of Big Data: Volume, Variety, Velocity, and Veracity. However, the most important feature of data, the raison d'être, is neither volume, variety, velocity, nor veracity, but value. In this talk, I will emphasize the significance of Smart Data, and discuss how it can be realized by extracting value from Big Data.
Here is how I would define Smart Data:
Smart Data provides value by harnessing the challenges posed by the volume, velocity, variety, and veracity of Big Data, in turn providing actionable information and improving decision making.
Another way to look at Smart Data is:
Smart Data is focused on the actionable value achieved by human involvement in the data creation, processing, and consumption phases for improving the human experience.
Creating Smart Data requires organized ways to harness and overcome the original four V-challenges. While the technologies currently touted may provide some necessary infrastructure, they are far from sufficient. In particular, we will need to utilize metadata, employ semantics and intelligent processing, and leverage some of the extensive work that predates Big Data.
For Volume, I will discuss the concept of Semantic Perception, that is, how to convert massive amounts of data into information, meaning, and insight useful for human decision-making. For dealing with Variety, I will discuss experience in using agreement represented in the form of ontologies, domain models, or vocabularies to support semantic interoperability and integration, and discuss how this cannot simply be wished away using NoSQL. Lastly, for Velocity, I will discuss somewhat more recent work on Continuous Semantics, which dynamically creates models of new objects, concepts, and relationships and uses them to better understand new cues in the data that capture rapidly evolving events and situations.
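To make the Variety point concrete, here is a minimal sketch of how an agreed vocabulary can align records from heterogeneous sources. It is an illustration only, not the approach presented in the talk: the source names (`clinic_a`, `wearable_b`), the field names, and the shared terms (`bodyTemperature`, `heartRate`) are all hypothetical, and a real system would use an ontology language rather than Python dictionaries.

```python
# Hypothetical shared vocabulary: each source-specific field is mapped to one
# agreed-upon term. All names here are invented for illustration.
SHARED_VOCABULARY = {
    "clinic_a": {"temp_f": "bodyTemperature", "hr": "heartRate"},
    "wearable_b": {"temperature_c": "bodyTemperature", "pulse": "heartRate"},
}

# Conversions needed to express a source field in the shared term's unit
# (assumed here to be degrees Celsius for bodyTemperature).
UNIT_CONVERTERS = {
    ("clinic_a", "temp_f"): lambda f: round((f - 32) * 5 / 9, 1),
}


def to_shared_terms(source, record):
    """Rewrite one source record using only terms from the shared vocabulary."""
    mapping = SHARED_VOCABULARY[source]
    result = {}
    for field, value in record.items():
        if field not in mapping:
            continue  # no agreed meaning for this field, so it is not integrated
        convert = UNIT_CONVERTERS.get((source, field), lambda v: v)
        result[mapping[field]] = convert(value)
    return result


records = [
    to_shared_terms("clinic_a", {"temp_f": 98.6, "hr": 72}),
    to_shared_terms("wearable_b", {"temperature_c": 37.2, "pulse": 75, "steps": 4000}),
]
```

After this step both records use the same keys and units, so they can be compared or merged directly. The mapping tables carry the agreement; a schema-free store would not remove the need for them.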
The first use of the term “Smart Data”, and possibly its first recorded use, was in 2004. See: http://www.slideshare.net/apsheth/smart-data-and-realworld-semantic-web-applications-2004.
Related Writings, Talks and Events
- Amit Sheth, "Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities," keynote at IEEE BigData 2014, Oct 29, 2014. [Abstract], [Talk]
- Amit Sheth, "Smart Data for you and me: Personalized and Actionable Physical Cyber Social Big Data," Featured Keynote at WORLDCOMP2014, Las Vegas, NV, July 21, 2014. [Abstract, Slides, Video, Photos]
- Amit Sheth, “Transforming Big Data into Smart Data: Deriving Value via harnessing Volume, Variety, and Velocity using semantic techniques and technologies”, IBM Distinguished Speaker Series talk, IBM Almaden Research Center, San Jose, CA, July 1, 2014. [Announcement, Video]
- Krishnaprasad Thirunarayan and Amit Sheth: "Semantics-empowered Big Data Processing with Applications", In: AI Magazine, 2014, 12 pages.
- Amit Sheth, “Transforming Big Data into Smart Data: Deriving Value via harnessing Volume, Variety, and Velocity using semantic techniques and technologies”, Keynote at the 30th IEEE International Conference on Data Engineering, Chicago, Illinois, Apr 2, 2014. [Announcement, Presentation/Details].
- Amit Sheth, "Smart Data enabling Personalized Digital Health," talk at PARC, October 30, 2013.
- Amit Sheth, "Transforming Big Data into Smart Data: Deriving Value via harnessing Volume, Variety and Velocity using semantics and Semantic Web," Keynote at the 21st Italian Symposium on Advanced Database Systems, June 30 - July 03 2013, Roccella Jonica, Italy. Also invited talks at universities in Spain and Italy in June 2013.
- Amit Sheth, "Transforming Big Data into Smart Data for Smart Energy: Deriving Value via harnessing Volume, Variety and Velocity," Keynote at the Workshop on Building Research Collaboration: Electricity Systems. Purdue University, West Lafayette, IN. Aug 28-29, 2013.
- Krishnaprasad Thirunarayan and Amit Sheth: "Semantics-empowered Approaches to Big Data Processing for Physical-Cyber-Social Applications", In: Proceedings of AAAI 2013 Fall Symposium on Semantics for Big Data, Arlington, Virginia, November 15-17, 2013.