
CityPulse: Real-Time IoT Stream Processing and Large-scale Data Analytics for Smart City Applications


CityPulse is an EU FP7 project with a consortium of nine partners spanning academia, industry, and cities. The Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis) is a partner in the CityPulse project. CityPulse aims to provide an infrastructure for real-time processing of massive multimodal observations from a city, allowing citizens, city authorities, and developers to exploit this data to obtain timely information for informed decision making.

People are central to a city, and their reports of observations related to city infrastructure provide unprecedented opportunities to gain insight into how a city works. As a partner, Kno.e.sis leads the effort to leverage observations from people in a city to understand various aspects of city infrastructure. Specifically, we explore extracting and understanding events related to traffic infrastructure, because open data is available for this domain.

CityPulse Main Project Page


Traffic management is a challenging issue in most major cities around the world. As more people move to cities for economic opportunities, managing city resources is becoming a crucial problem. We take up this problem and study its characteristics in order to create solutions toward a grand vision of providing actionable information to city policy makers.

City as a Physical-Cyber-Social (PCS) System

A city may have machine sensors that monitor the Physical world and report their observations in the Cyber world. Citizens also report their own observations (Social) of various city-related activities. Real-world events manifest in physical, cyber, and social modalities. Algorithms that analyze city events using only a single modality may not capture the richness of the interactions in the physical world. Analytics algorithms need to consider observations from all three aspects (PCS) to reach a better understanding of the physical world, leading to informed actions by citizens and city authorities.

Challenges in Processing Physical-Cyber-Social Data

A city is a complex system of systems with many heterogeneous components. As the cost of sensors for monitoring our environment decreases, there are efforts to deploy sensors that monitor vehicular traffic. One exemplar of such an effort monitors traffic flow in the San Francisco Bay Area. These sensors monitor vehicular speed and volume through various road links. The observations are then used to derive the travel time for each link. The idea behind deploying these sensors is to capture the traffic flow variations that result from real-world events in a city, such as accidents, breakdowns, and bad weather.
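As a rough illustration of how a travel time can be derived from a speed observation, the sketch below divides a link's length by its average speed. It assumes the link length is known, and the function name and sample readings are hypothetical; the source does not say how the deployed system performs this derivation.

```python
def travel_time_minutes(link_length_miles: float, avg_speed_mph: float) -> float:
    """Estimate the time to traverse one road link from its average speed."""
    if avg_speed_mph <= 0:
        raise ValueError("average speed must be positive")
    return 60.0 * link_length_miles / avg_speed_mph


# Hypothetical readings for the same 1.5-mile link under two conditions.
free_flow = travel_time_minutes(1.5, 60.0)
congested = travel_time_minutes(1.5, 15.0)
```

A drop in average speed on a link shows up directly as a longer travel time, which is the kind of variation the sensors are meant to expose.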


Traffic-related observations are not limited to sensor observations (e.g., speed, volume). Citizens report traffic events on social streams such as Twitter. Such reports from citizens are often complementary to machine sensor data; e.g., an accident reported by a citizen complements the slow-moving traffic detected by sensors. We need algorithms that can extract traffic events from such heterogeneous streams.

Complexity of Interactions

Traffic events that span machine and citizen observations may interact in intricate ways, and we need to capture these interactions. They may be directed (some of them causal) or undirected (mere associations). A representation for understanding city events should be able to express both kinds of interaction.

Uncertainty of Interactions

The same kind of real-world event may have different effects; e.g., an accident during peak hours may have a stronger impact than an accident during off-peak hours. In special cases, an accident may cause no delay at all, depending on its location. Thus, the relationship between an accident and a traffic delay cannot be stated with certainty, e.g., as accident -- cause --> traffic delays. We can only state that an accident may cause, or most likely causes, traffic delays. A representation of city traffic events should be able to capture such uncertain relationships.
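One way to picture such an uncertain relationship is as a conditional probability rather than a fixed rule. The sketch below estimates P(delay | accident, peak hour) by relative frequency. The observations are invented for illustration and the function name is hypothetical; this is not the project's model.

```python
# Hypothetical observations: (accident reported?, peak hour?, delay observed?)
observations = [
    (True, True, True), (True, True, True), (True, True, True), (True, True, False),
    (True, False, True), (True, False, False), (True, False, False), (True, False, False),
    (False, True, True), (False, True, False), (False, True, False), (False, True, False),
]


def p_delay(accident: bool, peak: bool) -> float:
    """Estimate P(delay | accident, peak) as a relative frequency."""
    matching = [d for a, p, d in observations if a == accident and p == peak]
    if not matching:
        raise ValueError("no observations for this condition")
    return sum(matching) / len(matching)


peak_accident = p_delay(True, True)       # accident during peak hours
offpeak_accident = p_delay(True, False)   # accident during off-peak hours
```

With this toy data, the estimated chance of a delay after an accident is higher at peak hours than off-peak, and in neither case is it 1, which is the sense in which an accident "may cause" a delay.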


Traffic conditions, and the events that influence traffic, evolve continuously, which makes traffic dynamics hard to understand. Traffic dynamics refers to the variations in speed and travel-time observations over time; understanding traffic dynamics means explaining these variations using events reported on social data streams. We need algorithms that allow us to generate explanations for varying traffic dynamics.

Broader Research Goals

As outlined in the challenges above, our solution aims to address heterogeneity, complexity, and uncertainty in PCS systems. We propose a two-step process to move from multi-modal and multi-sensory observations in a PCS system to actionable information. First, we need to extract events from multi-modal and multi-sensory observations. Since we have both machine sensor (numerical) and citizen sensor (textual) observations, extracting events from PCS systems requires a deeper understanding of both. Next, we need to understand the interactions between various events. There are two levels to understanding these interactions: structure and parameters. Structure qualifies the possible interactions between events (which variables influence the variable of interest?). Parameters quantify the interactions between variables (by how much?).

We aim to propose a generic solution framework for the problem of extracting and understanding PCS events.

City Event Extraction

Extracting City Traffic Events from Social Streams: Cities are composed of complex systems with physical, cyber, and social components. Current work on extracting and understanding city events relies mainly on technology-enabled infrastructure to observe and record events. In this work, we propose an approach that leverages citizen observations of various city systems and services, such as traffic, public transport, water supply, weather, sewage, and public safety, as a source of city events. We investigate the feasibility of using such textual streams for extracting city events from annotated text. We formalize the problem of annotating social streams such as microblogs as a sequence labeling problem. We present a novel training data creation process for training sequence labeling models. Our automatic training data creation process utilizes instance-level domain knowledge (e.g., locations in a city, possible event terms). We compare this automated process to a state-of-the-art tool that needs manually created training data and show that it has comparable performance in annotation tasks. An aggregation algorithm is then presented for event extraction from annotated text. We carry out a comprehensive evaluation of the event annotation and event extraction on a real-world dataset consisting of event reports and tweets collected over four months from the San Francisco Bay Area. The evaluation results are promising and provide insights into the utility of social streams for city events [Anantharam-15].
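To make the idea of creating sequence-labeling training data from instance-level knowledge more concrete, here is a minimal sketch that assigns BIO labels to tokens by longest dictionary match. The two dictionaries, the label names, and the function `bio_tag` are hypothetical stand-ins; the actual process and models are described in [Anantharam-15], and this sketch does not reproduce them.

```python
# Hypothetical instance-level knowledge; a real system would load city-specific lists.
LOCATIONS = {("bay", "bridge"), ("market", "street")}
EVENT_TERMS = {("accident",), ("road", "closure")}


def bio_tag(tokens):
    """Assign BIO labels by longest dictionary match, scanning left to right."""
    lowered = [t.lower() for t in tokens]
    tags = ["O"] * len(tokens)
    phrase_sets = [("LOCATION", LOCATIONS), ("EVENT", EVENT_TERMS)]
    max_len = max(len(p) for _, phrases in phrase_sets for p in phrases)
    i = 0
    while i < len(tokens):
        matched = 0
        # Try the longest candidate span first so multi-word phrases win.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(lowered[i:i + n])
            label = next((name for name, phrases in phrase_sets if span in phrases), None)
            if label:
                tags[i] = f"B-{label}"
                for j in range(i + 1, i + n):
                    tags[j] = f"I-{label}"
                matched = n
                break
        i += matched or 1  # skip past a match, otherwise advance one token
    return tags


tokens = "Accident on Bay Bridge causing delays".split()
labels = bio_tag(tokens)
```

Token and label sequences produced this way could serve as automatically generated training examples for a sequence labeling model, with no manual annotation step.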

We use the Open Science Framework to share project resources with the research community. Here is the link for our project on the Open Science Framework.

A common problem in managing traffic for cities in developing countries such as India is the lack of basic automated instrumentation to track road conditions or vehicle locations. Still, to help citizens make informed travel decisions as city dynamics change, many cities have an authorized, city-initiated notification service in place to alert subscribing commuters about road conditions. Here, alternative means may be used to create informal textual notifications, e.g., inputs from field personnel, citizen updates, and pre-authorized events from the city calendar. We show that collections of such notifications, when processed with information extraction techniques, become a rich source of data for traffic managers. Specifically, we have explored the use of Short Message Service (SMS) notifications from the city of Delhi, India, with promising insights [Anantharam-13b, Anantharam-14].

City Event Understanding

Understanding speed and travel-time dynamics in response to various city-related events is an important and challenging problem. Sensor data (numerical) containing the average speed of vehicles passing through a road segment can be interpreted in terms of near real-time reports of traffic-related incidents from city authorities and social media data (textual), providing a complementary understanding of traffic dynamics. State-of-the-art research focuses on analyzing either sensor observations or citizen observations; we seek to exploit both in a synergistic manner. We demonstrate the role of domain knowledge in capturing the non-linearity of speed and travel-time dynamics by segmenting speed and travel-time observations into simpler components amenable to description using linear models such as the Linear Dynamical System (LDS). Specifically, we have explored the Restricted Switching Linear Dynamical System (RSLDS) to model normal speed and travel-time dynamics and thereby characterize anomalous dynamics. We have utilized the city traffic events extracted from text to explain anomalous dynamics. We carried out a large-scale evaluation of the proposed approach on a real-world traffic and Twitter dataset collected over a year, with promising results [Anantharam-16].
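The general idea of modeling normal dynamics with a linear model and treating large departures as anomalous can be sketched with a much simpler stand-in than RSLDS: a single scalar model x[t+1] = a * x[t] + b fitted by least squares. The speed values, the threshold, and the function names below are assumptions made for illustration; this is not the RSLDS formulation of [Anantharam-16].

```python
def fit_linear_dynamics(series):
    """Least-squares fit of x[t+1] = a * x[t] + b on a 'normal' speed series."""
    xs, ys = series[:-1], series[1:]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b


def flag_anomalies(series, a, b, threshold):
    """Return indices whose speed departs from the one-step prediction by more than threshold."""
    return [t + 1 for t in range(len(series) - 1)
            if abs(series[t + 1] - (a * series[t] + b)) > threshold]


# Hypothetical average speeds (mph) on one link during normal conditions.
normal = [60, 58, 55, 50, 45, 42, 45, 50, 55, 58, 60]
a, b = fit_linear_dynamics(normal)

# Hypothetical observations containing an abrupt slowdown and recovery.
observed = [58, 55, 52, 25, 22, 24, 50, 54]
anomalies = flag_anomalies(observed, a, b, threshold=10.0)
```

In this toy run the flagged indices are the abrupt drop and the abrupt recovery. Those are the points at which one would look for a textual event, such as an accident report, that could explain the departure from normal dynamics.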

Graphical models have been used successfully to deal with uncertainty, incompleteness, and dynamism in many domains. Models built from data, however, often ignore preexisting declarative knowledge about the domain in the form of ontologies and Linked Open Data (LOD), which is increasingly available on the web. In this paper, we present an approach that leverages such 'top-down' domain knowledge to enhance the 'bottom-up' building of graphical models. Specifically, we propose three operations on the graphical model structure that enrich it with nodes, edges, and edge directions. We illustrate the enrichment process using traffic data together with declarative knowledge from ConceptNet. The resulting enriched graphical model can potentially lead to better predictions of traffic delays [Anantharam-13].
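The three enrichment operations can be pictured on a graph stored as plain sets. The sketch below is a loose illustration under stated assumptions: the knowledge triples are hand-written examples in a (cause, relation, effect) form rather than anything retrieved from ConceptNet, and the function `enrich` is hypothetical, not the procedure of [Anantharam-13].

```python
def enrich(nodes, undirected, directed, knowledge):
    """Apply (cause, 'Causes', effect) triples to a graph structure held as sets."""
    nodes, undirected, directed = set(nodes), set(undirected), set(directed)
    for cause, relation, effect in knowledge:
        if relation != "Causes":
            continue
        nodes.update((cause, effect))                   # add any missing nodes
        undirected.discard(frozenset((cause, effect)))  # an existing association gets a direction
        directed.add((cause, effect))                   # add the directed edge
    return nodes, undirected, directed


# Structure learned 'bottom-up' from data: one undirected association.
nodes = {"accident", "delay"}
undirected = {frozenset(("accident", "delay"))}
directed = set()

# Hypothetical 'top-down' knowledge triples.
knowledge = [
    ("accident", "Causes", "delay"),
    ("bad weather", "Causes", "accident"),
]

new_nodes, new_undirected, new_directed = enrich(nodes, undirected, directed, knowledge)
```

Here the first triple orients the existing accident and delay association, while the second introduces a node and an edge that the data-driven structure lacked.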


Kno.e.sis/WSU Project Coordinator


Concurrent Projects