Obvio

From Knoesis wiki
Revision as of 09:10, 4 February 2016 by Jibril (Talk | contribs) (RS-DFO Hypothesis)


Obvio (Spanish for "obvious") is a graph-based framework for exploring biomedical literature to facilitate Literature-Based Discovery (LBD) based on rich knowledge representations. Its broader goal is to uncover hidden and complex associations between concepts in biomedical texts. To achieve this, Obvio uses several tools and resources developed at the National Library of Medicine (NIH-NLM), including MetaMap, SemRep, MEDLINE, SemMedDB, MeSH, UMLS, BKR and the UMLS Semantic Navigator. Obvio has been used to rediscover 8 out of 9 existing discoveries from the scientific literature. The project encapsulates the PhD dissertation<ref>D. Cameron, A Context-Driven Subgraph Model for Literature-Based Discovery, Ph.D. Thesis, Wright State University, 2014</ref> (video on YouTube) by Delroy Cameron, presented on August 18, 2014.

People

Graduate Students: Delroy Cameron, Swapnil Soni, Nishita Jaykumar, Vishnu Bompally
External Collaborators: Thomas C. Rindflesch, Ramakanth Kavuluru, Olivier Bodenreider
Faculty: Amit P. Sheth (Advisor), Krishnaprasad Thirunarayan
Past Members: Pablo N. Mendes, Tu Danh, Sreeram Vallabhaneni, Hima Yalamanchili, Drashti Dave

PhD Dissertation

Obvio was presented as the core of the PhD dissertation by Delroy Cameron in Summer 2014. Videos of the dissertation defense and the dissertation proposal are available on YouTube. The dissertation presentation is also available on SlideShare.


Overview

Obvio is driven by assertions extracted from biomedical literature (called semantic predications) as well as statements obtained from structured knowledge sources (such as the UMLS and MeSH). Semantic predications are extracted from MEDLINE using SemRep and made publicly available through the Semantic MEDLINE Database (SemMedDB). These predications support various tasks, including 1) Information Retrieval, 2) Question Answering (QA), 3) Document Summarization, and 4) Literature-Based Discovery (LBD). Obvio uses the semantic predications specifically for Question Answering and LBD.
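As a minimal sketch of what a semantic predication looks like as data, the snippet below models one as a subject-predicate-object triple and parses the plain "Subject PREDICATE Object" strings used in the scenario tables on this page. The `Predication` type, the `parse_predication` helper and the `PREDICATES` set are illustrative assumptions; they are not the SemRep output format or the SemMedDB schema.

```python
from typing import NamedTuple


class Predication(NamedTuple):
    """A subject-predicate-object assertion (hypothetical helper type)."""
    subject: str
    predicate: str
    obj: str


# Predicate labels that appear in the scenario tables on this page.
PREDICATES = {
    "INHIBITS", "CAUSES", "TREATS", "STIMULATES", "AFFECTS", "ISA",
    "COEXISTS_WITH", "ASSOCIATED_WITH", "DISRUPTS", "PREVENTS",
    "REGULATES", "EXHIBITS",
}


def parse_predication(text: str) -> Predication:
    """Split 'Subject PREDICATE Object' at the first known predicate token."""
    tokens = text.split()
    for i, token in enumerate(tokens):
        # The predicate must have at least one token on each side.
        if token in PREDICATES and 0 < i < len(tokens) - 1:
            subject = " ".join(tokens[:i])
            obj = " ".join(tokens[i + 1:])
            return Predication(subject, token, obj)
    raise ValueError(f"no known predicate in: {text!r}")


example = parse_predication("Dietary Fish Oils INHIBITS Platelet Aggregation")
print(example.subject, "|", example.predicate, "|", example.obj)
```

Keeping predications as plain triples makes it straightforward to load them into a graph, with concepts as nodes and predicates as edge labels.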

Question Answering

Reachability

Semantic predications were first used in Obvio for biomedical QA based on data from the 2006 TREC Challenge. The approach was based on the notion of reachability: determining whether documents that answer complex biomedical questions could be meaningfully connected using assertions from the literature. When semantic predications alone proved insufficient, structured background knowledge was used to gain additional insights for connecting biomedical texts. The presentation below, together with our paper<ref>D. Cameron, R. Kavuluru, O. Bodenreider, P. N. Mendes, A. P. Sheth, K. Thirunarayan, Semantic Predications for Complex Information Needs in Biomedical Literature, 5th International Conference on Bioinformatics and Biomedicine (BIBM2011), Atlanta GA, November 12-15, 2011 (acceptance rate=19.4%)</ref> in BIBM 2011 on applying predications and background knowledge for QA, provides more details on this approach.
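The notion of reachability can be illustrated with a small sketch. It is a concept-level simplification, not the document-level method of the BIBM 2011 paper: predications are treated as directed edges, and a breadth-first search checks whether one concept can be reached from another. The function names `build_graph` and `is_reachable` are assumptions made for this example; the sample predications are taken from the Scenario 1 table on this page.

```python
from collections import defaultdict, deque


def build_graph(predications):
    """Build a directed adjacency map from (subject, predicate, object) triples."""
    graph = defaultdict(set)
    for subject, _predicate, obj in predications:
        graph[subject].add(obj)
    return graph


def is_reachable(graph, start, goal):
    """Breadth-first search: True if goal can be reached from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        # .get avoids creating empty entries in the defaultdict.
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


sample = [
    ("Dietary Fish Oils", "INHIBITS", "Platelet Aggregation"),
    ("Platelet Aggregation", "CAUSES", "Raynaud Syndrome"),
]
graph = build_graph(sample)
print(is_reachable(graph, "Dietary Fish Oils", "Raynaud Syndrome"))
```

Edges here are directed from subject to object, so the reverse query (from Raynaud Syndrome to Dietary Fish Oils) is not reachable in this sample.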

Literature-Based Discovery (LBD)

Rediscovery

The semantic predications were subsequently used for Literature-Based Discovery (LBD). Specifically, they were used to determine whether existing knowledge from scientific literature could be effectively recovered. We developed a graph-based approach that was successfully applied to rediscover and decompose Don R. Swanson's Raynaud Syndrome - Dietary Fish Oils Hypothesis (RS-DFO) from 1986. Much of the early research aimed at rediscovering Swanson's hypotheses focused on distributional statistics and Information Retrieval (IR) techniques, such as term and concept co-occurrence, to find intermediates. Only recently has significant attention been devoted to semantics-based techniques that exploit the meaning of associations between concepts. While generally more intuitive, the feasibility of such semantics-based approaches has not been fully established. Our article published in JBI<ref>D. Cameron, O. Bodenreider, H. Yalamanchili, T. Danh, S. Vallabhaneni, K. Thirunarayan, A. P. Sheth, T. C. Rindflesch, A Graph-Based Recovery and Decomposition of Swanson’s Hypothesis using Semantic Predications, Journal of Biomedical Informatics 46(2): 238-251, (2013). ScienceDirect, PMID </ref> shows that semantics-based techniques can be used effectively to recover and decompose Swanson's Raynaud Syndrome-Fish Oil hypothesis using semantic predications, background knowledge and graph algorithms. It is reasonable to expect that if semantics-based techniques are adequate for rediscovering existing knowledge, they ought to be sufficient for discovering new knowledge.
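To make the idea of finding intermediates concrete, here is a hedged sketch of a two-hop search over predications: given a source and a target, it returns every intermediate concept B for which a predication links the source to B and another links B to the target. This is a simplified illustration, not the graph algorithm from the JBI article; `find_intermediates` is a hypothetical helper, and the sample triples come from the Scenario 1 and Scenario 2 tables on this page.

```python
def find_intermediates(predications, source, target):
    """Return sorted (intermediate, predicate_1, predicate_2) tuples such that
    source -predicate_1-> intermediate -predicate_2-> target."""
    # First hop: everything the source points to, with the linking predicates.
    from_source = {}
    for subject, predicate, obj in predications:
        if subject == source:
            from_source.setdefault(obj, []).append(predicate)

    # Second hop: keep first-hop concepts that also point to the target.
    results = []
    for subject, predicate, obj in predications:
        if obj == target and subject in from_source:
            for first_predicate in from_source[subject]:
                results.append((subject, first_predicate, predicate))
    return sorted(results)


sample = [
    ("Dietary Fish Oils", "INHIBITS", "Blood Viscosity"),
    ("Blood Viscosity", "CAUSES", "Raynaud Syndrome"),
    ("Dietary Fish Oils", "INHIBITS", "Platelet Aggregation"),
    ("Platelet Aggregation", "CAUSES", "Raynaud Syndrome"),
    ("Magnesium", "INHIBITS", "Vasoconstriction"),  # unrelated to this query
]

for row in find_intermediates(sample, "Dietary Fish Oils", "Raynaud Syndrome"):
    print(row)
```

Unlike co-occurrence counts, each result carries the predicates on both hops, so the meaning of the association (for example INHIBITS followed by CAUSES) remains available for interpretation.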

RS-DFO Hypothesis

The following presentation gives more details about the approach for knowledge rediscovery and decomposition. Various datasets and experimental results are also provided.

Datasets and Experimental Results
  1. Dataset
    1. Baseline (B1)
      1. Original PDFs of the 65 articles cited by Swanson's RS-DFO paper (30.5MB)
      2. ASCII text with end-of-line text wrapping fixed
      3. Text in Medline format for parsing by SemRep
      4. SemRep Relations Output
      5. SemRep Relations Output (vascular reactivity)
      6. SemRep Extracted Predications
      7. Manually Identified Predications (vascular reactivity)
    2. Baseline (B2)
      1. Titles and abstracts of the 65 articles cited by Swanson's RS-DFO paper in Medline format for parsing by SemRep
      2. SemRep Relations Output
      3. SemRep Relations Output (vascular reactivity)
      4. SemRep Extracted Predications
      5. Manually Identified Predications (vascular reactivity)
  2. Experimental Results
    1. Association-Subgraph Comparisons (Experiment I)
    2. Association-Subgraph Comparisons (Experiment II)
    3. All Generated Subgraphs (Experiments I & II)


The main limitation of the approach for rediscovery and decomposition using semantic predications is that subgraphs were created manually. To address this, an approach was developed to automatically cluster paths based on a specification of context. The next section provides details on this approach for automatic subgraph creation.

Automatic Subgraph Creation

Following our experiments on knowledge rediscovery, the semantic predications were used to automatically generate subgraphs that capture complex associations between two concepts (i.e., closed discovery)<ref>D. Cameron, R. Kavuluru, T. C. Rindflesch, A. P. Sheth, K. Thirunarayan, O. Bodenreider, Context-Driven Automatic Subgraph Creation for Literature-Based Discovery. Journal of Biomedical Informatics 54: 141-157 (2015) </ref>. We developed a method that creates complex associations in the form of subgraphs along different thematic dimensions of association between such concepts. The generated subgraphs were shown to facilitate the rediscovery of 8 out of 9 existing scientific discoveries, including the RS-DFO scenarios from our article in JBI. Each rediscovery scenario is covered in detail in the following tables. The associations from each subgraph in each rediscovery scenario can be explored using our live web application at http://knoesis-hpco.cs.wright.edu/obvio/, and a video demo is also available online.
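The general idea of grouping paths into subgraphs by shared context can be sketched as follows. This is not the published method: it is a deliberately simple greedy grouping that uses Jaccard overlap between sets of context terms, and the path labels, context terms and the 0.5 threshold are all hypothetical values chosen for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_paths(path_contexts, threshold=0.5):
    """Greedily group paths whose context overlaps an existing cluster.

    path_contexts maps a path label to a set of context terms.
    Returns a list of clusters, each a list of path labels.
    """
    clusters = []  # each entry: {"paths": [...], "context": set(...)}
    for path, context in path_contexts.items():
        for cluster in clusters:
            if jaccard(context, cluster["context"]) >= threshold:
                cluster["paths"].append(path)
                cluster["context"] |= context  # grow the cluster's context
                break
        else:
            # No sufficiently similar cluster: start a new one.
            clusters.append({"paths": [path], "context": set(context)})
    return [cluster["paths"] for cluster in clusters]


# Hypothetical context terms for three paths.
paths = {
    "path_a": {"Blood Platelets", "Hemostasis"},
    "path_b": {"Blood Platelets", "Hemostasis", "Thrombosis"},
    "path_c": {"Vasoconstriction"},
}
print(cluster_paths(paths))
```

In this toy example the first two paths share most of their context and fall into one group, while the third forms a group of its own, which corresponds to what the legend on this page calls a singleton.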

Legend

  Not Found
  Found
  subgraph x: x is the subgraph number
  singleton y: y is the singleton number, where a singleton is a subgraph consisting of only one path
  zero rarity singleton: a single-path subgraph (or singleton) whose concepts never occur together in any article in MEDLINE
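The zero rarity condition in the legend can be expressed as a small check. The sketch below assumes one possible reading of the definition, namely that no single article mentions all concepts on the path; the `is_zero_rarity` helper and the article identifiers are hypothetical and do not come from MEDLINE.

```python
def is_zero_rarity(path_concepts, articles_by_concept):
    """True if no single article mentions every concept on the path.

    articles_by_concept maps a concept name to the set of article IDs
    that mention it. Concepts missing from the map are treated as
    appearing in no article.
    """
    if not path_concepts:
        return False  # an empty path has no concepts to compare
    article_sets = [articles_by_concept.get(c, set()) for c in path_concepts]
    shared = set.intersection(*article_sets)
    return not shared


# Hypothetical article IDs, for illustration only.
articles = {
    "Dietary Fish Oils": {1, 2},
    "Blood Viscosity": {2, 3},
    "Raynaud Syndrome": {3, 4},
}

path = ["Dietary Fish Oils", "Blood Viscosity", "Raynaud Syndrome"]
print(is_zero_rarity(path, articles))
```

In this toy data each adjacent pair of concepts shares an article, yet no article contains all three, so the path counts as zero rarity under the assumed reading.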


Scenario 1
Source: Dietary Fish Oils
Target: Raynaud Syndrome
Details: Cut-off Date: November 1985; By: Don R. Swanson; Article: (PubMed)

Intermediate | Association | Status
Blood Viscosity | Dietary Fish Oils INHIBITS Blood Viscosity; Blood Viscosity CAUSES Raynaud Syndrome | zero rarity singleton15
Platelet Aggregation | Dietary Fish Oils INHIBITS Platelet Aggregation; Platelet Aggregation CAUSES Raynaud Syndrome | subgraph1
Vascular Reactivity | Dietary Fish Oils INHIBITS Vasoconstriction; Vasoconstriction CAUSES Raynaud Syndrome |

Scenario 2
Source: Magnesium
Target: Migraine
Details: Cut-off Date: April 1987; By: Don R. Swanson; Article: (PubMed Central)

Intermediate | Association | Status
Calcium Channel Blockers | Magnesium ISA Calcium Channel Blocker; Calcium Channel Blockers TREATS Migraine | subgraph22
Epilepsy | Magnesium AFFECTS Epilepsy; Epilepsy COEXISTS_WITH Migraine | subgraph9
Hypoxia | Magnesium INHIBITS Hypoxia; Hypoxia ASSOCIATED_WITH Migraine |
Inflammation (Brain Edema, Hydrocephalus) | Magnesium INHIBITS Inflammation; Inflammation CAUSES Migraine | zero rarity singleton3
Platelet Activity | Magnesium INHIBITS Platelet Aggregation; Platelet Aggregation CAUSES Migraine | subgraph1
Prostaglandins | Magnesium STIMULATES Prostaglandins; Prostaglandins DISRUPTS Migraine Disorders | subgraph4
Stress/Type A Personality | Stress INHIBITS Magnesium; Stress ASSOCIATED_WITH Migraine |
Serotonin | Magnesium INHIBITS Serotonin; Serotonin CAUSES Migraine | subgraph1
Cortical Depression | Magnesium INHIBITS Spreading Cortical Depression; Spreading Cortical Depression CAUSES Migraine |
Substance P | Magnesium INHIBITS Substance P; Substance P CAUSES Migraine |
Vascular Mechanisms | Magnesium INHIBITS Vasoconstriction; Vasoconstriction CAUSES Migraine | subgraph9

Scenario 3
Source: Somatomedin C
Target: Arginine
Details: Cut-off Date: April 1989; By: Don R. Swanson; Article: (PubMed Central)

Intermediate | Association | Status
Growth Hormone | Arginine STIMULATES Growth Hormone; Growth Hormone STIMULATES Somatomedins | subgraph5
Body Weight (body mass) | Somatomedins (IGF1) STIMULATES Growth; Arginine STIMULATES Growth | subgraph7
Malnutrition | Somatomedins TREATS Malnutrition; Arginine TREATS Malnutrition | subgraph7
Wound Healing (NK activity) | Somatomedin STIMULATES Wound Healing; Arginine STIMULATES Wound Healing |

Scenario 4
Source: Indomethacin
Target: Alzheimer’s Disease
Details: Cut-off Date: July 1995; By: Neil R. Smalheiser/Don R. Swanson; Article: (J. Neurol)

Intermediate | Association | Status
Acetylcholine | Indomethacin INHIBITS Acetylcholine; Acetylcholine CAUSES Alzheimer's Disease | subgraph4
Lipid peroxidation | Indomethacin INHIBITS Lipid peroxidation; Lipid peroxidation CAUSES Alzheimer's Disease | subgraph2
M2-muscarinic | Indomethacin INHIBITS M2-muscarinic; M2-muscarinic CAUSES Alzheimer's Disease |
Membrane Fluidity | Indomethacin INHIBITS Membrane Fluidity; Membrane Fluidity CAUSES Alzheimer's Disease |
Lymphocytes | Indomethacin STIMULATES natural killer T-Cell Activity; T-Cell Activity INHIBITS Alzheimer's Disease | subgraph14
Thyrotropin | Indomethacin STIMULATES Thyrotropin; Thyrotropin AFFECTS Alzheimer's Disease | zero rarity singleton20
T-lymphocytes (T-Cells) | Indomethacin STIMULATES T-lymphocytes; T-lymphocytes Activity INHIBITS Alzheimer's Disease | subgraph3

Scenario 5
Source: Estrogen
Target: Alzheimer’s Disease
Details: Cut-off Date: July 1995; By: Neil R. Smalheiser/Don R. Swanson; Article: (PubMed)

Intermediate | Association | Status
Antioxidant activity | Estrogen INHIBITS Antioxidant activity; Antioxidant activity CAUSES Alzheimer's Disease | subgraph4
Apolipoprotein E (ApoE) | Estrogen INHIBITS ApoE; ApoE CAUSES Alzheimer's Disease | subgraph3
Calbindin D28k | Estrogen REGULATES Calbindin D28k; Calbindin D28k AFFECTS Alzheimer's Disease | subgraph4
Cathepsin D | Estrogen STIMULATES Cathepsin D; Cathepsin D PREVENTS Alzheimer's Disease |
Cytochrome C oxidase subunit III | Estrogen STIMULATES Cytochrome C oxidase subunit III; Cytochrome C oxidase subunit III AFFECTS Alzheimer's Disease |
Glutamate | Estrogen STIMULATES Glutamate; Glutamate AFFECTS Alzheimer's Disease |
Receptor Polymorphism | Estrogen EXHIBITS Receptor Polymorphism; Receptor Polymorphism AFFECTS Alzheimer's Disease |

Scenario 6
Source: Calcium-Independent PLA2
Target: Schizophrenia
Details: Cut-off Date: 1997; By: Neil R. Smalheiser/Don R. Swanson; Article: (PubMed)

Intermediate | Association | Status
Oxidative stress | Oxidative Stress INHIBITS Calcium-Independent PLA2; Oxidative stress CAUSES Schizophrenia | singleton2
Selenium | Selenium INHIBITS Calcium-Independent PLA2; Selenium PREVENTS Schizophrenia | singleton2
Vitamin E | Vitamin E INHIBITS Calcium-Independent PLA2; Vitamin E PREVENTS Schizophrenia | singleton2

Scenario 7
Source: Chlorpromazine
Target: Cardiac Hypertrophy (Cardiomegaly)
Details: Cut-off Date: 2002; By: Jonathan D. Wren; Article: (PubMed)

Intermediate | Association | Status
Calcineurin | Chlorpromazine INHIBITS Calcineurin; Calcineurin CAUSES Cardiac Hypertrophy | subgraph5
Isoproterenol | Chlorpromazine INHIBITS Isoproterenol; Isoproterenol CAUSES Cardiomegaly | subgraph12

Scenario 8
Source: Testosterone
Target: Sleep
Details: Cut-off Date: 2011; By: Christopher M. Miller/Thomas C. Rindflesch; Article: (PubMed)

Intermediate | Association | Status
Cortisol/Hydrocortisone | Testosterone INHIBITS Hydrocortisone; Hydrocortisone DISRUPTS Sleep | subgraph7

Scenario 9
Source: Diethylhexyl phthalate (DEHP)
Target: Sepsis
Details: Cut-off Date: 2013; By: Michael J. Cairelli/Thomas C. Rindflesch; Article: (PubMed Central)

Intermediate | Association | Status
PParGamma | DEHP STIMULATES PParGamma; PParGamma INHIBITS Sepsis |


Demos

Web Application: http://knoesis-hpco.cs.wright.edu/obvio/
Video Demo: http://bit.ly/obviodemo
 

Publications

<references/>


SWLBD Workshop

Kno.e.sis and the National Library of Medicine (NLM) organized the First International Workshop on the Role of Semantic Web in Literature-Based Discovery (SWLBD2012) in conjunction with the IEEE Conference on Bioinformatics and Biomedicine (BIBM2012) in Philadelphia PA, USA.

  • Due date for full workshop paper submission: August 6, 2012
  • Notification of paper acceptance to authors: August 28, 2012
  • Camera-ready version of accepted papers: September 4, 2012
  • Workshop: October 4, 2012

Internal

Obvio Web App
Automatic Subgraph Creation
Recovery and Decomposition
Reachability

Contact: Delroy Cameron