Jan. 22, 2013 | © 2013 Modus Operandi, Inc. | 1
The Evolution of Semantic Technologies: The Value of Merging Smart Data With Big Data
Eric Little, PhD
VP – Chief Scientist
Who is Modus Operandi?
Privately held small business headquartered in Melbourne, FL.
• Satellite locations in Aberdeen, MD and Ft. Huachuca, AZ.
• 82% of employees possess a security clearance.
U.S. Government is our primary customer
• Expanding into select commercial markets
IT’s common challenge
TOO BIG - Too much data or too many variables
SILOED - Data is in legacy silos so nothing is integrated
LOST EXPERTISE - SME info is lost in people’s heads
NO EXCHANGE - No good processes for data exchange
NO VIZ - No good ways to visualize data
NO QUALITATIVE - Cannot use statistical tools to get qualitative answers
DIRTY DATA - Too many errors in the data
NO RULES - No way to capture business rules without big coding effort
NO VOCAB - No good vocabularies exist to capture data elements
MANY MODELS - Too many data models to be controlled effectively
MODUS APPROACH
Rather than build one type of technology, we realize the need for an end-to-end platform to provide solutions for our customers
The “Cognitive Evolution” of Intelligent Software
Semantic technologies are part of an IT evolution from code to data centricity
• In the Code-Centric years, data was often stored in flat files with no structure, while complex, procedural “edit” programs contained all knowledge about the data
• The creation of databases, specifically Network and RDBMS, was one of the first steps leading to Data-Centric evolution
• The last decade has seen standards such as XML, RDF, Web services, and now OWL, that further evolve IT to a Data-Centric environment
Big data and scalability are now helping to shape semantic tech at large scale.
Big data science
Retrieval at Scale is most important
New & Expanding Tech Areas
The past few years have seen a significant rise in new tech fields
• data science, big data analytics, semantic technologies, natural language processing, graph computing, and systemics
These areas provide new paradigms for data analysis and integration
These are driving new innovations in the ways people can access and use their data.
Innovation is Key in These Types of Tech Spaces
The idea seems straightforward and easy
• But it is difficult to find true spots of Blue Ocean
Requires new approaches that are taken from numerous disciplines
Small businesses need to compete by focusing and being disruptive
• Being disruptive involves the counterintuitive approach of focusing on specific market segments
• Requires an ability to be nimble and respond quickly to needs (iterative prototyping)
• Every wave is different – reading the wave is key
Semantic Approach Improves Data Access
Traditional Approach
Database Experts
Domain Experts & Scientists
Systems Engineers
Management & Executives
• Manual Data Correlation
• Manual Report Generation (High Potential for Error)
Semantic Approach
Domain Experts & Scientists
Systems Engineers
Management & Executives
Ontology Engine
• Integrated Classifications/Schemas
• Automated Reasoning Capabilities (Significant Error Reduction)
Semantic Approach Simplifies Queries
Traditional Approach
Database Experts
Reasoning is done on the user side for each query
Query Must Contain:
1. Data Requirements
2. All Logic Required to Relate the Data (Rules, Joins, Decode, Sub-queries, etc.)
Complexity: HIGH
Reusability: LOW-MED
Semantic Approach
Database Experts
Scientists, Systems Engineers
Management & Executives
Reasoning is performed by Ontobroker within the system
Query Must Contain:
1. Data Requirements only
Complexity: LOW
Reusability: HIGH (Logic embedded in Model)
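A minimal sketch of what "logic embedded in the model" can look like, assuming a toy class hierarchy. The class names, order IDs, and helper functions below are invented for illustration and are not Ontobroker's API. The caller states only the data requirement (a class name), and the subclass rules stored with the model do the relating.

```python
# Hypothetical mini-ontology: every name here is an illustrative assumption.
SUBCLASS_OF = {
    "Analgesic": "Drug",
    "Acetaminophen": "Analgesic",
    "Codeine": "Analgesic",
}

# Hypothetical instance data: each item is typed with its most specific class.
INSTANCE_OF = {
    "order-1": "Acetaminophen",
    "order-2": "Codeine",
    "order-3": "Drug",
}


def ancestors(cls):
    """Return cls plus every superclass reachable through SUBCLASS_OF."""
    seen = [cls]
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        seen.append(cls)
    return seen


def instances_of(cls):
    """Answer 'which items are a <cls>?' using the subclass rules in the model."""
    return sorted(item for item, c in INSTANCE_OF.items() if cls in ancestors(c))


# The query names only the class wanted; no joins or rules are restated here.
analgesics = instances_of("Analgesic")
drugs = instances_of("Drug")
```

Adding a new subclass to `SUBCLASS_OF` changes the answers without touching any query, which is the reusability point the slide makes.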
Building Semantic Profiles From Raw Data
• Key data elements are identified – creating a lexicon of important terms
• Data elements are categorized into appropriate classes – ranges are captured for autoclassification
• Can be applied to any type of data elements: equipment, reports, products, processes, etc.
• Advanced logics allow for reasoning over data sets such that new patterns and information can be gained
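The range-based autoclassification step can be sketched as follows. This is an assumption-laden illustration: the class names, the pressure ranges, and the `classify` helper are hypothetical and do not come from the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassRange:
    """A class in the profile plus the value range that defines membership."""
    name: str
    low: float   # inclusive lower bound
    high: float  # exclusive upper bound


# Hypothetical profile for an equipment pressure reading (units are arbitrary).
PRESSURE_PROFILE = [
    ClassRange("LowPressure", 0.0, 30.0),
    ClassRange("NormalPressure", 30.0, 80.0),
    ClassRange("HighPressure", 80.0, 200.0),
]


def classify(value, profile):
    """Return the first class whose range contains value, or None if none does."""
    for entry in profile:
        if entry.low <= value < entry.high:
            return entry.name
    return None


readings = [12.0, 30.0, 79.9, 250.0]
labels = [classify(r, PRESSURE_PROFILE) for r in readings]
```

A reading that falls outside every captured range comes back as `None`, which flags it for review rather than forcing it into a class.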
Utilizing Semantics to Integrate Disparate Medical Data
807862 ACETAMINOPHEN (TYLENOL) LEVEL
855406 Acetaminophen (Tylenol) Level
2273960 Acetaminophen (Tylenol) Level
7543180 ACETAMINOPHEN (TYLENOL) LEVEL
253965 Acetaminophen [ Tylenol]
512270 Acetaminophen + Codeine
3016745 Acetaminophen + Codeine
6075682 Acetaminophen + Codeine
6327790 Acetaminophen + Codeine
1688184 Acetaminophen + Codeine (120mg-12mg/5ml) (NF) Liquid
1701785 Acetaminophen + Codeine (120mg-12mg/5ml) (NF) Liquid
3967939 Acetaminophen + Codeine (120mg-12mg/5ml) (NF) Liquid
7271363 Acetaminophen - 325 mg PO q4h PRN Temp >101
7881183 Acetaminophen - 650 mg PO q6h PRN Pain
64654 ACETAMINOPHEN SUPP 325 MG SUPP
4851508 ACETAMINOPHEN SUPP 325 MG SUPP
9870184 Acetaminophen Tab 325MG
679752 ACETAMINOPHEN TAB 325 MG TAB
1715007 ACETAMINOPHEN TAB 325 MG TAB
2292336 ACETAMINOPHEN TAB 325 MG TAB
3914196 ACETAMINOPHEN TAB 325 MG TAB
6768031 ACETAMINOPHEN TAB 325 MG TAB
8163956 ACETAMINOPHEN TAB 325 MG TAB
9629590 ACETAMINOPHEN TAB 325 MG TAB
6802504 Acetaminophen (160mg/5ml) Suspension
Hospital 1 Data
Hospital 2 Data
• Disparate data sources can be ingested by the system and automatically classified into their appropriate class, attributes, etc.
• The models only need to be developed initially with the help of medical SMEs (as opposed to continuous point-to-point mappings with traditional systems).
Common Data Model
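As a rough illustration of classifying disparate labels into one common model, the sketch below maps four of the labels shown on this slide to a (drug, form, strength) tuple. The dose-form lexicon, the regular expression, and the `to_common_model` helper are assumptions made for this example; they do not describe the actual Modus Operandi models, which the slide says are built with medical SMEs.

```python
import re
from collections import defaultdict

# Labels and source-system IDs copied from the slide.
RAW_ROWS = [
    (64654, "ACETAMINOPHEN SUPP 325 MG SUPP"),
    (9870184, "Acetaminophen Tab 325MG"),
    (679752, "ACETAMINOPHEN TAB 325 MG TAB"),
    (6802504, "Acetaminophen (160mg/5ml) Suspension"),
]

# Hypothetical lexicon mapping surface tokens to dose-form classes.
FORM_LEXICON = {"tab": "Tablet", "supp": "Suppository", "suspension": "Suspension"}

# Matches "325 MG", "325MG", or "160mg/5ml" regardless of case.
STRENGTH = re.compile(r"(\d+)\s*mg(?:/(\d+)\s*ml)?", re.IGNORECASE)


def to_common_model(label):
    """Map one free-text label to a (drug, form, strength) tuple."""
    tokens = re.findall(r"[a-z]+", label.lower())
    drug = tokens[0].capitalize()  # assumes the drug name is the first word
    form = next((FORM_LEXICON[t] for t in tokens if t in FORM_LEXICON), None)
    match = STRENGTH.search(label)
    strength = None
    if match:
        strength = f"{match.group(1)} mg"
        if match.group(2):
            strength += f"/{match.group(2)} ml"
    return (drug, form, strength)


# Group source IDs under the common-model record they resolve to.
grouped = defaultdict(list)
for source_id, label in RAW_ROWS:
    grouped[to_common_model(label)].append(source_id)
```

Here the differently spelled tablet labels ("Tab 325MG" and "TAB 325 MG TAB") land under the same record, so each new source needs a mapping to the model rather than to every other source.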
Classification Schemas Must Reflect Subject Matter Expertise
• SMEs are often ill-equipped to capture their knowledge semantically
• Knowledge can be captured in ontologies (as attributes, advanced relationships, etc.) – but this requires a separate skill set
• Multiple ontologies can be integrated to capture enterprise-wide applications for advanced business intelligence
Federated Ontology Layers Allow for Advanced Data Modeling
Putting It All Together Into A Platform
Unstructured Outcomes Data
Structured Data
Customizable User Interfaces
Ontology Engine
BIG DATA – NOW THAT YOU HAVE SEMANTICS, HOW TO SCALE…
The Problem of Big Data is Real (And Closing In)
The past couple of decades have been spent on data gathering and storage
Most Data Stores were not built to get data out
The new push is connecting data
New high-performance systems are required to meet those needs
• Data solutions must be big, smart, and easy to deploy
Big Data Analytics Challenge for Intelligent Systems
Data analysis in many spaces requires near-real-time decision support tools.
Connecting the dots is paramount to successful and effective analysis
This requires new techniques that combine robust data modeling and linkage (e.g., graphs) with high-performance computing capability
Capturing Complex Data Is Difficult
People are now attempting to utilize their data like never before.
• Semantics has shown significant promise but has not scaled well in the past.
• Entities, attributes, locations, temporal signatures, etc. result in data explosions
Breakthroughs in cloud computing and high performance graph stores are providing new ways to innovate data science.
Multiple users can now apply perspectives
Can be driven to an entire enterprise
Built on Standards-based Approaches
Scaling semantics
Semantics has not scaled well in the past
Entities, inferred data, facets, over time, with quality attribution,… = a data explosion
Our newest breakthroughs in cloud computing and high performance graph stores allow semantics-at-scale
BIG GRAPHS
Scaling semantics (cont.)
Enterprise-level graph computing requires cutting edge technology components
Data Ingest at Cloud scale – must be able to ingest millions of entities and thousands of documents per second. (Modus Operandi Wave Engine)
Data Storage (Triple Stores and Cloud DBs)
• 60 billion triples, sub-second queries, thousands of unstructured docs processed per second
Data Traversal (High-performance UI’s) – app stores and BI tools to provide a diverse user experience
High-performance triple store
Semantic Reasoner
Graph-based Appliances
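For readers new to triple stores, here is a toy in-memory sketch of the idea: facts are stored as (subject, predicate, object) triples and indexed so that pattern queries avoid scanning everything. The `TripleStore` class, its two indexes, and the sample predicates are assumptions for illustration only; production stores such as those named later in the deck use far more elaborate, disk-backed and distributed indexing.

```python
from collections import defaultdict


class TripleStore:
    """Toy in-memory store indexed by subject and by (predicate, object)."""

    def __init__(self):
        self._by_subject = defaultdict(set)
        self._by_pred_obj = defaultdict(set)

    def add(self, s, p, o):
        """Insert one triple into both indexes."""
        self._by_subject[s].add((s, p, o))
        self._by_pred_obj[(p, o)].add((s, p, o))

    def match(self, s=None, p=None, o=None):
        """Return sorted triples matching the pattern; None acts as a wildcard."""
        if s is not None:
            candidates = self._by_subject.get(s, set())
        elif p is not None and o is not None:
            candidates = self._by_pred_obj.get((p, o), set())
        else:
            # No usable index for this pattern: fall back to a full scan.
            candidates = set().union(*self._by_subject.values())
        return sorted(
            t for t in candidates
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        )


# Hypothetical facts, named only for this example.
store = TripleStore()
store.add("order-1", "hasDrug", "Acetaminophen")
store.add("order-1", "orderedAt", "Hospital1")
store.add("order-2", "hasDrug", "Acetaminophen")

by_drug = store.match(p="hasDrug", o="Acetaminophen")
about_order_1 = store.match(s="order-1")
```

Keeping more than one index trades memory for lookup speed, which is one reason triple counts and query latency are quoted together on the slide.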
Easy UI’s Leverage Common Models
Our user interfaces are designed around common use models
EASY SMART BIG
HDFS / Hadoop / MapReduce
Accumulo Key Value Store
Semantic search
Geospatial views
Semantic Wiki – collaborate
Timelines
Explore
Visualizations
Large-scale semantic triple stores with reasoning
Driving the Knowledge to Multiple Users
Combining software tools in innovative ways allows for multiple users to view the same data at once.
• These technologies are providing new platforms that are driving new ways to utilize advanced analytics like never before
Information can be driven to multiple users in near-real-time for improved decision support
End Users
Visualizing patterns
Correlations, associations and patterns require special purpose visualizations
Our ExtJS/Ozone framework enables fast assembly of point solutions
Pattern recognition leads to prediction
Prediction leads to prevention
Presenting Big Data Results in a Highly Intuitive UI is Key
Complex data does not require complex UIs
• Many new tech innovations involve simple, intuitive front ends (apps)
Users must be able to quickly manipulate information
Must be able to quickly derive answers
Different technologies must be integrated into a common look and feel
Providing an End-to-End Solution
Many companies our size provide a capability or two
• Modus Operandi provides a complete platform for a multitude of user applications (and growing).
Information can be ingested from nearly any source (structured, semi-structured or unstructured).
• Common models such as UMLS, Ucore-SL, BFO, etc.
• Custom models can be created based on project specifics.
Information is stored in a high-performance graph knowledge base (we can integrate numerous ones – currently using Bigdata, Rya and AllegroGraph).
Results can be driven to a wide variety of easy-to-use UI’s that can be highly customized to fit user needs.
Smart + Big + Easy provides a new means to successfully apply semantic technologies to large scale graph computing.