Copyright © 2014 Tapestry Solutions, Inc. All rights reserved.
Role of Ontology in ‘Big Data’
Jens Pohl, PhD
Monday 28 July, 2014
TAPESTRY / MIRO – PROPRIETARY
Copyright © 2014 Tapestry Solutions, Inc. / Miro Technologies, Inc. All rights reserved.
Big Data & Ontologies
Origins of Big Data
Global Services & Support | Training Systems and Government Services | Logistic Information Management Systems
By-product of the evolution of larger and more complex societal structures.
Result of the exponential increase in data due to global connectivity.
Big Data is not a completely new phenomenon.
Planning is a Critical Tool
Expectation that the plans will be effective.
Decisions must be made in a timely manner.
Forecasts must be at least reasonably accurate.
Organizational complexity generates a need for efficiency through planning.
Planning is Predictive
Plans are based on assumptions.
Assumptions are predictive in nature.
Forecasting future conditions and events based on past experience is problematic.
Planning and forecasting are closely related.
Forecasts are Mostly Wrong
Western Union Executive – 1876:
"The telephone has too many shortcomings to be seriously considered as a means of communication."
Lord Kelvin – 1895:
"Heavier-than-air flying machines are not possible."
Thomas Watson, IBM Chairman – 1943:
"I think in the world there is a market for maybe five computers."
Ken Olsen – 1977:
"There is no reason for individuals to have a computer in their home."
Bill Gates – 1981:
"640K ought to be enough for anybody."
Robert Metcalfe (inventor of the Ethernet):
"The Internet will catastrophically collapse in 1996."
Early Data Analysis Problems
We rely largely on the analysis of past events to identify future trends.
Periodic collection of population census data (every 10 years in the US).
Collection of data is time consuming, but the analysis of the data is even more onerous.
The 1880 US census took 8 years to process.
Traditional Big Data Analysis
Inferential Statistics
1. Collection of very small sample.
2. Representativeness ensured by randomness.
3. Mathematical analysis of sample.
4. Predictions about entire corpus of data.
Traditional Big Data Analysis
1. Hypothesis based on theory(s).
2. Collection of representative data sample.
3. Correlation analysis of random sample.
4. Testing of hypothesis (and data).
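The four inferential-statistics steps can be illustrated with a minimal sketch. It is not taken from the slides: the synthetic corpus, the function name `estimate_mean`, the sample size, and the normal-approximation margin are all assumptions chosen for illustration.

```python
import random
import statistics


def estimate_mean(corpus, sample_size, seed=0):
    """Estimate the corpus mean from a small random sample.

    Returns (estimate, margin), where margin is an approximate 95%
    half-width based on the normal approximation (an assumption).
    """
    rng = random.Random(seed)
    sample = rng.sample(corpus, sample_size)       # steps 1-2: small random sample
    estimate = statistics.fmean(sample)            # step 3: analysis of the sample
    std_error = statistics.stdev(sample) / sample_size ** 0.5
    return estimate, 1.96 * std_error              # step 4: statement about the corpus


# Hypothetical corpus: 100,000 synthetic measurements.
_gen = random.Random(42)
corpus = [_gen.gauss(50.0, 10.0) for _ in range(100_000)]

estimate, margin = estimate_mean(corpus, sample_size=1_000)
print(f"estimated mean: {estimate:.2f} +/- {margin:.2f}")
print(f"true mean:      {statistics.fmean(corpus):.2f}")
```

Only 1% of the corpus is inspected here; the estimate is useful only to the extent that the sample really is random, which is the point of step 2.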
New Big Data Analysis Approach
Correlation: If A occurs with B, then we can predict that A is likely to occur wherever B occurs; i.e., B is a proxy for A.
The analysis is based on a data set that is essentially equivalent to the entire corpus of data.
Assumption: Any data domain changes are gradual and not abrupt.
Assumption: The corpus of data is continuously extended with new domain data.
The correlation is a probabilistic likelihood and not an absolute certainty.
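As a minimal sketch of "B is a proxy for A", the following compares how often A occurs overall with how often it occurs when B is present. The function name `proxy_strength` and the observation counts are hypothetical, and the result is a likelihood over the observed data, not a certainty.

```python
def proxy_strength(observations):
    """Compare P(A) with P(A | B) over observed (a_occurred, b_occurred) pairs.

    Returns (p_a, p_a_given_b). A much larger second value suggests that
    B is a useful proxy for A within this data set.
    """
    obs = list(observations)
    n_b = sum(1 for a, b in obs if b)
    if not obs or n_b == 0:
        raise ValueError("need at least one observation in which B occurs")
    n_a = sum(1 for a, b in obs if a)
    n_ab = sum(1 for a, b in obs if a and b)
    return n_a / len(obs), n_ab / n_b


# Hypothetical counts: 100 observations in total.
observations = (
    [(True, True)] * 40      # A and B together
    + [(False, True)] * 10   # B without A
    + [(True, False)] * 5    # A without B
    + [(False, False)] * 45  # neither
)

p_a, p_a_given_b = proxy_strength(observations)
print(f"P(A) = {p_a:.2f}, P(A | B) = {p_a_given_b:.2f}")
```

The calculation says nothing about why A and B occur together, and it stays valid only while the two assumptions above hold: the domain changes gradually and new data keeps arriving.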
Blind correlation through brute force computation
Massive Data
Automated extraction of meaning
What!
Particular Knowledge
Why!
From Data to Knowledge
LOW VOLUME, HIGH VALUE
KNOWLEDGE (INTERPRETATIONS AND RULES)
INFORMATION (RICH IN RELATIONSHIPS)
PURPOSEFUL DATA (ORGANIZED)
LOW LEVEL DATA (UNORGANIZED)
HIGH VOLUME, LOW VALUE
Wasteful Use of Human Resources
The human computer user must interpret and manipulate data by adding context.
Context
Data without context cannot be automatically interpreted by computers.
Knowledge
Information
Organized and Unorganized Data
Fundamental Distinctions
DATA are numbers and words without relationships.
rain airport 6 to dense clear for not Scotland in hours field 117 pilot expected 49 Glasgow railcar fog week 82
INFORMATION is numbers and words with relationships.
"…dense fog in Glasgow, Scotland, not expected to clear for 6 hours…"
KNOWLEDGE comprises inferences derived from information.
Aircraft bound for Glasgow International Airport are likely to be rerouted or delayed.
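The distinction between data, information, and knowledge can be sketched in a few lines. This is an illustrative sketch only: the `WeatherReport` fields and the rule inside `infer_flight_impact` are assumptions built around the Glasgow example, not part of any system described in the slides.

```python
from dataclasses import dataclass

# DATA: numbers and words without relationships.
tokens = ["fog", "6", "Glasgow", "dense", "hours", "Scotland"]


@dataclass(frozen=True)
class WeatherReport:
    """INFORMATION: the same values, now related to one another."""
    condition: str
    intensity: str
    city: str
    country: str
    min_duration_hours: int


def infer_flight_impact(report):
    """KNOWLEDGE: an inference derived from information (hypothetical rule)."""
    if report.condition == "fog" and report.intensity == "dense":
        return f"Aircraft bound for {report.city} are likely to be rerouted or delayed."
    return f"No weather-related disruption is expected for {report.city}."


report = WeatherReport("fog", "dense", "Glasgow", "Scotland", 6)
print(infer_flight_impact(report))
```

The list of tokens cannot be interpreted by a program at all; the inference becomes possible only once the values are held in a structure that records how they relate.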
Principal Context Components
History
Time
Location
Environment
Identity
Culture
Urgency
Activity
"...any information that characterizes the interaction of entities (i.e., players and objects), within a given situation…"
Context as an Enabler
Context is a prerequisite for …
1. Automated interpretation of data.
2. Automated filtering of data.
3. Automated retrieval of useful data.
4. Intelligent collaborative decision tools.
5. Self-healing and secure information networks.
6. Responsive human-computer interfaces.
Virtual Model of Real World Context
Ontology Representation
Defines the innate nature and operational context within which the actual values of entities can be accurately interpreted.
Rich Relationships
Logic (Business Rules)
Modeling Patterns and Techniques
Provides: Semantic context.
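A minimal sketch of how relationships and business rules might sit together in one model follows. The `Ontology` class, the class hierarchy, the triples, and the stowage rule are all hypothetical; a production ontology would more likely use a dedicated modeling language and tooling.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    classes: dict = field(default_factory=dict)    # class name -> parent class (or None)
    relations: set = field(default_factory=set)    # (subject, predicate, object) triples
    rules: list = field(default_factory=list)      # callables: ontology -> list of findings

    def is_a(self, cls, ancestor):
        """Walk up the class hierarchy to test for a (transitive) subclass link."""
        while cls is not None:
            if cls == ancestor:
                return True
            cls = self.classes.get(cls)
        return False

    def evaluate(self):
        """Apply every business rule and collect the findings."""
        return [finding for rule in self.rules for finding in rule(self)]


def hazardous_items_share_compartment(onto):
    """Hypothetical rule: flag compartments holding more than one hazardous item."""
    by_place = {}
    for subj, pred, obj in sorted(onto.relations):
        if pred != "stowed_in":
            continue
        hazardous = any(
            onto.is_a(cls, "HazardousCargo")
            for s, p, cls in onto.relations
            if s == subj and p == "instance_of"
        )
        if hazardous:
            by_place.setdefault(obj, []).append(subj)
    return [
        f"{place}: {', '.join(items)}"
        for place, items in sorted(by_place.items())
        if len(items) > 1
    ]


onto = Ontology(
    classes={"Cargo": None, "HazardousCargo": "Cargo", "Vehicle": "Cargo"},
    relations={
        ("item-1", "instance_of", "HazardousCargo"),
        ("item-2", "instance_of", "HazardousCargo"),
        ("item-3", "instance_of", "Vehicle"),
        ("item-1", "stowed_in", "hold-2"),
        ("item-2", "stowed_in", "hold-2"),
        ("item-3", "stowed_in", "hold-2"),
    },
    rules=[hazardous_items_share_compartment],
)
print(onto.evaluate())
```

The raw values ("item-1", "hold-2") carry no meaning on their own; the class hierarchy and relationships supply the semantic context that lets the rule interpret them.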
Ontology Construction
Ontology of Real World Context
Human-Computer Partnership
Ontology provides context for automated reasoning by software agents.
Human Context
Computer Context
Organized Data
Information
Knowledge
Unorganized Data
Ontology
Data capture in context
Load-Planning from a Data Viewpoint
The common parameters of Big Data are Volume, Velocity and Variety (3Vs).
Volume: Over 30,000 cargo items per ship.
Velocity: From one load-plan in two days to four load-plans in two hours.
Variety: Over 300 attributes per cargo item, 320 ships, 250 aircraft configurations, and over 15,000 railcars.
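The figures above imply some simple arithmetic, worked through below. Two assumptions are made: "two days" is read as 48 elapsed hours, and the "over" figures are treated as lower bounds, so the results are rough lower-bound estimates rather than measurements.

```python
# Volume x Variety: attribute values to be handled for a single ship.
items_per_ship = 30_000        # "over 30,000 cargo items per ship"
attributes_per_item = 300      # "over 300 attributes per cargo item"
attribute_values_per_ship = items_per_ship * attributes_per_item

# Velocity: elapsed hours per load-plan, before and after.
hours_per_plan_before = 2 * 24  # one load-plan in two days (assumed 48 hours)
hours_per_plan_after = 2 / 4    # four load-plans in two hours
speedup = hours_per_plan_before / hours_per_plan_after

print(f"attribute values per ship: at least {attribute_values_per_ship:,}")
print(f"load-planning speedup: about {speedup:.0f}x")
```

Under these assumptions a single ship involves at least nine million attribute values, and the planning rate improves by roughly two orders of magnitude.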
ICODES-GS v6: Overview
ICODES v6 Portal
Evolution of ICODES: 2011: Fielding of ICODES GS
Within a Collaborative Information Workspace (CIW), ICODES GS becomes a set of intelligent reusable services, with user-transparent data exchange capabilities, which are accessed through a single sign-on portal.
ICODES GS: Applications and Services
ICODES-GS v6: SLP
Single Load Planner (SLP)
ICODES - SLP: Knowledge Domains
The core of ICODES GS is its knowledge-base of context and business rules that allow agents to automatically interpret data changes and provide useful assistance to the operator.
ICODES knowledge domains include:
ICODES user-interfaces include:
ICODES: Ontology-Based Multi-Agent System
Hazard Agent
Hatches Agent
Doors Agent
Openings Agent
Access Agent
Ramps Agent
USER
MULTI-MEDIA
Trim & Stability Agent
Layered Ontology
CAD Engine
Cranes Agent
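The idea of agents that interpret data changes against shared context can be sketched as follows. This is not the ICODES implementation: the class names reuse two agent names from the slide purely as labels, and the data model, the `review` method, and both rules are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CargoItem:
    name: str
    hazardous: bool
    height_m: float


@dataclass
class Compartment:
    name: str
    door_height_m: float
    items: list


class HazardAgent:
    """Hypothetical rule: warn when several hazardous items share a compartment."""

    def review(self, compartment):
        hazardous = [i.name for i in compartment.items if i.hazardous]
        if len(hazardous) > 1:
            return [f"Hazard: {', '.join(hazardous)} share {compartment.name}"]
        return []


class DoorsAgent:
    """Hypothetical rule: warn when an item is taller than the compartment door."""

    def review(self, compartment):
        return [
            f"Doors: {i.name} exceeds door height of {compartment.name}"
            for i in compartment.items
            if i.height_m > compartment.door_height_m
        ]


def on_change(compartment, agents):
    """Let every agent re-examine the compartment after a data change."""
    alerts = []
    for agent in agents:
        alerts.extend(agent.review(compartment))
    return alerts


hold = Compartment("hold-2", door_height_m=4.0, items=[
    CargoItem("fuel", hazardous=True, height_m=1.2),
    CargoItem("ammo", hazardous=True, height_m=1.0),
    CargoItem("crane-truck", hazardous=False, height_m=4.5),
])
alerts = on_change(hold, [HazardAgent(), DoorsAgent()])
for alert in alerts:
    print(alert)
```

Each agent covers one narrow knowledge domain and only advises; the decision about what to do with the alerts stays with the operator.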
ICODES: Ship Load-Planning User-Interface
Tool Bars
Agent Status Bar
Graphics Window
Status Bar
Main Menu Bar
Associations Toolbar
ICODES GS: Four Domains – Aircraft, Ship, Rail and Yards
Aircraft Ship
Rail Yards
ICODES-GS v6: DC
Data Cleanser (DC)
A set of components that serve as the source of reference data for ICODES v6 and provide the operator with the capability to validate, correct, and automatically populate cargo data.
• Problem: Incorrect or partial user input.
• Solution: Validate and auto-fill using reference data.
• MARVEL AES-based solution
Diagram: ICODES 6 Reference Data; Data Cleanser Service; ICODES 6 Applications & Services; two arrows labeled "reads".
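The validate-and-auto-fill idea can be sketched as a lookup against reference data. Everything here is hypothetical: the `REFERENCE_DATA` table, the field names, the `TRUCK-A` model, and the `cleanse` function are illustrations, and the sketch does not model the MARVEL AES-based solution mentioned above.

```python
# Hypothetical reference records, keyed by cargo model identifier.
REFERENCE_DATA = {
    "TRUCK-A": {"description": "Medium truck", "length_in": 274, "weight_lb": 22_000},
}


def cleanse(record, reference=REFERENCE_DATA):
    """Validate a cargo record against reference data.

    Returns (cleaned_record, issues). Missing fields are auto-filled and
    conflicting values are replaced by the reference value; every change
    is reported so the operator can review it.
    """
    ref = reference.get(record.get("model"))
    if ref is None:
        return dict(record), [f"unknown model: {record.get('model')!r}"]

    cleaned, issues = dict(record), []
    for key, ref_value in ref.items():
        value = record.get(key)
        if value in (None, ""):
            cleaned[key] = ref_value
            issues.append(f"auto-filled {key}")
        elif value != ref_value:
            cleaned[key] = ref_value
            issues.append(f"corrected {key}: {value!r} -> {ref_value!r}")
    return cleaned, issues


# Partial and partly incorrect operator input.
cleaned, issues = cleanse({"model": "TRUCK-A", "length_in": 999, "weight_lb": None})
print(cleaned)
print(issues)
```

Whether a conflicting value should be overwritten silently, flagged, or left for the operator to resolve is a design choice; this sketch overwrites and reports.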
ICODES-GS v6: DC Web-Application
ICODES-GS v6: IR
Information Repository (IR)
ICODES v6 Components
• Web Service
• Define and list categories of data
• Publish data to a category
• Remove data from a category
• Browse, search, and retrieve data residing in a category
• Standalone Application
Provides a user interface that enables an end-user to utilize the IR Web Service capabilities.
• Embeddable Components
Provides a unified software library that allows ICODES v6 applications to present standardized dialogs for import and export.
Enterprise Users
Desktop Users
SLP
CB
IR Standalone Application
EIP Environment
SLP, CE, BBT, and CB
DB
IR Service
ICODES v6 components that provide operators, services, and applications with a centralized location for sharing data in support of user collaboration.
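The Web Service capabilities listed for the IR (define and list categories, publish, remove, browse, search, retrieve) can be sketched as an in-memory class. This is an illustration only: the class name, method names, and the example category and file names are assumptions, and the real IR is a web service backed by a database rather than a Python dictionary.

```python
class InformationRepository:
    """In-memory stand-in for a category-based shared data store."""

    def __init__(self):
        self._categories = {}   # category name -> {item name: payload}

    def define_category(self, category):
        self._categories.setdefault(category, {})

    def list_categories(self):
        return sorted(self._categories)

    def publish(self, category, name, payload):
        # Raises KeyError if the category has not been defined.
        self._categories[category][name] = payload

    def remove(self, category, name):
        del self._categories[category][name]

    def browse(self, category):
        return sorted(self._categories[category])

    def search(self, category, text):
        """Case-insensitive substring match on item names."""
        return [n for n in self.browse(category) if text.lower() in n.lower()]

    def retrieve(self, category, name):
        return self._categories[category][name]


repo = InformationRepository()
repo.define_category("load-plans")
repo.publish("load-plans", "ship-alpha-v1", {"status": "draft"})
repo.publish("load-plans", "ship-alpha-v2", {"status": "final"})
repo.publish("load-plans", "rail-yard-7", {"status": "draft"})
repo.remove("load-plans", "ship-alpha-v1")

print(repo.list_categories())
print(repo.search("load-plans", "SHIP"))
print(repo.retrieve("load-plans", "ship-alpha-v2"))
```

A single shared store with named categories gives several applications and operators one place to exchange data, which is the collaboration role described for the IR.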