codemotion
DESCRIPTION
Fast Data is a different approach to Big Data: it manages large quantities of “in-flight” data to help organizations get a jump on business-critical decisions. The difference between Big Data and Fast Data is comparable to the difference between waiting for a movie to download from an online store and playing a DVD instantly. Data Mining is the process of extracting information from a data set and transforming it into an understandable structure in order to deliver predictive, advanced analytics to enterprises and operational environments. The combination of Fast Data and Data Mining is changing the “Rules”.
Copyright © 2013, Oracle and/or its affiliates. All rights reserved.!1
Fast Data Mining: Real Time Knowledge Discovery for Predictive Decision Making
Nino Guarnacci [email protected]
Data Explosion
Web & social networks experienced it first…
Infographic by Go-gulf.com
… but enterprises are now facing it too
Utilities deploying smart meters? 200x the information flowing to the data center!
• Services and web transaction data (to refine recommendations, detect trends, etc.)
• “Sensor” data:
  • GPS in mobile phones
  • RFIDs
  • NFC
  • Smart meters
  • Etc.
• Log file monitoring and analysis
• Security monitoring
93% believe their organization is losing revenue as a result of not being able to fully leverage information
67% · 89%
executives who say drawing intelligence from data is top priority
executives who would grade themselves C or lower in preparedness
Source: Oracle Research Study - From Overload to Impact: An Industry Scorecard on Big Data Business Challenges, July 2012
Obstacles to Managing Data Faster – Latency Gap
While Ensuring Accuracy, Efficiency, and Scale
[Figure: Business Value plotted against Action Time. Milestones after the business event: data captured, analysis completed, action taken. Annotations: “Fragmented event entities”, “The Gap”.]
Source: Richard Hackathorn’s Components of Action Time
What is Fast Data?
Turning High Velocity Data into Value
▪ It’s about getting more from in-flight data
▪ It’s about faster action, faster insights
▪ It’s about running your business in real-time
Oracle Fast Data Approach
Filter, Move, Transform, Analyze, and Act at High Velocity
[Figure stages: FILTER & CORRELATE, MOVE & TRANSFORM, ANALYZE, ACT]
Oracle Fast Data Approach
Filter, Move, Transform, Analyze, and Act at High Velocity
FILTER & CORRELATE
[Figure labels: In-Memory Data Grid, Network Status, Real Time Streams, Information]
• Parallel multiple streams: JMS, files, Coherence, DB, ...
• Different object types: text, Java objects, ...
• High throughput for data aggregation and event querying
The Coherence Data Grid holds the data and computes in parallel.
Oracle Fast Data Approach
Filter, Move, Transform, Analyze, and Act at High Velocity
EPN (Event Processing Network) Elements
[Figure labels: Event Streams (HTTP Pub/Sub, JSON), Adapter, Channel, Processor, Cache, POJO, Event-type]
<TRACE>
  <ID_TRACED_ENTITY>HH310665064IT</ID_TRACED_ENTITY>
  <TRACED_ENTITY>PACCO</TRACED_ENTITY>
  <WHAT_HAPPENED>ESI_SDA</WHAT_HAPPENED>
  <WHEN_HAPPENED>2013-09-12</WHEN_HAPPENED>
  <WHERE_HAPPENED_DETAIL>
    <OFFICE>
      <WHERE_DESCRIPTION>MONZA</WHERE_DESCRIPTION>
      <WHERE_ID>MZ</WHERE_ID>
    </OFFICE>
  </WHERE_HAPPENED_DETAIL>
</TRACE>
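As a rough illustration of how such a TRACE event might be turned into a record before it enters an event-processing flow, here is a minimal Python sketch. The sample document is the one shown on the slide; the function name `parse_trace` and the dictionary keys are assumptions made for this example, not part of any Oracle API.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Sample event as shown on the slide.
TRACE_XML = """<TRACE>
  <ID_TRACED_ENTITY>HH310665064IT</ID_TRACED_ENTITY>
  <TRACED_ENTITY>PACCO</TRACED_ENTITY>
  <WHAT_HAPPENED>ESI_SDA</WHAT_HAPPENED>
  <WHEN_HAPPENED>2013-09-12</WHEN_HAPPENED>
  <WHERE_HAPPENED_DETAIL>
    <OFFICE>
      <WHERE_DESCRIPTION>MONZA</WHERE_DESCRIPTION>
      <WHERE_ID>MZ</WHERE_ID>
    </OFFICE>
  </WHERE_HAPPENED_DETAIL>
</TRACE>"""


def parse_trace(xml_text):
    """Flatten one TRACE document into a plain dict (key names are hypothetical)."""
    root = ET.fromstring(xml_text)
    office = root.find("WHERE_HAPPENED_DETAIL/OFFICE")
    return {
        "id": root.findtext("ID_TRACED_ENTITY"),
        "entity": root.findtext("TRACED_ENTITY"),
        "what": root.findtext("WHAT_HAPPENED"),
        # WHEN_HAPPENED is an ISO date in the sample, so parse it as one.
        "when": date.fromisoformat(root.findtext("WHEN_HAPPENED")),
        "office_name": office.findtext("WHERE_DESCRIPTION"),
        "office_id": office.findtext("WHERE_ID"),
    }


event = parse_trace(TRACE_XML)
```

A flat record like this is easier to filter and correlate than the nested XML, which is the only reason for the flattening step here.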
Oracle Event Processing
SLA Detection: Pattern Matching
SELECT M.SLA_VIOLATED
FROM TRACE IN CHANNEL, ENTITIES, SPATIAL CONTEXT
MATCH_RECOGNIZE (
  MEASURES SLA_VIOLATED
  PATTERN (A B)
  DEFINE
    A (DELIVERY TIME - NOW) < 2 DAYS
    B DISTANCE BETWEEN (LOCATION, DESTINATION) > 600 KM
) AS M
[Figure labels: STREAMS, DATABASE, SPATIAL, TIME WINDOW, Match Pattern]
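The two conditions in the pattern-matching query (less than 2 days left before delivery, more than 600 km still to travel) can be sketched in Python as follows. This is a simplified reading, not the Oracle Event Processing implementation: it collapses the two-step pattern (A B) into one check per event, and it assumes great-circle (haversine) distance as the distance measure. The `Trace` class, its field names, and the coordinates are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in kilometres."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


@dataclass
class Trace:
    """Hypothetical, already-enriched trace event."""
    parcel_id: str
    location: tuple       # (lat, lon) where the parcel was last seen
    destination: tuple    # (lat, lon) of the delivery address
    delivery_due: datetime


def sla_violated(trace, now, max_remaining=timedelta(days=2), max_distance_km=600.0):
    """True when little time is left and the parcel is still far from its destination."""
    time_left = trace.delivery_due - now
    distance = haversine_km(*trace.location, *trace.destination)
    return time_left < max_remaining and distance > max_distance_km


now = datetime(2013, 9, 12, 12, 0)
# Approximate coordinates, chosen only for illustration.
far = Trace("HH310665064IT", (45.58, 9.27), (38.12, 13.36), datetime(2013, 9, 13, 12, 0))
near = Trace("HH310665064IT", (45.58, 9.27), (45.46, 9.19), datetime(2013, 9, 13, 12, 0))
```

Here `sla_violated(far, now)` flags the parcel, while `sla_violated(near, now)` does not, because the remaining distance is small.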
Oracle Event Processing
SLA Detection: Filtering & Correlation
ISTREAM(
  SELECT COUNT(*), START_OFFICE, WHERE_HAPPENED, LATITUDE, LONGITUDE
  FROM SPATIAL_CONTEXT SLA_VIOLATED_OUT_CHANNEL
  PARTITION BY START_OFFICE, WHERE_HAPPENED WITHIN 1 HOUR
  GROUP BY START_OFFICE
  HAVING COUNT(*) > 5
)
▪ Aggregate and correlate the received filter events
▪ Partition probable SLA violations by trip path
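The aggregation step above (count probable violations per trip path over a one-hour window and report paths with more than 5) might look like this in plain Python. It is a sketch under stated assumptions, not the CQL engine's behaviour: events are assumed to arrive in timestamp order, the window is keyed on the (start office, current office) pair used in the PARTITION BY clause, and the class name `ViolationCounter` is hypothetical.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


class ViolationCounter:
    """Sliding-window count of SLA violations per trip path."""

    def __init__(self, window=timedelta(hours=1), threshold=5):
        self.window = window
        self.threshold = threshold
        # (start_office, where_happened) -> timestamps still inside the window
        self._events = defaultdict(deque)

    def add(self, start_office, where_happened, ts):
        """Record one violation; return the count if it exceeds the threshold, else None."""
        timestamps = self._events[(start_office, where_happened)]
        timestamps.append(ts)
        # Evict timestamps that have slid out of the window.
        while timestamps and ts - timestamps[0] > self.window:
            timestamps.popleft()
        count = len(timestamps)
        return count if count > self.threshold else None


counter = ViolationCounter()
start = datetime(2013, 9, 12, 9, 0)
# Six violations on the same path within six minutes.
results = [counter.add("MZ", "MONZA", start + timedelta(minutes=i)) for i in range(6)]
```

Only the sixth call returns a count, which mirrors the `HAVING COUNT(*) > 5` condition.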
Oracle Fast Data Approach
Filter, Move, Transform, Analyze, and Act at High Velocity
Real-time stream analysis: correlate events from different sources, then manage and query them as relational data over sliding windows.
What is Oracle Data Mining?
Automatically sifting through large amounts of data to find previously hidden patterns, discover valuable new insights and make predictions
• Identify the most important factors (Attribute Importance)
• Predict customer behavior (Classification)
• Predict or estimate a value (Regression)
• Find profiles of targeted people or items (Decision Trees)
• Segment a population (Clustering)
• Find fraudulent or “rare events” (Anomaly Detection)
• Determine co-occurring items in a “basket” (Associations)
Data Mining Provides Better Information, Valuable Insights and Predictions
[Figure: Cell Phone Churners vs. Loyal Customers. Axes: Income, Customer Months.]
Insight & Prediction
Segment #1:
IF CUST_MO > 14 AND INCOME < $90K, THEN Prediction = Cell Phone Churner, Confidence = 100%, Support = 8/39
Segment #3:
IF CUST_MO > 7 AND INCOME < $175K, THEN Prediction = Cell Phone Churner, Confidence = 83%, Support = 6/39
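To make the Confidence and Support figures concrete, the sketch below evaluates one IF/THEN rule against a list of records. The data is synthetic and built only so that the ratios match Segment #3 (5 churners among 6 matching customers, out of 39 in total); it is not the data behind the slide, and the field names are hypothetical.

```python
def evaluate_rule(records, condition, predicted_label):
    """Return (confidence, support) for a rule of the form IF condition THEN label.

    confidence = matching records with the predicted label / matching records
    support    = matching records / all records
    """
    matched = [r for r in records if condition(r)]
    if not matched:
        return 0.0, 0.0
    correct = sum(1 for r in matched if r["label"] == predicted_label)
    return correct / len(matched), len(matched) / len(records)


# Synthetic population of 39 customers (illustrative values only).
records = (
    [{"cust_mo": 10, "income": 100_000, "label": "churner"}] * 5
    + [{"cust_mo": 10, "income": 100_000, "label": "loyal"}]
    + [{"cust_mo": 3, "income": 200_000, "label": "loyal"}] * 33
)

confidence, support = evaluate_rule(
    records,
    lambda r: r["cust_mo"] > 7 and r["income"] < 175_000,
    "churner",
)
```

With this data the rule matches 6 of 39 customers and is right for 5 of them, so confidence is 5/6 (about 83%) and support is 6/39.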
My credit card statement—Can you see the fraud?
May 22   1:14 PM  FOOD  Monaco Café   $127.38
May 22   7:32 PM  WINE  Wine Bistro    $28.00
…
June 14  2:05 PM  MISC  Mobil Mart     $75.00
June 14  2:06 PM  MISC  Mobil Mart     $75.00
June 15 11:48 AM  MISC  Mobil Mart     $75.00
June 15 11:49 AM  MISC  Mobil Mart     $75.00
May 28   6:31 PM  WINE  Acton Shop     $31.00
May 29   8:39 PM  FOOD  Crossroads    $128.14
June 16 11:48 AM  MISC  Mobil Mart     $75.00
June 16 11:49 AM  MISC  Mobil Mart     $75.00
• Monaco?
• Gas station?
• All same $75 amount?
• Pairs of $75?
• Total purchases exceed time period average
A Real Fraud Example
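One of the clues in the statement, pairs of identical charges at the same merchant a minute apart, is easy to express as a rule. The sketch below is a hand-written heuristic for illustration, not the anomaly-detection model discussed in the talk. The transactions come from the slide; the year (2013) and the 5-minute gap are assumptions.

```python
from datetime import datetime, timedelta


def find_rapid_repeats(transactions, max_gap=timedelta(minutes=5)):
    """Return pairs of charges with the same merchant and amount within max_gap."""
    pairs = []
    last_seen = {}  # (merchant, amount) -> most recent charge with that key
    for txn in sorted(transactions, key=lambda t: t["time"]):
        key = (txn["merchant"], txn["amount"])
        prev = last_seen.get(key)
        if prev is not None and txn["time"] - prev["time"] <= max_gap:
            pairs.append((prev, txn))
        last_seen[key] = txn
    return pairs


def txn(month, day, hour, minute, merchant, amount):
    # The slide gives no year; 2013 is assumed here.
    return {"time": datetime(2013, month, day, hour, minute),
            "merchant": merchant, "amount": amount}


statement = [
    txn(5, 22, 13, 14, "Monaco Café", 127.38),
    txn(5, 22, 19, 32, "Wine Bistro", 28.00),
    txn(5, 28, 18, 31, "Acton Shop", 31.00),
    txn(5, 29, 20, 39, "Crossroads", 128.14),
    txn(6, 14, 14, 5, "Mobil Mart", 75.00),
    txn(6, 14, 14, 6, "Mobil Mart", 75.00),
    txn(6, 15, 11, 48, "Mobil Mart", 75.00),
    txn(6, 15, 11, 49, "Mobil Mart", 75.00),
    txn(6, 16, 11, 48, "Mobil Mart", 75.00),
    txn(6, 16, 11, 49, "Mobil Mart", 75.00),
]

suspicious = find_rapid_repeats(statement)
```

On this statement the heuristic returns three pairs, all at Mobil Mart. A mined model would be expected to learn such regularities from data instead of having them coded by hand.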
“Essentially, all models are wrong, …but some are useful.”
- George Box (One of the most influential statisticians of the 20th century and a pioneer in the
areas of quality control, time series analysis, design of experiments and Bayesian inference.)
You Can Think of It Like This…
Traditional SQL
• “Human-driven” queries
• Domain expertise
• Any “rules” must be defined and managed
• SQL queries: SELECT, DISTINCT, AGGREGATE, WHERE, AND/OR, GROUP BY, ORDER BY, RANK
+
Oracle Data Mining
• Automated knowledge discovery, model building and deployment
• Domain expertise to assemble the “right” data to mine
• ODM “verbs”: PREDICT, DETECT, CLUSTER, CLASSIFY, REGRESS, PROFILE, IDENTIFY FACTORS, ASSOCIATE
Real-time Prediction for a Customer
• On-the-fly, single record apply with new data (e.g. from call center)
SELECT PREDICTION_PROBABILITY(CLAS_DT_5_2, 'Yes'
         USING 7800      AS bank_funds,
               125       AS checking_amount,
               20        AS credit_balance,
               55        AS age,
               'Married' AS marital_status,
               250       AS MONEY_MONTLY_OVERDRAWN,
               1         AS house_ownership)
FROM dual;
[Figure labels: Web, Branch, CRM, Call center, Social, Mobile, ECM, BI]
Predictive and Recommendation Analytics
Real Time Data Mining Modeling with Streaming Events
• Combine real-time event streaming data technologies with the industry-leading Oracle historical data mining:
  – Oracle Data Mining
• Rich set of algorithms for data mining
• Predict customer behavior
• Find profiles of targeted people or items, and determine important relationships
• Immediately predict trends and themes for data in motion
• Respond to prevent business threats and take advantage of opportunities
Source: http://www.sail-world.com/USA/Americas-Cup:-Oracle-Data-Mining-supports-crew-and-BMW-ORACLE-Racing/68834
Acting Oracle Data Mining: Technology Behind the America’s Cup Win
• “The USA holds 250 sensors to collect raw data: pressure sensors on the wing; angle sensors on the adjustable trailing edge of the wing sail to monitor the effectiveness of each adjustment, allowing the crew to ascertain the amount of lift it’s generating; and fiber-optic strain sensors on the mast and wing to allow maximum thrust without over bending them.
• But collecting data was only the beginning. ORACLE Racing also had to manage that data, analyze it, and present useful results…”
Fast Data Mining Demo: Fraud Prediction in action…
▪ Extract knowledge starting from a CSV file
▪ Execute Anomaly Detection mining on stored data
▪ Put in place a real-time event processing flow
▪ Consume events from the In-Memory Data Grid
▪ Instantly obtain fraud predictions from streaming data
Q & A
Thanks!
Fast Data Mining: Real Time Knowledge Discovery for Predictive Decision Making
Nino Guarnacci [email protected]