Twitter Tag: #briefr
The Briefing Room
Mission
• Reveal the essential characteristics of enterprise software, good and bad
• Provide a forum for detailed analysis of today’s innovative technologies
• Give vendors a chance to explain their product to savvy analysts
• Allow audience members to pose serious questions... and get answers!
Topics
This Month: ANALYTICS
February: BIG DATA
March: CLOUD
2014 Editorial Calendar at www.insideanalysis.com/webcasts/the-briefing-room
Analyst: John Myers
John Myers is Research Director of Business Intelligence at Enterprise Management Associates
SQLstream
• SQLstream is an enterprise software company focused on making businesses responsive to real-time Big Data assets
• Its platform provides a relational stream for analyzing large volumes of service, sensor, and machine and log file data
• SQL queries in SQLstream generate results continuously as data becomes available
Guests: Damian Black & Christian Lees
Damian Black, CEO, SQLstream
• Career in high tech, real-time software sector, with senior positions at HP, XACCT (now Amdocs) and Followap (now Neustar)
• Holds 11 US patents
• Finalist in the 1995 International Management Challenge
Christian Lees, CTO, InfoArmor
• Over 15 years of information security, network security and intrusion detection experience
• CTO of InfoArmor, with previous experience at Level 3 Communications, Trustwave and owner of Sage Technologies
| 10 Copyright © 2014 | +1 877 571 5775 | [email protected]
SQLstream: Real-time Big Data Platform
facts
o Launched 2009
o Deployments across many industries
o Real world benchmarks
capabilities
o Unstructured and structured data
o Accelerates and extends Hadoop & RDBMS
o Not only SQL
innovations
o Massively scalable streaming data platform
o Only standard SQL streaming engine
o Five patents for stream processing
Streaming Analytics from High-velocity Machine Data
Selected Customers & Partners
Internet of Things & Sensors
Intelligent Transportation
IT Operations
Telecommunications
Security Intelligence
Smarter Internet
Selected Strategic Partners
Bridging The Chasm
“As we move toward a real-time business environment, the capability to process data flows swiftly and flexibly will become increasingly important. SQLstream leads the industry in this kind of capability.”
Robin Bloor, Chief Analyst for Bloor Group
Business Intelligence
Post-hoc Analysis
Data Warehousing
Strategic insights
Operations
Transaction Processing
Machine Data
Everyday business
Operational Intelligence integrates Operations and BI
Operational Intelligence
• Optimizes tactical decisions from real-time actionable insights
• Combines operations data with BI data continuously
• Provides a real-time integrated view of the business and operations
Security Compliance
Fraud Quality
Promotion Advertising Cross-selling
The Information Value Chain
What is happening?
What might happen?
What just happened?
Make it happen!
Analytics previously meant High-latency
Current architectures
o Multi-stage processing
o Batch ETL
o Interim operational data stores
IMPACT
o High Cost of Ownership
o Delays to internal customers and consumers
o Delays to external customers and partners
WAREHOUSE
Near-term data storage
PLATFORMS
ETL
Streaming Analytics
Massively parallel with incremental evaluation
Enhancing with historical information
Storage of intermediate & final query results
Operational Intelligence
Logs
Sensors
Mobile
Networks
Wireless
Radio
M2M
Internet
Security gateways
¤ Continuous queries on unstructured & structured streaming data
¤ Incremental query results
¤ Predictive analytics & automated actions
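Incremental evaluation, as named on this slide, can be illustrated with a small Python sketch. It is not SQLstream's implementation: the class name `SlidingWindowAverage` and its interface are hypothetical, and it assumes events arrive in timestamp order. The point it shows is that each new event updates a running total instead of recomputing the aggregate over the whole window.

```python
from collections import deque


class SlidingWindowAverage:
    """Average over the last `window_seconds` of events, updated per event.

    Hypothetical illustration: assumes events arrive in timestamp order.
    """

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, value) pairs currently in the window
        self.total = 0.0       # running sum of the values in the window

    def add(self, timestamp, value):
        # Fold the new event into the running sum.
        self.events.append((timestamp, value))
        self.total += value
        # Evict events that have fallen out of the window and subtract them.
        while self.events[0][0] <= timestamp - self.window_seconds:
            _, old_value = self.events.popleft()
            self.total -= old_value
        # The event just added is always still inside the window.
        return self.total / len(self.events)
```

Each call to `add` does work proportional to the number of evicted events, not to the window size, which is the property that lets a continuous query emit a result for every arriving row.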
SQL: Where is the intelligence?
Transaction Log Details:
TRANS,2013-02-17-15:30:22,3458783,2347897953,128.56.0.253,STATUS:-15, DE69975, 4157588342
Twitter:
{"created_at:Thu Feb 17 15:30:55 +0000 2013,id:304612775055998976,id_str:304612775055998976,text:@MyServiceProvider today sucks, keeps dropped!,source:u006ca href=http:www.url.com rel=nofollow,followers_count:147,friends_count:10142, location: San Francisco, time_zone: Pacific, geo_enabled:true, location:u00dcT: -6.1987552,106.8661953, screen_name:APerson
Device Locations:
<id>1597831220</id><deviceid>0198873465</deviceid><lat>lat=47.643957</lat><lon>lon= -122.3269</lon><time>2013-02-17T15:37:26Z</time><bearing>223.4535</bearing>
<id>1597865781</id><deviceid>0198873465</deviceid><lat>lat=47.645982</lat><lon>lon=-122.327500</lon><time>2013-02-17T15:37:26Z</time><bearing>200.6138</bearing>
<id>1597940125</id><deviceid>0198873465</deviceid><lat>lat=47.647381</lat><lon>lon=-122.326501</lon><time>2013-02-17T15:37:26Z</time><bearing>87.4357</bearing>
Web Server Logs:
[Sun Feb 17 15:30:49 2013] [notice] srv-sfo-08 caught SIGTERM, shutting down
[Sun Feb 17 15:30:49 2013] [notice] Apache/2.2.21 -- resuming normal operations
CDRs:
TERMINATE,ctl09gsx,01299796304,GMT-08:00,02-17-13,15:21:00,9,387,64ms,02-17-13,15:30:55,0005, IP-TO-IP,4157588342,8775715775,1,0,4157588342,RD_AXY_NN0_001,SFR01AAG34,40.50.245.60, 234.234.60.75,65678,411,399,SIP,SANFRANCISCO,0x4B1698,0x0005E,0x49768,4157588342,0198873465
Timestamp (one in each of the five record types)
Mobile # Customer
Mobile # Device ID Term Reason
Device ID Location
Location
Service Provider
Fail Code
Server
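The field callouts on this slide (timestamp, mobile number, device ID) suggest that the first step is to pull comparable fields out of differently shaped records. A minimal Python sketch for two of the sample records follows. The function names and the returned dictionary keys are hypothetical, the field positions are inferred from the sample lines only, and treating the transaction timestamp as UTC is an assumption.

```python
import re
from datetime import datetime, timezone

# Sample records copied from the slide.
TRANSACTION_LINE = (
    "TRANS,2013-02-17-15:30:22,3458783,2347897953,128.56.0.253,"
    "STATUS:-15, DE69975, 4157588342"
)
LOCATION_RECORD = (
    "<id>1597831220</id><deviceid>0198873465</deviceid>"
    "<lat>lat=47.643957</lat><lon>lon= -122.3269</lon>"
    "<time>2013-02-17T15:37:26Z</time><bearing>223.4535</bearing>"
)


def parse_transaction(line):
    """Extract timestamp and mobile number from a comma-separated TRANS line.

    Assumes field 2 is the timestamp and the last field is the mobile number,
    and that the timestamp is UTC (not stated in the source).
    """
    fields = [field.strip() for field in line.split(",")]
    ts = datetime.strptime(fields[1], "%Y-%m-%d-%H:%M:%S")
    return {
        "source": "transaction",
        "timestamp": ts.replace(tzinfo=timezone.utc),
        "mobile": fields[-1],
    }


def parse_location(record):
    """Extract timestamp and device ID from a tagged device-location record."""
    device_id = re.search(r"<deviceid>(\d+)</deviceid>", record).group(1)
    ts_text = re.search(r"<time>([^<]+)</time>", record).group(1)
    ts = datetime.strptime(ts_text, "%Y-%m-%dT%H:%M:%SZ")
    return {
        "source": "location",
        "timestamp": ts.replace(tzinfo=timezone.utc),
        "device_id": device_id,
    }


if __name__ == "__main__":
    print(parse_transaction(TRANSACTION_LINE))
    print(parse_location(LOCATION_RECORD))
```

Once every source yields a normalized timestamp plus a key such as mobile number or device ID, the records can be ordered and joined in one stream. In the sample data, the CDR line carries both 4157588342 and 0198873465, which is what would link the two keys.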
Streaming Analytics Platform
CLEANING & FILTERING
STREAMING ANALYTICS
STREAMING AGGREGATION
CONTINUOUS INTEGRATION
QoE Rating Fraud
Monitoring Billing
Network Analysis
Log Sensors Mobile Networks M2M Radio towers
Real-time Architecture
Continuous Raw Data Ingestion, Integration, Analysis and Output of Derived Data in Real-time
Data Warehouse
Streaming Agent/Adapter Layer + JDBC API
Query Planner & Optimizer for MPP Execution SQL
Developer Tools
Platform Administration
Streaming SQL Real-time Applications
Real-time Dashboards & Visualization
Logs
Sensors
GPS
Networks Social Media Servers
M2M Telematics
Impala SQL
HBase
HDFS / MR
Hadoop for Stream Persistence, Enrichment & Replay (Optional)
External Data Warehouses & Systems
SQLstream s-Streaming Product Portfolio
s-Server: Data Management Platform for Streaming Big Data
s-Analyzer: Drag and Drop Application Builder for Streaming Analytics Applications
s-Transport: Geo-Analytics for Location-based Applications
s-Visualizer: Advanced Enterprise Visualization
s-Cloud: s-Server EC2 AMI Deployment
s-Studio: Developer & Admin Console
StreamApps: Fast Start Streaming Apps
Dashboards
CLOUD INFRASTRUCTURE MONITORING
Cloud infrastructure monitoring with Bollinger bands
BUSINESS NEED: Detect run-away applications before resource consumption becomes an issue.
SELECT STREAM ROWTIME, url, numErrorsLastMinute
FROM (
    SELECT STREAM ROWTIME, url, numErrorsLastMinute,
        AVG(numErrorsLastMinute) OVER lastMinute AS avgErrorsPerMinute,
        STDDEV(numErrorsLastMinute) OVER lastMinute AS stdDevErrorsPerMinute
    FROM ServiceRequestsPerMinute
    WINDOW lastMinute AS (PARTITION BY url RANGE INTERVAL '1' MINUTE PRECEDING)
) AS S
WHERE S.numErrorsLastMinute > S.avgErrorsPerMinute + 2 * S.stdDevErrorsPerMinute;
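The streaming query on this slide emits a row whenever a URL's error count rises above its upper Bollinger band, that is, the one-minute average plus two standard deviations. The Python sketch below mirrors that logic for readers who want to trace it step by step. It is an illustration only: the function name `detect_spikes` and the tuple layout are hypothetical, timestamps are assumed to be seconds in arrival order, and the use of the sample standard deviation for `STDDEV` is an assumption.

```python
from collections import defaultdict, deque
from statistics import mean, stdev

WINDOW_SECONDS = 60  # mirrors RANGE INTERVAL '1' MINUTE PRECEDING
NUM_STDDEVS = 2      # width of the upper band, as in the WHERE clause


def detect_spikes(rows):
    """Yield (timestamp, url, errors) rows that exceed the upper band.

    `rows` is an iterable of (timestamp_seconds, url, errors) in time order.
    """
    windows = defaultdict(deque)  # url -> recent (timestamp, errors) pairs
    for timestamp, url, errors in rows:
        window = windows[url]  # PARTITION BY url
        window.append((timestamp, errors))
        # Drop rows older than one minute; the current row stays in the window.
        while window[0][0] < timestamp - WINDOW_SECONDS:
            window.popleft()
        values = [value for _, value in window]
        if len(values) < 2:
            continue  # the sample standard deviation needs two points
        if errors > mean(values) + NUM_STDDEVS * stdev(values):
            yield timestamp, url, errors


if __name__ == "__main__":
    sample = [(t, "/checkout", 1) for t in range(0, 50, 10)]
    sample.append((50, "/checkout", 100))
    for spike in detect_spikes(sample):
        print(spike)
```

Because the current row is part of its own window, a single outlier only clears the band once the window holds enough baseline rows (five or more equal baseline values in this sketch), which is worth keeping in mind when choosing the window length.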
Customer Benchmarked Performance
Large network & telecom equipment manufacturer
SQLstream
[Diagram labels: Network Data (x5), Remote Agent (x5), ENRICH, SHARE, ANALYZE, Data Warehouse, External Systems, External Data]
PERFORMANCE STATISTICS
System Throughput: 1.35M events/sec
Server Configuration: 1 x 4-core CPU
Event Size: ~1KB
Data Sources: Many
SYSTEM CHARACTERISTICS
Collection: Intelligent Remote Agents (Distributed)
Enrichment: Streaming data augmentation
Analytics: Temporal & spatial pattern detection
Output: Data warehouse + applications (JDBC)
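As a rough back-of-envelope reading of these figures, the quoted throughput and event size imply the data rate below. The sketch takes "~1KB" as exactly 1,000 bytes and divides evenly across the four cores; both are simplifying assumptions, not figures from the benchmark.

```python
EVENTS_PER_SECOND = 1_350_000  # "1.35M events/sec" from the slide
EVENT_SIZE_BYTES = 1_000       # "~1KB", taken as 1,000 bytes (assumption)
CPU_CORES = 4                  # "1 x 4-core CPU"

# Implied ingest rate and an even per-core share of the event load.
bytes_per_second = EVENTS_PER_SECOND * EVENT_SIZE_BYTES
events_per_core = EVENTS_PER_SECOND / CPU_CORES

print(f"{bytes_per_second / 1e9:.2f} GB/s ingested")
print(f"{events_per_core:,.0f} events/sec per core")
```

Under those assumptions the benchmark corresponds to about 1.35 GB/s of input, or 337,500 events per second per core.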
Case study: Call Rating & Fraud
Veracity Networks
“SQLstream allows Veracity to provide vital real-time reports to our customers that previously took hours to create. SQLstream also provides real-time monitoring and insight into network concerns allowing Veracity to proactively address any such issues.”
Case study: fraud prevention (cont.)
Alerts
Triggers
Reports
STREAMING
ANALYTICS
• Call suspension
• Acct. suspension
• Emails
Destination
Location
IP spoofing alerts
Customer call profile: duration by day of week (Mo, Tue, Wed, Thu, Fri, Sat, Sun)
① LA ② Nairobi ③ NY ④ …..
① LA ② SF ③ NY ④ ….
① LA ② Detroit
① LA ② LA1
Case study: Cybersecurity
InfoArmor
¤ Founded by Washington Mutual to protect 10M credit card holders
¤ Growing at triple digit rates
¤ Engaged, satisfied subscribers
NEEDS
¤ Decision engine
¤ Consume agnostic data sources
¤ Scalable
¤ Real-time
Case study: Cybersecurity
A growing market
¤ No longer an unorganized hacker world
¤ Innovation and technology
¤ Global economy
¤ Political support
$207 Billion
Entrepreneur.com
In 2012, U.S. Navy databases were hacked and 200,000 sailors’ information was put at risk.
Cyber Attacks | DAMAGES
¤ 12.6 million Americans were ID theft victims last year
¤ 608,271,950 records (and growing) have been compromised due to security breaches since 2005
¤ 94% of healthcare organizations surveyed had at least one data breach in the past 2 years
¤ 1 in 4 data breach notification recipients became a victim of identity fraud
¤ You are 5 times more likely to be a fraud victim if your Social Security Number has been compromised in a data breach
INTERNET SURVEILLANCE
InfoArmor Internet Surveillance uses bots to continuously monitor the Underground Economy to uncover compromised, sensitive information. Whether it is personal identifying data or a medical insurance card, Internet Surveillance uncovers breached data and alerts in real time.
What We Monitor:
¤ Malicious Command & Control Networks
¤ Black Market Forums
¤ Phishing Networks
¤ Exploited Websites
¤ Known Compromised Machines & Servers
What is the Underground Economy?
An ever-evolving complex of compromised machines, networks and web services identified by InfoArmor and leading cyber security firms.
INTERNET SURVEILLANCE
How We Monitor:
¤ Proprietary hardware and software solution
¤ Unparalleled alert accuracy (minimized false positives)
¤ Secure: separate reconnaissance and analysis efforts, plus no refined search queries
What We Monitor:
¤ Credentials, SSNs, names, addresses, emails and DOBs
¤ Wallet items (e.g. credit cards, medical insurance cards)
INFOARMOR BOTS monitor UNDERGROUND ECONOMY
COMPROMISED DATA sent back to INFOARMOR
SENSOR compares compromised to subscriber data in secure environment, creating ALERTS with 100% accuracy
Case study: Streaming analytics
SQLstream BENEFITS
¤ Ability to adapt to many data sources
¤ Real Time analysis and alerting
¤ Offset database load
¤ Data Hygiene prior to data warehousing
RESULTS
¤ Real-time actionable alerts
¤ Unity in Ingress Data points
¤ Dual Purpose solution
  • Helps Compliance
¤ Plans to expand engagement
offline / online
Damian Black
Email | [email protected]
Website | www.sqlstream.com
DOWNLOADS | http://www.sqlstream.com/downloads/
John L Myers, Research Director, Enterprise Management Associates, [email protected]
Importance of Speed of Response in Big Data
© 2012 Enterprise Management Associates, Inc.
Speaker
John Myers joined Enterprise Management Associates in 2011 as senior analyst of the business intelligence (BI) practice area. John has 10+ years of experience working in areas related to business analytics in professional services consulting and product development roles, as well as helping organizations solve their business analytics problems, whether they relate to operational platforms, such as customer care or billing, or applied analytical applications, such as revenue assurance or fraud management.
John L Myers, Research Director, Enterprise Management Associates
JohnLMyers44
© 2013 Enterprise Management Associates, Inc.
Disruptive Forces in Data Management: Changing the Speed of Business
Use Cases met with Big Data Implementations
• Speed of processing response
• Combining data by structure
• Pre-processing data
• Utilization of streaming data
• Staging structured data
• Online archiving
Rogers, Myers and Devlin, "Big Data: Operationalizing the Buzz", Enterprise Management, http://research.enterprisemanagement.com/big-data-2013-webinar-nl.html
Top 5 Business Challenges Met with Big Data Projects
• Risk management: Fraud Analysis, Liquidity Risk Assessment
• Ad-hoc operational queries: Customer Relations Management
• Asset optimization: Staff Scheduling, Logistical Asset Planning
• Operational event and policy processing: Billing, Rating
• Campaign optimization: Market Basket Analysis, Cross-sell/Up-sell Recommendation
• Clustering, social graph analysis: Grouping and Relationship Analysis, Geographic Optimization
Rogers, Myers and Devlin, "Big Data: Operationalizing the Buzz", Enterprise Management, http://research.enterprisemanagement.com/big-data-2013-webinar-nl.html
Building the Bridge between Operational Processes and Analytical Results
Hybrid Data Ecosystem 2013: From Requirements to Consumers
Questions
• This version of “streaming analytics” sounds a lot like “complex event processing.” How does SQLstream differentiate from those solutions?
• The open source community, such as Apache Hadoop, has been coming up with solutions to problems like streaming. What advantages does a proprietary solution like SQLstream have over these solutions?
• “Streaming analytics” appears to be well suited for the upcoming trends in the “location based services” in mobile telecom and “telematics” in automotive. Which use cases appear to have the best chances of success? Marketing activities such as “location coupons?” Operational optimization such as “managed highways?”
Questions
• What are the best types of datasets to be used in the world of “streaming analytics?” Structured big data or large volumes of single row event data (e.g., log information)? Formatted multi-row event data (e.g., JSON)?
• What types of datasets should be avoided?
• What types of analytical techniques are best used with “streaming analytics?” Advanced analytical models associated with predictive or clustering algorithms? Rules-based, policy techniques (i.e., decision trees)? Simple descriptive analytics?
• What types of analytics techniques should be avoided?
Upcoming Topics
www.insideanalysis.com
2014 Editorial Calendar at www.insideanalysis.com/webcasts/the-briefing-room
This Month: ANALYTICS
February: BIG DATA
March: CLOUD