1 © Copyright 2015 EMC Corporation. All rights reserved.
LEVERAGING DATA LAKES TO MAXIMIZE IOT VALUE IOT: WHERE DOES ALL THE DATA GO?
• I'm an Engineer
– I like to solve problems by building things
– BS in Computer Science
• 12 years industry experience
– 12 years writing code for money
– 8 years developing Data Lake storage systems
• Expertise
– Storage
– Network Protocols
– Computer Security
MY BACKGROUND
1.8 trillion gigabytes of data was created in 2011*
• More than 90% is unstructured data
• Quantity doubles every 2 years
* Source: IDC 2011
[Chart: GB of data (in billions), axis from 0 to 10,000, for 2005, 2010, and 2015; series: structured data and unstructured data]
BIG DATA IS GETTING BIGGER
©2014 Cloudera, Inc. All rights reserved.
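The doubling claim on this slide can be turned into a rough projection. The sketch below assumes the 2011 figure of 1.8 trillion gigabytes (1.8 zettabytes in decimal units) as the baseline and assumes the two-year doubling rate holds steadily; the function name and its parameters are illustrative, not from the deck.

```python
def projected_volume_zb(year, base_year=2011, base_zb=1.8, doubling_years=2.0):
    """Projected data volume in zettabytes, assuming steady doubling.

    base_zb: 1.8 trillion GB created in 2011 (per the slide) = 1.8 ZB.
    doubling_years: the slide's "doubles every 2 years" (assumed constant).
    """
    return base_zb * 2 ** ((year - base_year) / doubling_years)


# Print the projection for a few years under the stated assumptions.
for year in (2011, 2013, 2015, 2020):
    print(year, round(projected_volume_zb(year), 1), "ZB")
```

Under these assumptions the volume quadruples between 2011 and 2015, which is why capacity planning based on today's data size goes stale quickly.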
THE INTERNET OF THINGS IS EXPLODING
The impact of the IoT is already visible in the digital universe. Data just from embedded systems – the sensors and systems that monitor the physical universe – already accounts for 2% of the digital universe. By 2020 that will rise to 10%.
IOT CREATING NEW OPPORTUNITIES FOR BUSINESSES
1. IoT produces a lot of data
2. Which will continue to grow exponentially
3. Which comes from lots of things in lots of forms
4. And has tremendous value, but it must be analyzed
5. And the full value is not known up front
ASSUMPTIONS
Where does all the data go?
WHAT IS A DATA LAKE?
“A single scalable repository, storing high fidelity data in its native format, that can be arbitrarily queried.”
• Scalable is more important than big or small
– You have some data today; you will need more capacity in the future
• Migrating data is terrible!
– Wasted time and effort
• There will always be upper limits
– But limits should be infinite-in-practice for your workflow
– Capacity should be a matter of budget, not of capability
CAPACITY SINGLE SCALABLE REPOSITORY
• IoT means lots of devices, from different vendors
– Multiple data sources
– Multiple data formats
• Different operating systems
– Linux, Windows, iOS, Android, QNX, VxWorks, custom
INGEST HIGH FIDELITY NATIVE FORMAT
• Different file access protocols
– SMB, NFS, FTP: files and directory trees
– Object, HTTP, REST: buckets, containers, and objects
• Authentication
– Local users
– Active Directory
– LDAP
INGEST HIGH FIDELITY NATIVE FORMAT
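One way to picture the two access styles on this slide is as two names for the same stored data: a directory path for file protocols, and a bucket plus key for object protocols. The sketch below is only an illustration of that idea. The `/lake` root and both helper functions are hypothetical, and the mapping is an assumption, not the layout of any particular product.

```python
from pathlib import PurePosixPath

# Hypothetical root of the shared namespace (an assumption for illustration).
LAKE_ROOT = PurePosixPath("/lake")


def object_to_file_path(bucket, key):
    """Map an object-style (bucket, key) pair to a file-style path."""
    return LAKE_ROOT / bucket / PurePosixPath(key)


def file_path_to_object(path):
    """Map a file-style path under LAKE_ROOT back to (bucket, key)."""
    relative = PurePosixPath(path).relative_to(LAKE_ROOT)
    bucket, *rest = relative.parts
    return bucket, "/".join(rest)


path = object_to_file_path("sensors", "2015/01/t1.json")
print(path)
print(file_path_to_object(path))
```

With a mapping like this, a device that uploads over HTTP and an analyst who browses an NFS or SMB share would be looking at the same bytes, with no copy or conversion step in between.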
Use a Database
1. Build a database with a rigid schema
2. Build an application to write data to that schema
3. Run queries
ANALYSIS (TRADITIONAL) ARBITRARILY QUERIED
Problems
• Tight coordination needed between all actors
• Full understanding of your goals needed up front
• Limited data fidelity
– Very structured, but not very broad
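A minimal sketch of the traditional flow and its fidelity problem, using Python's built-in `sqlite3` module. The table, column names, and sample device message are hypothetical. The point is only that fields the schema did not anticipate are dropped at write time and cannot be queried later.

```python
import sqlite3

# 1. Build a database with a rigid schema (hypothetical table and columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings (device_id TEXT NOT NULL, temp_c REAL NOT NULL)"
)

# A device message that carries more than the schema anticipated.
message = {"device_id": "thermo-1", "temp_c": 21.5, "humidity": 40, "fw": "1.2"}

# 2. The application writes only the fields the schema knows about;
#    "humidity" and "fw" are discarded here.
conn.execute(
    "INSERT INTO readings (device_id, temp_c) VALUES (?, ?)",
    (message["device_id"], message["temp_c"]),
)

# 3. Run queries: anticipated questions work...
rows = conn.execute("SELECT device_id, temp_c FROM readings").fetchall()
print(rows)

# ...but a question about a field that was never stored cannot be answered.
try:
    conn.execute("SELECT humidity FROM readings")
    humidity_queryable = True
except sqlite3.OperationalError:
    humidity_queryable = False
print("humidity queryable:", humidity_queryable)
```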
Hadoop
• THE way to do Big Data Analytics
• Parallel processing of multiple data sets / formats
• Define schema as the data is queried
• Analyze anything, across all your data
ANALYSIS (NEXT GENERATION) ARBITRARILY QUERIED
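The idea of defining schema as the data is queried can be sketched without Hadoop. The example below is plain Python standing in for that idea, not Hadoop code. The file names, sample records, and the `query` helper are all hypothetical. Raw data stays in its native format (JSON lines and CSV here), and each query supplies its own schema at read time.

```python
import csv
import io
import json

# Raw data kept in native formats (hypothetical sample content).
RAW_FILES = {
    "thermostat.jsonl": (
        '{"device": "t1", "temp_c": 21.5, "humidity": 40}\n'
        '{"device": "t2", "temp_c": 19.0}\n'
    ),
    "camera.csv": "device,temp_c,frames\ncam1,35.2,900\n",
}


def read_records(name, text):
    """Yield dict records from one raw file, chosen by its extension."""
    if name.endswith(".jsonl"):
        for line in text.splitlines():
            if line.strip():
                yield json.loads(line)
    elif name.endswith(".csv"):
        yield from csv.DictReader(io.StringIO(text))


def query(schema):
    """Apply a schema (field name -> type cast) at read time, across all files.

    Records that lack any requested field are skipped.
    """
    for name, text in RAW_FILES.items():
        for record in read_records(name, text):
            if all(field in record for field in schema):
                yield {field: cast(record[field]) for field, cast in schema.items()}


# Two different questions over the same untouched raw data.
temps = list(query({"device": str, "temp_c": float}))
humid = list(query({"device": str, "humidity": int}))
print(temps)
print(humid)
```

Because nothing was discarded at ingest, the humidity question can be asked later even though nobody planned for it up front.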
WHAT IS A DATA LAKE?
“A single scalable repository, storing high fidelity data in its native format, that can be arbitrarily queried.”
EMC ISILON: SCALE-OUT NAS ARCHITECTURE
[Architecture diagram, top to bottom]
Client/Application Layer: Clients & Applications
Ethernet Layer: Gig-E / 10 Gig-E Network, Multi-Protocol
Protocols: SMB, NFS, FTP, HTTP, HDFS for Hadoop, REST for Object
RESTful API: GET, PUT, POST, DELETE
OneFS Operating Environment
Intra-cluster Communication
• Isilon scales from 16 TB to 50 PB in a single file system, single volume cluster
• Under 60 seconds to scale with no downtime
MORE SCALABLE THAN TRADITIONAL STORAGE SYSTEMS EMC ISILON: MASSIVELY SCALABLE
FILE
HPC
Backup/Archive
Analytics
Mobile
File Shares
IoT
EMC ISILON: INGEST
• Only scale-out storage platform with native Hadoop integration.
• In-place analytics
– Native integration speeds time to insight
• Certified
– Hortonworks commercial Hadoop vendor integration
• Consulting services
– Map IoT data into storage and devise Hadoop analytics jobs
EMC ISILON: ANALYSIS SCALE-OUT STORAGE WITH NATIVE HADOOP INTEGRATION
EMC Public Safety Data Lake
PUBLIC SAFETY DATA TRENDS NEED FOR SCALABLE DATA REPOSITORIES
Increased camera counts & longer video retention times
Body Camera Proliferation
Expanding City Wide Surveillance Systems
Evidence Management
Video Content Analysis has become essential
PUBLIC SAFETY SYSTEMS STORAGE CAPACITIES PER CAMERA RESOLUTION
Capacity (TB) for 100 cameras, continuous record @ 15 fps

Resolution   Avg. bandwidth   15 Days   30 Days   45 Days   60 Days
4CIF         1.5 Mbps         23.3      46.6      70        94
1080p        6 Mbps           97.2      190       290       390
3 MPixel     7.7 Mbps         120       240       360       480
5 MPixel     9.6 Mbps         160       320       480       640
10 MPixel    21 Mbps          350       700       1050      1400

[Chart: Camera Resolutions and Average Bandwidths; vertical axis 0 to 1500 TB]
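The capacities on this slide follow from simple arithmetic: bitrate times number of cameras times retention time. The sketch below assumes decimal units (1 Mbps = 1,000,000 bits per second, 1 TB = 10^12 bytes) and continuous recording at a constant average bitrate. The function name is illustrative. Under these assumptions the 1080p, 15-day case reproduces the slide's 97.2 TB exactly. The slide's other figures appear to be rounded or based on slightly different assumptions, so they will not all match to the digit.

```python
def capacity_tb(bitrate_mbps, cameras=100, days=15):
    """Storage needed, in decimal terabytes, for continuous recording.

    Assumes a constant average bitrate per camera and decimal units
    (1 Mbps = 1,000,000 bit/s, 1 TB = 1e12 bytes).
    """
    seconds = days * 24 * 60 * 60
    total_bits = bitrate_mbps * 1_000_000 * cameras * seconds
    return total_bits / 8 / 1e12


# 100 cameras at 1080p (6 Mbps average) for each retention period.
for days in (15, 30, 45, 60):
    print(days, "days:", round(capacity_tb(6, cameras=100, days=days), 1), "TB")
```

Capacity grows linearly in all three inputs, so doubling retention time or camera count doubles the storage needed, and moving from 4CIF to 10 MPixel multiplies it by roughly the ratio of the bitrates.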
Public Safety Data Lake
Body Cameras
CCTV
Drones
Satellite Images
License Plate Capture
Audio
In-Car Video
Internet of Things
Evidence
Pools of Data
Analytics
Security
Application Integration
Body Cameras
CCTV
Drones
Satellite Images
License Plate Capture
Audio
In-Car Video
Internet of Things
Evidence
Public Safety Data Lake
“We needed a scalable storage architecture to support the CitySafe project, and the single point of management and load balancing capabilities of Isilon made it a perfect fit for this project.”
Brisbane City Council
Australia’s largest council turns to EMC to protect employees and visitors to City Hall
Challenge
Store and make available evidence-quality video from cameras at Brisbane City Council's restored City Hall building
Solution
EMC Isilon NL Series
EMC VNXe
Results
Delivered 100% availability
Provided the council and police with high-resolution video and images which can be used in court
Standardized on a world-class enterprise storage platform with robust support
Applications
Genetec Security Centre
PAUL RISHMAN Corporate Security Manager
“The challenge for IT is making sure investigators can quickly get their video no matter how big the data stores have become. A solution that scales without losing performance is imperative, and Isilon has definitely met both those needs.”
Norman Oklahoma Police Department
Oklahoma police force fights crime with EMC Isilon and MediaSolv Evidence Management
Challenge
Growing use of video surveillance driving massive data growth
Legacy storage nearing maximum capacity
Solution
EMC Isilon X-series
EMC Isilon SmartQuotas
EMC Isilon SmartConnect
VMware vSphere
Results
Gained scalability and performance to pursue leading-edge video projects
Improved law enforcement with fast, reliable access to evidentiary video
Simplified control over storage usage across different video systems
Increased efficiency of managing fast-growing video assets
Applications
Genetec
MediaSolv
Microsoft SQL Server
KARI MADDEN Network Support Supervisor