Data, Data, Data,……. Nov 05, 2012

Big data trends_problems_v2


DESCRIPTION

Big Data is going to explode, from 5 exabytes in 2010-11 to 50 zettabytes in 2020. What will enable this growth? Which data sources will contribute to it? What problems do we need to solve to make it possible?

Citation preview

Page 1: Big data trends_problems_v2

Data, Data, Data,…….

Nov 05, 2012

Page 2: Big data trends_problems_v2

Size of all Internet data, 2011

Size of all Internet data, 2020

Page 3: Big data trends_problems_v2

Size of all Internet data, 2011

Size of all Internet data, 2020

70% Packaged Goods Media

Page 4: Big data trends_problems_v2

90% UGC/Sensor

a future view……

Page 5: Big data trends_problems_v2

User Generated Content (UGC)

Source - DOMO

Page 6: Big data trends_problems_v2

what will enable growth of user generated content?

Page 7: Big data trends_problems_v2

some enabling technologies…….

Page 8: Big data trends_problems_v2

• network bandwidth
• cheap storage
• cheap compute power
• user friendly devices

Page 9: Big data trends_problems_v2

network bandwidth

• software defined network
• Google Fiber
• innovations related to:-
  • switches
  • routers
  • packet size
  • compression

Page 10: Big data trends_problems_v2

cheap storage

• cheap storage - a forcing function
• storage companies provide free storage
• in return, they have access to user data
• raw data is turned into boutique data
• sold at a premium to interested companies and advertisers

Page 11: Big data trends_problems_v2

cheap compute power

• innovations on rack space
• cheap, bare-metal hardware
• lowers TCO of servers
• operational tasks become easier
• allows companies to offer cloud

Page 12: Big data trends_problems_v2

user friendly devices

• button-free
• WYSIWYS(tore)
• connectivity – most important and a “given”
• tendency to track family

Page 13: Big data trends_problems_v2

some data sources…..

Page 14: Big data trends_problems_v2

reality shows

Page 15: Big data trends_problems_v2

sensor data…

• 10 TB of data per engine per 30 minutes
• 6-hour flight from NY to LA on a twin-engine 737 = 240 TB of data per flight
• 28,537 airliners in US skies per day
• 6.5 exabytes per day (6688 petabytes per day)
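The arithmetic on this slide can be checked with a short sketch. It assumes binary units (1 PB = 1024 TB, 1 EB = 1024 PB), which is the convention that reproduces the 6688 PB figure, and it treats every one of the 28,537 daily airliners as flying the same 6-hour, twin-engine profile. The variable names are illustrative and do not come from the deck.

```python
# Figures taken from the slide; names are illustrative.
TB_PER_ENGINE_PER_HALF_HOUR = 10
ENGINES = 2
FLIGHT_HOURS = 6
FLIGHTS_PER_DAY = 28_537

half_hours = FLIGHT_HOURS * 2
tb_per_flight = TB_PER_ENGINE_PER_HALF_HOUR * ENGINES * half_hours
tb_per_day = tb_per_flight * FLIGHTS_PER_DAY

# Assumption: binary unit steps (1024), which match the slide's 6688 PB.
pb_per_day = tb_per_day / 1024
eb_per_day = pb_per_day / 1024

print(f"{tb_per_flight} TB per flight")
print(f"{pb_per_day:.0f} PB per day")
print(f"{eb_per_day:.1f} EB per day")
```

With decimal units (steps of 1000) the same inputs give roughly 6849 PB per day, so the slide's totals depend on which convention is chosen.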

Page 16: Big data trends_problems_v2

“... within the next five years, sensor data will hit the crossover point with unstructured data generated by social media. From there, the sensor data will dominate by factors 10-to-20 times that of social media ...” - Stephen Brobst, CTO, Teradata

Page 17: Big data trends_problems_v2

online games

Pic – coolarcade.org

• ~225 million seventh-generation game consoles sold worldwide by early 2012
• ~700 million Wii games
• 425 million PlayStation 3 games
• 600 million Xbox 360 games

Page 18: Big data trends_problems_v2

GPS data

Innovations in Transportation Applications

Multiple sources:

• computers embedded in vehicle
• in-vehicle navigation systems
• drivers’ cell phones
• communication networks
• third-party data like weather
• traffic

Pic – www.bmwusa.com

Page 19: Big data trends_problems_v2

intelligent roads (INTRO*)

• roads with sensors
• determine traffic patterns
• sustainable ways to route traffic
• generate data for:-
  • law enforcement
  • transportation
  • insurance companies
  • medical agencies

* INTRO – INTelligent ROads – a project of the European Commission

Page 20: Big data trends_problems_v2

mobile devices of tomorrow……

Page 21: Big data trends_problems_v2

user generated content

• curated content
• mashed content (Pinterest-like)
• blogs
• videos (own shows, personal videos, etc.)
• pics
• collaboration – emails/IMs/“Likes” etc.
• microblogs (Twitter-like)

Page 22: Big data trends_problems_v2
Page 23: Big data trends_problems_v2

another perspective………

Page 24: Big data trends_problems_v2

How much is a ZB, anyway?
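As a rough sense of scale, here is a minimal sketch assuming decimal (SI) units, where each step up is a factor of 1000. The growth figures are the 5 exabytes and 50 zettabytes quoted in the deck's description; the names used are illustrative.

```python
# SI (decimal) byte units; binary units (powers of 1024) would give slightly larger values.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB"]
BYTES = {name: 1000 ** i for i, name in enumerate(UNITS)}

zb_in_gb = BYTES["ZB"] // BYTES["GB"]   # gigabytes in one zettabyte
zb_in_eb = BYTES["ZB"] // BYTES["EB"]   # exabytes in one zettabyte

# Growth implied by the deck's description: 5 EB (2010-11) to 50 ZB (2020).
growth = (50 * BYTES["ZB"]) // (5 * BYTES["EB"])

print(f"1 ZB = {zb_in_gb:,} GB = {zb_in_eb:,} EB")
print(f"5 EB -> 50 ZB is a {growth:,}x increase")
```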

Page 25: Big data trends_problems_v2

all this leads to…..

Pic source – bigdatabytes.com

Page 26: Big data trends_problems_v2

BIG DATA

Page 27: Big data trends_problems_v2

big data characteristics…

3 V’s*

• volume

• velocity

• variety

* coined by Doug Laney of Gartner Inc

Page 28: Big data trends_problems_v2

big data problems…

3 I’s

• immediate – do something now!!

• intimidating – what if you don’t?

• ill-defined – what is it, anyway?

- Vance Loiselle, CEO, Sumo Logic

Page 29: Big data trends_problems_v2

big data skills……..

Page 30: Big data trends_problems_v2

analytics – no longer an afterthought….

Page 31: Big data trends_problems_v2

analytic.NEXT

• near real time
• new data sources
• mobile
• immediately actionable
• big
• agile
• core of business

Page 32: Big data trends_problems_v2

impact on us…

• data scientists lead the “Data Orchestra”
• developers/product mgrs/DBAs/Ops will merge
• Data Techs will emerge
• “behavior”, “intent” and “thought” targeting
• hourly trends will be considered “Jurassic” old

Page 33: Big data trends_problems_v2

problems….

Page 34: Big data trends_problems_v2

storage…..

• store Exabytes (Petabytes)
• huge compression ratio (80% compression)
• cheap storage (~ 10 cents/GB/month)
• MTTF rate (High failure 8%)
• distributed storage
• storage over software defined networking
• read compressed data
• ETL
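To see why compression and price per gigabyte both matter at this scale, here is a back-of-the-envelope sketch. It assumes decimal units (1 EB = 10^9 GB) and reads "80% compression" as the stored size being 20% of the raw size; both are interpretations, not statements from the slide. The function name is hypothetical.

```python
GB_PER_EB = 1000 ** 3            # decimal units: 1 EB = 10^9 GB
COST_CENTS_PER_GB_MONTH = 10     # "~10 cents/GB/month" from the slide
COMPRESSION_SAVING_PCT = 80      # assumed meaning: stored size is 20% of raw size

def monthly_cost_usd(raw_eb, saving_pct=0):
    """Monthly cost in dollars of storing raw_eb exabytes after compression."""
    stored_gb = raw_eb * GB_PER_EB * (100 - saving_pct) // 100
    return stored_gb * COST_CENTS_PER_GB_MONTH // 100

raw = monthly_cost_usd(1)
compressed = monthly_cost_usd(1, COMPRESSION_SAVING_PCT)
print(f"1 EB uncompressed: ${raw:,} per month")
print(f"1 EB at 80% compression: ${compressed:,} per month")
```

Under these assumptions one exabyte costs about $100 million per month uncompressed and about $20 million per month compressed, before accounting for replication or hardware failures.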

Page 35: Big data trends_problems_v2

servers…..

• servers and storage merge?
• special CPUs to handle compression?
• encryption?
• better CPU
• bus speed

Page 36: Big data trends_problems_v2

analytics

• understand data
• analytical skills
• discover new ways of looking at data
• new containers for data warehouses, including data warehouses on cloud
• backup and recovery (should not be an issue)

Page 37: Big data trends_problems_v2