8/9/2019 Big Data Cis
1/15
Big Data Analytics
Shreekant Kadam
XMBA - 58
What are we going to understand?
What is Big Data?
Why did we land up here?
To whom does it matter?
Are we ready to handle it?
What are the concerns?
Tools and technologies
Simple to start
What is the maximum file size you have dealt with so far?
Movies/files/streaming video that you have used?
What have you observed?
What is the maximum download speed you get?
Simple computation: how much time does it take just to transfer?
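The simple computation above can be sketched in a few lines. This is only an illustration, assuming a decimal terabyte (10^12 bytes) and a sustained link speed given in megabits per second; the function name and the example figures (1 TB over a 100 Mbit/s link) are hypothetical and do not come from the slides.

```python
def transfer_time_seconds(size_bytes: int, speed_mbit_per_s: float) -> float:
    """Return the seconds needed to move size_bytes over a link of the given speed."""
    if speed_mbit_per_s <= 0:
        raise ValueError("speed must be positive")
    bits = size_bytes * 8                          # 8 bits per byte
    return bits / (speed_mbit_per_s * 1_000_000)   # Mbit/s -> bit/s


ONE_TERABYTE = 10**12  # assumed decimal terabyte
seconds = transfer_time_seconds(ONE_TERABYTE, 100)
print(f"{seconds:.0f} s, about {seconds / 3600:.1f} hours")
```

Under these assumptions, moving a single terabyte takes most of a day before any processing has started, which is the point of the exercise.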
What is big data?
“Every day, we create 2.5 quintillion bytes of data, so much that 90% of the data in the world today has been created in the last two years alone. This data comes from everywhere: sensors used to gather climate information, posts to social media sites, digital pictures and videos, purchase transaction records, and cell phone GPS signals, to name a few. This data is ‘big data.’”
Huge amount of data
There are huge volumes of data in the world:
+ From the beginning of recorded time until 2003, we created 5 billion gigabytes (5 exabytes) of data.
+ In 2011, the same amount was created every two days.
+ In 2013, the same amount of data is created every 10 minutes.
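To make the acceleration concrete, the 2011 and 2013 figures above can be turned into bytes-per-second rates. This is a rough sketch, assuming 5 billion gigabytes means 5 × 10^18 bytes (decimal units); the pre-2003 figure is left out because "recorded time" has no defined length. All names are illustrative.

```python
FIVE_EXABYTES = 5 * 10**18  # 5 billion gigabytes, decimal units assumed


def bytes_per_second(total_bytes: int, period_seconds: int) -> float:
    """Average creation rate when total_bytes are produced every period_seconds."""
    return total_bytes / period_seconds


rate_2011 = bytes_per_second(FIVE_EXABYTES, 2 * 24 * 3600)  # every two days
rate_2013 = bytes_per_second(FIVE_EXABYTES, 10 * 60)        # every 10 minutes
speedup = rate_2013 / rate_2011

print(f"2011: {rate_2011:.2e} B/s")
print(f"2013: {rate_2013:.2e} B/s")
print(f"ratio: {speedup:.0f}x")
```

Two days contain 2,880 minutes, so going from "every two days" to "every 10 minutes" is a 288-fold increase in the rate of data creation.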
Big data spans three dimensions: Volume, Velocity and Variety
Volume: Enterprises are awash with ever-growing data of all types, easily amassing terabytes, even petabytes, of information.
+ Turn 12 terabytes of Tweets created each day into improved product sentiment analysis
+ Convert 350 billion annual meter readings to better predict power consumption
Velocity: Sometimes 2 minutes is too late. For time-sensitive processes such as catching fraud, big data must be used as it streams into your enterprise in order to maximize its value.
+ Scrutinize 5 million trade events created each day to identify potential fraud
+ Analyze 500 million daily call detail records in real time to predict customer churn faster
+ The latest I have heard is that a delay of 10 nanoseconds is too much.
Variety: Big data is any type of data: structured and unstructured data such as text, sensor data, audio, video, click streams, log files and more. New insights are found when analyzing these data types together.
+ Monitor 100s of live video feeds from surveillance cameras to target points of interest
+ Exploit the 80% data growth in images, video and documents to improve customer satisfaction
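The Velocity point, that data must be used as it streams in, can be illustrated with a small sliding-window check. This is a hypothetical sketch, not a fraud-detection method from the slides: the function name `flag_bursts`, the 60-second window, the limit of 3 events, and the sample stream are all assumptions made for the example.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Iterable, Iterator, Tuple


def flag_bursts(
    events: Iterable[Tuple[float, str]],
    window_s: float = 60.0,
    limit: int = 3,
) -> Iterator[Tuple[float, str]]:
    """Yield (timestamp, account) each time an account has more than `limit`
    events inside the last `window_s` seconds.

    Events are assumed to arrive ordered by timestamp, one at a time,
    so the check runs while the data streams in rather than afterwards.
    """
    recent: Dict[str, Deque[float]] = defaultdict(deque)
    for ts, account in events:
        window = recent[account]
        window.append(ts)
        # Drop timestamps that have fallen out of the window.
        while window and window[0] <= ts - window_s:
            window.popleft()
        if len(window) > limit:
            yield ts, account


# Hypothetical stream of (seconds, account id) pairs.
stream = [(0, "A"), (5, "B"), (10, "A"), (20, "A"), (30, "A"), (200, "A")]
alerts = list(flag_bursts(stream))
print(alerts)
```

Because only the timestamps inside the current window are kept per account, memory stays bounded even when the stream itself never ends.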
Finally...
“Big-Data” is similar to “Small-data”, but bigger
... but having bigger data requires different approaches:
+ Techniques, tools, architecture
... with an aim to solve new problems
+ Or old problems in a better way
To whom does it matter?
Research community
Business community: new tools, new capabilities, new infrastructure, new business models, etc.
In sectors:
Financial Services...
The Social Layer in an Instrumented Interconnected World
2+ billion people on the Web by end 2011
30 billion RFID tags today (1.3B in 2005)
4.6 billion camera phones worldwide
100s of millions of GPS-enabled devices sold annually
76 million smart meters in 2009, 200M by 2014
12+ TBs of tweet data every day
25+ TBs of log data every day
? TBs of data every day
What does Big Data trigger?
From “Big Data and the Web: Algorithms for Data Intensive Scalable Computing”, Ph.D. Thesis, Gianmarco
Types of tools typically used in Big Data scenarios
Where is the processing hosted?
+ Distributed servers/cloud
Where is the data stored?
+ Distributed storage (e.g., Amazon S3)
What is the programming model?
+ Distributed processing (MapReduce)
How is the data stored and indexed?
+ High-performance schema-free databases
What operations are performed on the data?
+ Analytic/semantic processing (e.g., RDF/OWL)
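The MapReduce programming model named above can be sketched on a single machine with the classic word-count example. This is a toy illustration of the map, shuffle and reduce steps in plain Python, not the API of Hadoop or any other framework; the function names and the sample documents are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(document: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["big data is big", "data is everywhere"])
print(counts)
```

In a real distributed setting the map and reduce calls would run on different machines and the shuffle would move data over the network; here everything runs in one process to keep the structure visible.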
When dealing with Big Data is hard
When the operations on data are complex:
+ E.g., simple counting is not a complex problem.
+ Modeling and reasoning with data of different kinds can get extremely complex.
Good news with big data:
+ Often, because of the vast amount of data, modeling techniques can get simpler (e.g., smart counting can replace complex model-based analytics), as long as we deal with the scale.
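One way to picture "smart counting" replacing a complex model is next-word prediction from plain bigram counts. This is a hypothetical sketch chosen for illustration, not a technique named in the slides; the function names and the tiny corpus are assumptions, and with so little data the predictions are only a demonstration of the mechanics.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Optional


def train_bigram_counts(sentences: Iterable[str]) -> Dict[str, Counter]:
    """Count, for every word, which words follow it and how often."""
    following: Dict[str, Counter] = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following


def predict_next(following: Dict[str, Counter], word: str) -> Optional[str]:
    """Predict the most frequently observed follower of `word`, if any."""
    candidates = following.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


corpus = [
    "big data is big",
    "big data needs new tools",
    "small data is simple",
]
model = train_bigram_counts(corpus)
print(predict_next(model, "big"))
print(predict_next(model, "data"))
```

The "model" here is nothing more than a table of counts. The idea in the slide is that, given enough data, such tables can do work that would otherwise need elaborate statistical modeling, provided the counting itself can be done at scale.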
Time for thinking
Why Big-Data?
Key enablers for the appearance and growth of “Big-Data” are:
+ Increase in storage capabilities
+ Increase in processing power
+ Availability of data