Big Data & the importance of Data Science

18 December 2014

@wimvanleuven wim@bigboards.io

http://www.slideshare.net/kuonen/big-tent-bddsunigenov2014

–Edd Dumbill

“Big data is data that exceeds the processing capacity of conventional database systems.

The data is too big, moves too fast, or doesn’t fit the strictures of your database architectures.”

http://radar.oreilly.com/2012/01/what-is-big-data.html

What is Big Data?

The 3 V’s of Big Data

• Volume

• Velocity

• Variety

• (Veracity)

… too big …

… moves too fast …

… doesn’t fit …

… what?

New tools and technologies to store and process all data on a cluster of commodity hardware so that the system acts as one, is resilient and scales linearly.

What is Big Data? — revisited

So what?

The data lake is a large data pool in which the schema and data requirements are not defined until the data is queried, processed, analysed or delivered as information to the end-user.
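The "schema on read" idea in the data lake definition above can be illustrated with a minimal sketch. Everything in it is hypothetical and chosen for illustration only: the sample events, the field names and the helper read_with_schema are assumptions, not part of the original talk or of any particular Big Data product. Raw records are stored untouched, and a schema is applied only when a question is asked.

```python
import json

# Hypothetical raw events, stored exactly as they arrived.
# No schema is enforced at write time, so records differ in shape.
RAW_EVENTS = [
    '{"user": "alice", "amount": "12.50", "ts": "2014-12-18T10:00:00"}',
    '{"user": "bob", "amount": "7", "channel": "web"}',
    '{"user": "alice", "clicks": 3}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema at query time.

    Keeps only the records that contain every field named in the schema
    and casts each of those fields to the requested type.
    """
    for line in raw_lines:
        record = json.loads(line)
        if all(field in record for field in schema):
            yield {field: cast(record[field]) for field, cast in schema.items()}


# The schema is decided by the question being asked, not by the storage layer.
purchase_schema = {"user": str, "amount": float}
purchases = list(read_with_schema(RAW_EVENTS, purchase_schema))

# Aggregate the purchase amounts per user.
total_by_user = {}
for purchase in purchases:
    user = purchase["user"]
    total_by_user[user] = total_by_user.get(user, 0.0) + purchase["amount"]

print(total_by_user)
```

A different question (for example, click counts) would reuse the same raw events with a different schema, which is the point of deferring the schema until read time.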

–???

“We don’t do Hadoop because we have Big Data; we do Big Data because we have Hadoop.”

So what?

–Matt Ehrlichman

“In the years ahead, the same power that big data awards enterprise companies will be the norm for small business.”

So what?

http://blogs.wsj.com/accelerators/2014/10/31/matt-ehrlichman-big-data-for-small-firms/

What does Big Data enable?

• Combine data from within and without your organisation

• Build new products and services

• Analyse all data (e.g. 5 TB of historical event data at rest in an Oracle database)

Big Data is no panacea

• First decide what problem you want to solve; pick a real business problem to add immediate value

• Start small: the technology is made for linear scalability (a 3-node cluster is a cluster!)

• Then become lean: learn through experimentation

Big Data challenges

• Beware of hype, Big Data washing and fads

• Tech infancy

• IT | Biz

• Data is hard

• Lack of skills! (Shameless self-plug: BigBoards!)

Big Data opportunity

• Big Data is here to stay

• The vendor market is HUGE and will grow massively as Big Data blends into the datacenter

• However, the Practitioner market can deliver EXPONENTIALLY more value

It is time to band together and build these systems that deliver this kind of value

for fun

for profit

for good

for Belgium?

Call for Action

https://www.ted.com/talks/susan_etlinger_what_do_we_do_with_all_this_big_data

“Data doesn't create meaning. We do.”

–Susan Etlinger

Data Science FTW
