BIG DATA VS DATA WAREHOUSING

A LOOK AT THE VALUE AND DIFFERENCES OF DATA WAREHOUSING AND BIG DATA

Tshegofatso Mogomotsi

The purpose of this presentation is to outline the value that Big data and Data warehousing can each contribute to a business, and to differentiate the two concepts and their benefits.

Tshegofatso Mogomotsi, 2016

Overview

What is Data warehousing, Big data, and Fast data
Big data tools
Use case
Summary of differences

Defining Data warehousing, Big data and Fast data in business

Data warehousing
Data warehouses are usually used to correlate broad business data from various data sources to provide greater insight into the performance of a business. Data warehouses differ from regular databases in that databases are optimized to maintain strict accuracy of data by rapidly updating real-time data. Unlike such operational databases, data warehouses are designed to give a long-range view of data over time and specialize in data gathering, which allows for further processing such as data mining (Informatica, 2016).

Big data
Big data is defined by data sets so large or complex that traditional data processing techniques and applications are inadequate to handle them. Challenges include analysis, storage, transfer, visualization, querying, updating, and information privacy. The term often refers simply to the use of predictive analytics, user behavior analytics, or certain other advanced data analytics methods that extract value from data.

Fast data
Big data grows through a constant stream of incoming data. John Hugg, a software architect, proposes that instead of simply storing that data to be analyzed later, perhaps we have reached the point where it can be analyzed as it is ingested while still maintaining extremely high intake rates. Big data is measured not only by volume but also by velocity, that is, volume over time. Velocity represents working data, immediate status, or data with ongoing purpose. The best way to capture the value of incoming data is to react to it the instant it arrives. If you process incoming data in batches, you have already lost time and, thus, some of the value of the active data.
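The difference between reacting on arrival and reacting per batch can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the event shape (timestamp in seconds, product id), the threshold rule, and the function names are hypothetical and do not come from the presentation or from any particular fast data product.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple

# Hypothetical event shape (an assumption): (arrival time in seconds, product id).
Event = Tuple[float, str]


def react_per_event(events: Iterable[Event], threshold: int) -> Iterator[Event]:
    """Fast-data style: update state on every arrival and react immediately."""
    counts = Counter()
    for ts, product in events:
        counts[product] += 1
        if counts[product] == threshold:
            # The reaction happens at the arrival time of the triggering event.
            yield ts, product


def react_per_batch(
    events: Iterable[Event], threshold: int, batch_seconds: float
) -> Iterator[Event]:
    """Batch style: same counts, but results only surface when a window closes."""
    counts = Counter()
    alerted = set()
    window_end = None

    def flush():
        # Products that crossed the threshold and have not been reported yet.
        ready = [p for p, n in counts.items() if n >= threshold and p not in alerted]
        alerted.update(ready)
        return ready

    for ts, product in events:
        if window_end is None:
            window_end = ts + batch_seconds
        # Close every window that ended before this event arrived.
        while ts >= window_end:
            for p in flush():
                yield window_end, p
            window_end += batch_seconds
        counts[product] += 1
    if window_end is not None:
        # Close the final, still-open window.
        for p in flush():
            yield window_end, p


# Hypothetical demo data: three views of product "run" within three seconds.
DEMO_EVENTS = [(0.0, "run"), (1.0, "run"), (2.0, "kid"), (3.0, "run"), (70.0, "kid")]
```

With `DEMO_EVENTS`, a threshold of 3 and a 60-second batch window, the per-event version reacts at second 3 while the batch version reports the same product only when the window closes at second 60. That gap is the lost time the slide refers to.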

Defining Data warehousing, Big data and Fast data in business

Deliver business value through the analysis of data

William H. Inmon described a data warehouse as a subject-oriented, integrated, time-variant collection of data that supports management's decision-making process.

Big data is technology capable of holding large amounts of data stored in an unstructured format. This data, when captured, manipulated, and analyzed, can help a corporation gain useful insight.

Fast data is the application of Big data analytics to smaller data sets in real time in order to solve a problem or create business value. The goal of fast data is to quickly gather and mine structured and unstructured data so that action can be taken.

Big data tools

Below is a view of some of the applications and tools used for Big data management and processing.

Data storage and management: Cloudera, MongoDB, Oracle Database (or the Oracle NoSQL Database)

Data cleaning tools: OpenRefine, DataCleaner

Data mining tools (predictive analysis): RapidMiner, IBM SPSS Modeler, Oracle Data Miner GUI

Data analytics: Oracle R, BigML

Data visualization: Tableau, Silk

Uses: Case study

Company ABC is a large South African shoe manufacturer that also operates retail stores across the African region. It manufactures various shoe types for the whole family. ABC's annual turnover for the 2015/16 financial year was 16.6 million.

The company is looking to increase its profit margin by 10 percent in the 2017/18 financial year, and to achieve this it recently invested in Big data infrastructure.

Uses: Case study

Big data
ABC recently recognized that an increasing amount of data is not captured in its operational databases, such as clickstream logs, social feeds, customer support emails, location data from mobile devices and chat transcripts. Big data systems harness these new sources of data and allow businesses to analyze and extract business value from these large data sets.

Example of how Big data systems can add value to ABC
Using Big data tools, the BI team identifies customers who are active on specific marathon websites, search for information related to marathons or running, and engage with social feeds related to marathons or running. The team then uses this data to predict that these customers may be running a marathon soon, and forwards running-shoe products and specials to them.
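The marathon example can be illustrated with a small Python sketch. It is a simple rule-based filter rather than a trained predictive model, and everything in it is an assumption made for illustration: the event fields (`customer_id`, `source`, `text`), the keyword list, and the rule that a customer needs running-related activity in at least two different sources.

```python
from collections import defaultdict

# Keywords taken from the wording of the case study; the matching rule is an assumption.
RUNNING_KEYWORDS = ("marathon", "running")


def interest_signals(events):
    """Map each customer to the set of sources showing running-related activity."""
    sources = defaultdict(set)
    for event in events:
        text = event["text"].lower()
        if any(keyword in text for keyword in RUNNING_KEYWORDS):
            sources[event["customer_id"]].add(event["source"])
    return sources


def likely_marathon_runners(events, min_sources=2):
    """Return customers with running-related activity in at least min_sources sources."""
    sources = interest_signals(events)
    return sorted(c for c, found in sources.items() if len(found) >= min_sources)


# Hypothetical events, standing in for clickstream logs, searches and social feeds.
DEMO_EVENTS = [
    {"customer_id": "c1", "source": "clickstream", "text": "Visited /marathon-training"},
    {"customer_id": "c1", "source": "search", "text": "best running shoes"},
    {"customer_id": "c2", "source": "social", "text": "Love my new sandals"},
    {"customer_id": "c2", "source": "search", "text": "Marathon dates 2017"},
    {"customer_id": "c3", "source": "clickstream", "text": "Visited /kids-shoes"},
]

# Customers the BI team might target with running-shoe specials.
TARGETS = likely_marathon_runners(DEMO_EVENTS)
```

Requiring agreement between several sources is one way to reduce false positives from a single stray search. A real deployment would more likely feed such signals into one of the predictive tools listed earlier.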

Uses: Case study

Data warehouse
ABC's data warehouse contains data from its company financial systems, its customer marketing systems, its billing systems, its point-of-sale systems, and so on. Traditionally, data warehouses source data solely from other databases. The need for a data warehouse often becomes evident when analytic requirements start to strain the ongoing performance of operational databases.

The data warehouse stores current and historical data and is used for creating analytical reports for knowledge workers throughout the company. Examples of reports could range from annual and quarterly comparisons and trends to detailed daily sales analysis.
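A quarterly comparison report of the kind described above can be sketched with Python's built-in `sqlite3` module. The `sales` table, its columns and the sample rows are hypothetical; ABC's actual warehouse schema is not given in the presentation, and a production warehouse would not run on an in-memory SQLite database.

```python
import sqlite3


def build_demo_warehouse():
    """Create an in-memory database with a hypothetical point-of-sale fact table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (sale_date TEXT, store TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            ("2015-01-15", "Johannesburg", 100.0),
            ("2015-02-20", "Gaborone", 50.0),
            ("2015-05-01", "Johannesburg", 75.0),
            ("2016-01-10", "Gaborone", 120.0),
        ],
    )
    return conn


def quarterly_sales(conn):
    """Total sales per year and quarter, ordered for year-on-year comparison."""
    query = """
        SELECT strftime('%Y', sale_date) AS year,
               (CAST(strftime('%m', sale_date) AS INTEGER) + 2) / 3 AS quarter,
               SUM(amount) AS total
        FROM sales
        GROUP BY year, quarter
        ORDER BY year, quarter
    """
    return conn.execute(query).fetchall()


REPORT = quarterly_sales(build_demo_warehouse())
```

Because the warehouse keeps historical rows alongside current ones, the same query compares the first quarter of 2015 with the first quarter of 2016 without touching the operational point-of-sale systems.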

The data warehouse provides the company with reliable, believable and accessible data that everyone in the company can rely on.

Even with a Big data initiative incorporated into ABC's business, the data warehouse, built upon a relational database, can continue to be the primary analytic database for storing much of the company's core transactional data: financial records, customer data, point-of-sale data and so forth.

Summary of differences

Big data: A Big data solution is a technology, a means to store and manage large amounts of data.
Data warehousing: Data warehousing is an architecture, a way of organizing data so that there is corporate credibility and integrity.

Big data: The scope of Big data extends beyond data found in the corporation (Web, sales, customer contact center, social media, mobile data).
Data warehousing: An enterprise's data warehouse contains data from its enterprise databases.

Big data: Big data applies an architecture that acquires data from multiple data sources, then organizes and stores that data in a suitable format for analysis.
Data warehousing: Data warehouses do not excel at handling raw, unstructured, or complex data.

Big data: Big data is measured by volume and velocity.
Data warehousing: A data warehouse is measured by volume.

Big data: If unlocked properly, Big data can yield valuable information that leads to better decisions, which in turn can lead to more revenue, more profitability and increased market share.
Data warehousing: A data warehouse provides a "single version of the truth" for decision making in the corporation. With a data warehouse there is an integrated, granular, historical single point of reference for data in the corporation.