Big Data: Its Characteristics And Architecture Capabilities


By Ashraf Uddin

South Asian University (http://ashrafsau.blogspot.in/)

What is Big Data?

Big data refers to large datasets that are challenging to store, search, share, visualize, and analyze.

“Big Data” is data whose scale, diversity, and complexity require new architecture, techniques, algorithms, and analytics to manage it and extract value and hidden knowledge from it…

The Model of Generating/Consuming Data has Changed

Old Model: a few companies generate data, and everyone else consumes it

New Model: all of us generate data, and all of us consume it

Do we really need Big Data?

For consumers:

• Better understanding of their own behavior
• Integration of activities
• Influence: involvement and recognition

For companies:

• Real behavior: what do people do, and what do they value?
• Faster interaction
• Better-targeted offers
• Customer understanding

Characteristics of Big Data

1. Volume (Scale)

2. Velocity (Speed)

3. Variety (Complexity)

Volume

Velocity

• Data is generated fast and needs to be processed fast

• Online Data Analytics

• Late decisions lead to missed opportunities (see the sketch below)
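To make "online data analytics" concrete, here is a minimal pure-Python sketch of the idea: a statistic is updated incrementally as each event arrives, so a decision can be made immediately rather than after a batch recomputation. The readings and the anomaly threshold are invented for illustration.

```python
def stream_of_readings():
    """Stand-in for a fast event source such as a sensor feed or clickstream."""
    for value in [12.0, 15.5, 11.2, 48.9, 13.1]:
        yield value

count, mean = 0, 0.0
for reading in stream_of_readings():
    count += 1
    mean += (reading - mean) / count        # incremental mean: O(1) per event
    if reading > 2 * mean:                  # act immediately, not after a batch
        print(f"event {count}: reading {reading} is anomalous vs mean {mean:.1f}")
```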

Variety

• Various formats, types, and structures

• Text, numerical, images, audio, video, sequences, time series, social media data, multi-dim arrays, etc…

• Static data vs. streaming data

• A single application can be generating/collecting many types of data

• To extract knowledge, all these types of data need to be linked together (illustrated in the sketch below)
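As a toy illustration of why linking matters, the sketch below joins structured profile data, free text, and a purchase time series on a shared customer key. All names, keys, and values are hypothetical.

```python
# Linking heterogeneous data about the same entity: text, numeric, and
# time-series data must share a key before they can be analyzed together.

profile   = {"c42": {"name": "A. Rahman", "state": "Delhi"}}        # structured
tweets    = {"c42": ["great service!", "app keeps crashing"]}       # text
purchases = {"c42": [("2013-01-05", 19.99), ("2013-02-11", 4.50)]}  # time series

def linked_view(customer_id):
    """Join the three sources on the common customer key."""
    return {
        "profile": profile.get(customer_id, {}),
        "tweets": tweets.get(customer_id, []),
        "purchases": purchases.get(customer_id, []),
    }

print(linked_view("c42"))
```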

Generation of Big Data

Scientific instruments (collecting all sorts of data)

Social media and networks (all of us are generating data)

Sensor technology and networks (measuring all kinds of data)

Why Is Big Data Different?

For example, an airline jet collects 10 terabytes of sensor data for every 30 minutes of flying time.

Compare that with conventional high-performance computing, where the New York Stock Exchange collects 1 terabyte of structured trading data per day.
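At that rate a jet produces 20 terabytes per hour, so a single ten-hour flight alone yields roughly 200 terabytes, about two hundred times the exchange's daily total.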

Conventional corporate structured data is sized in terabytes and petabytes; Big Data is sized in peta-, exa-, and soon, perhaps, zettabytes!


A unique characteristic of Big Data is the manner in which value is discovered. In conventional BI, simply summing a known value reveals a result. In Big Data, value is discovered through a refining modeling process:

make a hypothesis; create statistical, visual, or semantic models; validate; then make a new hypothesis.
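The loop can be sketched in a few lines of Python. The synthetic data, candidate thresholds, and accuracy measure below are all invented; only the shape of the process matters: each threshold is a hypothesis, scoring it against observations is the validation, and keeping the better model is the refinement.

```python
import random

# Synthetic observations: a numeric feature and a binary outcome.
random.seed(1)
data = [(random.gauss(50, 10), random.random() < 0.3) for _ in range(1000)]

best = None
for threshold in range(30, 70, 5):             # each threshold is a hypothesis
    hits = sum(1 for value, outcome in data if (value > threshold) == outcome)
    accuracy = hits / len(data)                # validate against observations
    if best is None or accuracy > best[1]:
        best = (threshold, accuracy)           # refine: keep the better model

print(f"best threshold {best[0]}, accuracy {best[1]:.2%}")
```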

Use cases for Big Data Analytics

A Big Data Use Case: Personalized Insurance Premium

An insurance company wants to offer lower premiums to those customers who are unlikely to make a claim, thereby optimizing its profits.

One way to approach this problem is to collect more detailed data about an individual's driving habits and then assess their risk.

The company can collect data on driving habits by utilizing sensors in its customers' cars to capture driving data such as routes driven, miles driven, time of day, and abruptness of braking.


This data is used to assess driver risk: individual driving patterns are compared with other statistical information, such as the average miles driven in the same state and the peak hours of drivers on the road.

Driver risk plus actuarial information is then correlated with policy and profile information to offer a competitive, more profitable rate for the company.

The result: a personalized insurance plan.

These unique capabilities, delivered from big data analytics, are revolutionizing the insurance industry.
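A toy version of such a scoring step, with invented field names, weights, baselines, and state average (a real actuarial model would be far richer):

```python
STATE_AVG_MILES = 12000      # assumed average annual miles for the state

def risk_score(annual_miles, trips_in_peak, total_trips, hard_brakes_per_100mi):
    """Combine deviations from population norms into one relative score."""
    mileage_factor = annual_miles / STATE_AVG_MILES
    peak_factor = trips_in_peak / max(total_trips, 1)
    braking_factor = hard_brakes_per_100mi / 5.0   # 5 per 100 mi as baseline
    return 0.4 * mileage_factor + 0.3 * peak_factor + 0.3 * braking_factor

def personalized_premium(base_rate, score):
    """Scale the actuarial base rate by the driver's relative risk."""
    return round(base_rate * score, 2)

score = risk_score(annual_miles=9000, trips_in_peak=20,
                   total_trips=200, hard_brakes_per_100mi=2)
print(personalized_premium(base_rate=1000.0, score=score))   # -> 450.0
```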


To accomplish this task, a great amount of continuous data must be collected, stored, and correlated.

Hadoop is an excellent choice for acquisition and reduction of the automobile sensor data.

Master data and certain reference data, including customer profile information, are likely to be stored in existing DBMSs.

A NoSQL database can be used to capture and store reference data that is more dynamic, diverse in format, and frequently changing.
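A toy version of that routing decision, with invented record kinds and store labels standing in for the Hadoop/DBMS/NoSQL split just described:

```python
def choose_store(record: dict) -> str:
    if record.get("kind") == "sensor":       # high-volume continuous data
        return "Hadoop (HDFS)"               # acquisition and reduction
    if record.get("kind") == "profile":      # stable master/reference data
        return "relational DBMS"
    return "NoSQL database"                  # dynamic, frequently changing data

for rec in [{"kind": "sensor", "speed_kmh": 62.0},
            {"kind": "profile", "name": "A. Rahman"},
            {"kind": "geo-fence", "polygon": [(28.6, 77.2), (28.7, 77.3)]}]:
    print(choose_store(rec), "<-", rec)
```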

Data Realm Characteristics

Big Data Architecture Capabilities

Storage and Management Capability

Database Capability

Processing Capability

Data Integration Capability

Statistical Analysis Capability

Storage and Management Capability

Hadoop Distributed File System (HDFS)

highly scalable storage and automatic data replication across three nodes for fault tolerance

Cloudera Manager gives a cluster-wide, real-time view of nodes and services running; provides a single, central place to enact configuration changes across the cluster
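Conceptually, HDFS splits a file into large fixed-size blocks and places each block on three different nodes. The sketch below simulates that placement with toy numbers (real HDFS uses 64 or 128 MB blocks and a rack-aware placement policy):

```python
import itertools

BLOCK_SIZE = 4              # bytes; stand-in for HDFS's 64/128 MB default
REPLICATION = 3
NODES = ["node1", "node2", "node3", "node4", "node5"]

def place_blocks(data: bytes):
    """Split data into blocks and assign each block to REPLICATION nodes."""
    node_cycle = itertools.cycle(NODES)
    placement = {}
    for offset in range(0, len(data), BLOCK_SIZE):
        block_id = offset // BLOCK_SIZE
        placement[block_id] = [next(node_cycle) for _ in range(REPLICATION)]
    return placement

for block, nodes in place_blocks(b"sensor-data-from-aircraft").items():
    print(f"block {block} -> {nodes}")   # losing one node leaves two copies
```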


Database Capability

Oracle NoSQL
• Dynamic and flexible schema design
• High-performance key-value pair database

Apache HBase
• Strictly consistent reads and writes
• Allows random, real-time read/write access

Apache Cassandra
• Fault tolerance is designed into every node
• Data model offers column indexes with the performance of log-structured updates, materialized views, and built-in caching

Apache Hive
• Tools to enable easy data extract/transform/load (ETL)
• Query execution via MapReduce
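The key-value model behind stores such as Oracle NoSQL can be sketched as a schema-less get/put/delete interface. The in-memory class below only illustrates the model; it is not a client for any of the products above.

```python
import json

class KeyValueStore:
    """In-memory illustration of the key-value model: opaque values by key."""

    def __init__(self):
        self._data = {}

    def put(self, key: str, value: dict) -> None:
        # Schema-less: any JSON-serializable structure is accepted as a value.
        self._data[key] = json.dumps(value)

    def get(self, key: str) -> dict | None:
        raw = self._data.get(key)
        return json.loads(raw) if raw is not None else None

    def delete(self, key: str) -> None:
        self._data.pop(key, None)

store = KeyValueStore()
store.put("driver/c42", {"state": "Delhi", "annual_miles": 9000})
print(store.get("driver/c42"))
```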


Processing Capability

MapReduce
• Breaks a problem up into smaller sub-problems
• Able to distribute data workloads across thousands of nodes

Apache Hadoop
• Leading MapReduce implementation
• Highly scalable parallel batch processing
• Writes multiple copies of data across the cluster for fault tolerance
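The model is easy to see in miniature. The pure-Python word count below runs the three MapReduce phases (map, shuffle, reduce) in a single process; a framework such as Hadoop runs the same phases across thousands of nodes.

```python
from collections import defaultdict

def map_phase(record):
    """Emit (key, value) pairs; here, (word, 1) for a word count."""
    for word in record.split():
        yield word.lower(), 1

def shuffle(pairs):
    """Group all values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Aggregate each group independently; these tasks can run in parallel."""
    return key, sum(values)

records = ["big data is big", "data needs new architecture"]
pairs = (pair for record in records for pair in map_phase(record))
print(dict(reduce_phase(k, v) for k, v in shuffle(pairs).items()))
```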


Data Integration Capability

Exports MapReduce results to RDBMS, Hadoop, and other targets

Connects Hadoop to relational databases for SQL processing

Optimized processing with parallel data import/export
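A hedged sketch of the import/export idea (this is what tools such as Apache Sqoop automate): rows are read from a relational source in primary-key ranges so that several workers can run in parallel, then emitted as delimited text for Hadoop to consume. SQLite and the table layout below are illustrative stand-ins.

```python
import sqlite3

# Illustrative relational source; a real deployment would point at a
# production RDBMS rather than an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE policies (id INTEGER PRIMARY KEY, premium REAL)")
conn.executemany("INSERT INTO policies VALUES (?, ?)",
                 [(i, 500.0 + i) for i in range(1, 9)])

def export_split(lo, hi):
    """One worker's share of the table: a contiguous primary-key range."""
    rows = conn.execute(
        "SELECT id, premium FROM policies WHERE id >= ? AND id < ?", (lo, hi))
    return [f"{rid}\t{premium}" for rid, premium in rows]

# Two 'parallel' splits; a real tool would hand each range to its own task
# and write the output into HDFS instead of printing it.
for split in (export_split(1, 5), export_split(5, 9)):
    print("\n".join(split))
```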


Statistical Analysis Capability

R: a programming language for statistical analysis

Oracle R Enterprise allows reuse of pre-existing R scripts with no modification

Big Data Architecture

Traditional Information Architecture Capability

Big Data Information Architecture Capability

Conclusion

Today’s economic environment demands that business be driven by useful, accurate, and timely information.

The world of Big Data offers a solution to this problem.

However, there are always business and IT trade-offs in getting to data and information in the most cost-effective way.

References

1. Big Data Analytics Guide: Better technology, more insight for the next generation of business applications, SAP

2. Oracle Information Architecture: An Architect’s Guide to Big Data

3. http://www.csc.com/insights/flxwd/78931-big_data_universe_beginning_to_explode

4. http://www.techrepublic.com/blog/big-data-analytics/10-emerging-technologies-for-big-data/280

5. http://www.idc.com/

6. From Databases to Big Data. Sam Madden (MIT)
