Pat McGarry Ryft Systems, Inc. Overcoming the Top Five Hurdles to Real-Time Analytics

Strata + Hadoop World San Jose Presentation: Overcoming the Top Five Hurdles to Real-time Analytics


Page 1

Pat McGarry, Ryft Systems, Inc.

Overcoming the Top Five Hurdles to Real-Time Analytics

Page 2

Information, the fuel of business, is trapped in analysis platforms built on 70-year-old architectures.

Page 3

IoT is exacerbating the widening data analytics technology divide.

REQUIREMENTS

• Real-time insights as events occur, close to the source of data
• Analysis of data from a range of IoT devices (video, mobile, batch stores, etc.) together
• Ultra small & efficient analytics infrastructure
• Easy to deploy, use & maintain systems
• Low operational costs
• No security or performance trade-offs

REALITY

• Persistent compute/IO/storage bottlenecks
• Data analyzed in silos
• Data movement & ETL delays
• Sprawling inefficient analytics infrastructures
• Persistent data privacy & security issues

Page 4

THE CHALLENGE

Complex, Closed Systems vs. Low-Performance Open Source Software

Closed analytics systems are expensive, hard to use and require huge teams to implement

Open source frameworks are easier to use, but their performance is limited by the commodity x86 servers they run on

Organizations have been forced to sacrifice performance or simplicity

Page 5

THE CHALLENGE

Slow Networking Speeds That Extend Data Transport Times

Current infrastructures do not have the power or efficiency to be put at the network’s edge

Data networking speeds can be slow or unreliable and have a drastic impact on data analytics speeds

Page 6

THE CHALLENGE

Time-Consuming ETL and Indexing Bottlenecks

Traditional x86-based architectures require lengthy Extract, Transform and Load (ETL) and Indexing processes

These processes balloon the data size to an unreasonable degree

Data preparation time often means the difference between actionable insights and poor business decisions
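
As a rough illustration of how indexing can enlarge a dataset, the sketch below builds a toy character-trigram inverted index in Python and compares the number of stored postings with the raw character count. The sample records, the trigram scheme and the function names are assumptions made for illustration; the slides do not say which indexing method is meant.

```python
from collections import defaultdict


def build_trigram_index(records):
    """Map every character trigram to the set of record ids that contain it."""
    index = defaultdict(set)
    for rec_id, text in enumerate(records):
        for i in range(len(text) - 2):
            index[text[i:i + 3]].add(rec_id)
    return index


def posting_count(index):
    """Total number of (trigram, record id) pairs stored in the index."""
    return sum(len(ids) for ids in index.values())


# Hypothetical sample data, not taken from the presentation.
records = [
    "sensor-17 temperature high",
    "sensor-42 temperature normal",
    "door-3 opened",
]

index = build_trigram_index(records)
raw_chars = sum(len(r) for r in records)
postings = posting_count(index)

# Each posting carries a 3-character key plus a record id, so the index
# is extra data that has to be built and stored alongside the raw records.
print(f"raw characters: {raw_chars}, trigram postings: {postings}")
```

The index has to be built before the first query can run, and it has to be rebuilt or updated whenever the records change, which is where the preparation delay comes from.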

Page 7

THE CHALLENGE

Complex, and Sometimes Impossible, Analytic Functions

Analytics functions—like fuzzy search—often require more or different computing power than is available in today’s analytics infrastructures

Traditional analytics ecosystems require massive indexes and data preparation functions

The combination of data preparation time and analysis limitations doesn’t allow for real-time analytics that capture all relevant insights
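
To make the fuzzy-search point concrete, here is a minimal Python sketch that scans raw text and keeps every word within a given Levenshtein edit distance of the query, with no index or preparation step. The function names and sample text are hypothetical, and this is a plain CPU implementation, not the method used by any product named in the talk.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def fuzzy_search(query: str, text: str, max_distance: int = 1):
    """Return the words in text that are within max_distance edits of query."""
    return [w for w in text.split() if edit_distance(query, w) <= max_distance]


# Hypothetical sample text containing two misspellings of "search".
print(fuzzy_search("search", "fuzzy serch and saerch", max_distance=2))
```

Each comparison costs time proportional to the product of the two string lengths, and an unindexed scan repeats it for every word in the data, which is why this kind of function is compute-heavy on large datasets.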

Page 8

THE CHALLENGE

Costly, Complex and Inefficient x86-based Clusters

Hardware bottlenecks still stifle data analytics performance

Required data processing, indexing, data sharding and other preparation steps inherently slow down analytics

Cluster complexity can lead to inferior data center infrastructures that do not provide real-time performance

Page 9

Heterogeneous (Hybrid) Computing is the solution…

SOURCES: BLOOMBERG BUSINESS, THE PLATFORM

Heterogeneous or hybrid computing refers to systems that use more than one kind of processor or core. These systems gain performance or energy efficiency not just by adding the same type of processors, but by adding dissimilar processors, usually incorporating specialized processing capabilities to handle particular tasks.

Page 10

…because optimal performance & efficiency demand the right “engine” for the job.

CPUs

• General purpose computing
• Sequential in nature
• Non-deterministic performance
• Interrupts
• Memory allocation

FPGA

• Not general purpose, but can be reprogrammed via firmware
• Best at data-heavy analysis such as search, fuzzy search, image and video analysis, deep learning
• Inherently massively parallel to give more output with less power

GPUs

• Some general purpose computing
• Can excel at certain complex algorithms
• Generally more parallel than CPUs, since GPUs have more cores

Page 11

[Figure: two panels, "Performance" and "Open API", each spanning CPU, FPGA and GPU]

Coupled with an open, easy-to-use approach with a business-centric, compute-agnostic open API.

Page 12

Questions? Visit Ryft at booth #1409

Pat McGarry [email protected]