Alluxio (formerly Tachyon): Accessing Data Anywhere with Unified Namespace Jiri Simsa June 15, 2016 @ Alluxio Meetup (hosted by Intel)


Alluxio (formerly Tachyon): Accessing Data Anywhere with Unified Namespace

Jiri Simsa

June 15, 2016 @ Alluxio Meetup (hosted by Intel)


About Me

• Software Engineer @ Alluxio, Inc.

• PMC Member and Maintainer of Alluxio Open Source Project

• Ph.D. from Carnegie Mellon University (Parallel Data Lab)

• Worked at Google before joining Alluxio

• Twitter: @jsimsa, GitHub: jsimsa


Outline

• Motivation

• Unified Namespace

• Use Cases

• Demo


Big Data Ecosystem


Alluxio Benefits

• Future-proofing your applications
  – applications can communicate with different storage systems, both existing and new, using the same namespace and interface
  – seamless integration between applications and new storage systems enables faster innovation

• Enabling new workloads
  – one-time effort to enable an application to access many different types of storage systems and a storage system to be accessed by many different types of applications


Unified Namespace

An abstraction that makes it possible for applications to access different storage systems through the same interface.


Transparent Naming

• Operations over persisted Alluxio objects are mapped transparently to the underlying storage

• Alluxio paths are preserved in the storage layer

Alluxio: alluxio://host:port/
  Data: Reports, Sales
  Users: Alice, Bob

Storage System (HDFS, S3, …): hdfs://host:port/
  Data: Reports, Sales
  Users: Alice, Bob
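
The path preservation described on this slide can be pictured as a simple prefix rewrite. The following is a minimal sketch, not Alluxio's implementation: the function name `to_storage_uri` and its parameters are hypothetical, and the roots are the placeholder URIs from the diagram.

```python
def to_storage_uri(alluxio_uri: str, alluxio_root: str, storage_root: str) -> str:
    """Rewrite an Alluxio URI into the storage URI with the same relative path.

    Hypothetical helper for illustration only; it is not part of the Alluxio API.
    """
    root = alluxio_root.rstrip("/")
    # Accept only the root itself or URIs located underneath it.
    if alluxio_uri != root and not alluxio_uri.startswith(root + "/"):
        raise ValueError(f"{alluxio_uri} is not under {alluxio_root}")
    # The relative path is kept unchanged; only the root is swapped.
    relative = alluxio_uri[len(root):].lstrip("/")
    base = storage_root.rstrip("/")
    return f"{base}/{relative}" if relative else base + "/"


if __name__ == "__main__":
    print(to_storage_uri("alluxio://host:port/Data/Reports",
                         "alluxio://host:port/", "hdfs://host:port/"))
```

Because only the root changes, a file written through Alluxio and persisted would appear under the same directory layout in the storage system, which is the property the slide calls transparent naming.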


Multiple Storage Systems

• Unified namespace for multiple data sources

• Sharing of data across storage systems

• API for on-the-fly mounting / unmounting

Alluxio: alluxio://host:port/
  Data: Reports, Sales
  Users: Alice, Bob

Storage System A: hdfs://host:port/
  Users: Alice, Bob

Storage System B: s3://host/bucket
  Reports, Sales
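
One way to picture mounting and unmounting is a table from Alluxio paths to storage URIs, resolved by longest matching prefix. The sketch below is an illustration under stated assumptions, not Alluxio's API: the `MountTable` class and its method names are hypothetical, and the arrangement (HDFS at the root, the S3 bucket mounted at `/Data`) is an assumption read from the diagram.

```python
class MountTable:
    """Toy mount table: maps Alluxio paths to storage URIs (hypothetical)."""

    def __init__(self):
        self._mounts = {}  # normalized Alluxio path -> storage URI

    @staticmethod
    def _normalize(path: str) -> str:
        # "/Data/", "Data" and "/Data" all become "/Data"; the root stays "/".
        return "/" + path.strip("/")

    def mount(self, alluxio_path: str, storage_uri: str) -> None:
        key = self._normalize(alluxio_path)
        if key in self._mounts:
            raise ValueError(f"{key} is already a mount point")
        self._mounts[key] = storage_uri.rstrip("/")

    def unmount(self, alluxio_path: str) -> None:
        del self._mounts[self._normalize(alluxio_path)]

    def resolve(self, alluxio_path: str) -> str:
        path = self._normalize(alluxio_path)
        best = None
        for point in self._mounts:
            # A mount point matches if it equals the path or is an ancestor of it.
            if path == point or path.startswith(point.rstrip("/") + "/"):
                if best is None or len(point) > len(best):
                    best = point  # prefer the most specific (longest) mount
        if best is None:
            raise KeyError(f"no mount point covers {path}")
        relative = path[len(best):].lstrip("/")
        base = self._mounts[best]
        return f"{base}/{relative}" if relative else base


if __name__ == "__main__":
    table = MountTable()
    table.mount("/", "hdfs://host:port/")     # Storage System A (assumed root)
    table.mount("/Data", "s3://host/bucket")  # Storage System B (assumed mount)
    print(table.resolve("/Users/Alice"))
    print(table.resolve("/Data/Sales"))
```

Applications see a single tree, while each path is served by whichever storage system covers it; unmounting `/Data` in this sketch would make the same paths fall back to the root storage.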

Use Cases

Multiple Storage / Compute


Changing Storage Backend


Demo

Resources

• Alluxio Project: http://www.alluxio.org

• Development: https://github.com/Alluxio/alluxio

• Meet Friends: http://www.meetup.com/Alluxio

• Alluxio, Inc.: http://www.alluxio.com

• Contact us: [email protected]


Backup Slides


Architecture Overview
