Upload
tachyonnexus
View
169
Download
1
Tags:
Embed Size (px)
Citation preview
Bin Fan, Tachyon NexusJuly 19, 2015 @ Tachyon Workshop
tachyon-project.org
A Reliable Memory-Centric Distributed
Storage System
• Founded by Tachyon creators and top contributors
• $7.5 million Series A from Andreessen Horowitz
• Committed to Tachyon Open Source
• www.tachyonnexus.com
2
3
Outline
• Overview–Motivation– Tachyon Architecture– Using Tachyon
• Open Source– Status– Production Use Cases
• Roadmap
4
Outline
• Overview–Motivation– Tachyon Architecture– Using Tachyon
• Open Source– Status– Production Use Cases
• Roadmap
5
Started From UCB AMPLab
Berkeley Data Analytics Stack (BDAS)
Cluster manager Parallel computation framework
Reliable, distributed memory-centric storage system
6
7
Why Tachyon?
Memory is Fast
• RAM throughput increasing exponentially
• Disk throughput increasing slowly
8
Memory-locality key to interactive response times
Memory is Cheaper
source: jcmit.com9
Realized by many…
10
11
Is theProblem Solved?
12
Missing a Solution for the Storage Layer
An Example: -
• Fast, in-memory data processing framework– Keep one in-memory copy inside JVM– Track lineage of operations used to derive
data– Upon failure, use lineage to recompute
datamap
filter map
joinreduc
e
Lineage Tracking
13
Issue 1
14
Data Sharing is the bottleneck in analytics
pipeline:Slow writes to diskSpark Job1
Spark memblock manager
block 1
block 3
Spark Job2
Spark memblock manager
block 3
block 1
HDFS / Amazon S3block
1block
3
block 2
block 4
storage engine & execution enginesame process(slow writes)
Issue 1
15
Spark Job
Spark memblock manager
block 1
block 3
Hadoop MR Job
YARN
HDFS / Amazon S3block
1block
3
block 2
block 4
Data Sharing is the bottleneck in analytics
pipeline:Slow writes to diskstorage engine &
execution enginesame process(slow writes)
Issue 2
16
Spark Task
Spark memoryblock manager
block 1
block 3
HDFS / Amazon S3block 1
block 3
block 2
block 4
execution engine &
storage enginesame process
Cache loss when process crashes
Issue 2
17
crash
Spark memoryblock manager
block 1
block 3
HDFS / Amazon S3block 1
block 3
block 2
block 4
execution engine &
storage enginesame process
Cache loss when process crashes
HDFS / Amazon S3
Issue 2
18
block 1
block 3
block 2
block 4
execution engine &
storage enginesame process
crash
Cache loss when process crashes
HDFS / Amazon S3
Issue 3
19
In-memory Data Duplication &
Java Garbage CollectionSpark Task1
Spark memblock manager
block 1
block 3
Spark Task2
Spark memblock manager
block 3
block 1
block 1
block 3
block 2
block 4
execution engine &
storage enginesame process(duplication & GC)
Tachyon
Reliable data sharing atmemory-speed within and
across cluster frameworks/jobs
20
Technical Overview
Ideas• A memory-centric storage
architecture• Push lineage down to storage layer• Manage tiered storage
Facts• One data copy in memory• Re-computation for fault-tolerance
21
Apache
Spark
Apache MR
Apache
HBaseH2O Apach
e FlinkImpal
a
S3Gluste
rFSHDFS Swift NFS Ceph ……
……
Stack
22
Tachyon Memory-Centric Architecture
23
Tachyon Memory-Centric Architecture
24
Lineage in Tachyon
25
Issue 1 revisited
26
Memory-speed data sharing
among jobs in different frameworks
execution engine &
storage enginesame process(fast writes)
Spark Job
Spark mem
Hadoop MR Job
YARN
HDFS / Amazon S3block
1block
3
block 2
block 4
HDFSdisk
block 1
block 3
block 2
block 4Tachyonin-memory
block 1
block 3 block 4
HDFS / Amazon S3block 1
block 3
block 2
block 4
Tachyonin-memory
block 1
block 3 block 4
Issue 2 revisited
27
Spark Task
Spark memoryblock manager
execution engine &
storage enginesame process
Keep in-memory data safe,even when a job crashes.
Issue 2 revisited
28
HDFSdisk
block 1
block 3
block 2
block 4
execution engine &
storage enginesame process
Tachyonin-memory
block 1
block 3 block 4
crash
HDFS / Amazon S3block 1
block 3
block 2
block 4
Keep in-memory data safe,even when a job crashes.
Issue 3 revisited
29
No in-memory data duplication,
much less GCSpark Task
Spark mem
Spark Task
Spark mem
HDFS / Amazon S3block
1block
3
block 2
block 4
execution engine &
storage enginesame process(no duplication & GC) HDFS
diskblock 1
block 3
block 2
block 4Tachyonin-memory
block 1
block 3 block 4
Comparison with In-Memory HDFS
30
How easy / hard to use Tachyon?
31
Spark/MapReduce/Shark without Tachyon
• Sparkscala> val file = sc.textFile(“hdfs://ip:port/path”)
• Hadoop MapReduce$ hadoop jar hadoop-examples-1.0.4.jar wordcount hdfs://localhost:19998/input hdfs://localhost:19998/output
• SharkCREATE TABLE orders_cached AS SELECT * FROM orders;
32
Spark/MapReduce/Sharkwith Tachyon
• Sparkscala> val file = sc.textFile(“tachyon://ip:port/path”)
• Hadoop MapReduce$ hadoop jar hadoop-examples-1.0.4.jar wordcount tachyon://localhost:19998/input tachyon://localhost:19998/output
• SharkCREATE TABLE orders_tachyon AS SELECT * FROM orders;
33
Outline
• Overview–Motivation– Tachyon Architecture– Using Tachyon
• Open Source– Status– Production Use Cases
• Roadmap
34
Open Source Status
• Started at UC Berkeley AMPLab in Summer 2012
• Apache License 2.0, Version 0.7 (July 2015)
• Deployed at > 50 companies (July 2014)• 30+ Companies Contributing• Spark/MapReduce/Flink applications can run
without code change
35
36
Contributors Growth
v0.4Feb ‘14
v0.3Oct ‘13
v0.2Apr ‘13
v0.1Dec ‘12
v0.6Mar ‘15
v0.5Jul ‘14
v0.7Jul ‘15
1 3
15
30
46
70
100+
37
Codebase Growth
v0.4Feb ‘14
v0.3Oct ‘13
v0.2Apr ‘13
v0.6Mar ‘15
v0.5Jul ‘14
v0.7Jul ‘15
465commits
696commits
1080commits
1610commits
2884commits
4969commits
Open Community
38
Berkeley Contrib-utors
Non-Berkeley Contributors
Thanks to Our Contributors!Aaron DavidsonAbhiraj ButalaAchal SoniAlbert ChuAli GhodsiAndrew AshAnurag KhandelwalAslan BekirovBill ZhaoBin FanBradley ChildsCalvin JiaCarson WangChao ChenCheng ChangCheng HaoColin Patrick McCabeDan Crankshaw
Darion YaphetDavid CapwellDavid ZhuDina LeventolDu LiFei WangGene PangGerald ZhangGrace HuangHaoyuan LiHenry SaputraHobin YoonHuamin ChenJacky LiJey KottalamJingxin FengJoseph TangJuan Zhou
Jun AokiKun XuLukasz JastrzebskiLuogan KunManu GoyalMark HamstraMingfei ShiMubarak SeyedNan DunNick LanhamOrcun SimsekPengfei XuanQianhao DongQifan PuRamaraju IndukuriRaymond LiuRob VesseRobert Metzger
Rong GuSean ZhongSeonghwan MoonShaoshan LiuShivaram VenkataramanShu PengSrinivas ParayyaTao WangThu KyawTimothy St. ClairVaishnav KovvuriVikram SreekantiXi LiuXiaomeng HuangXiaomin ZhangXing LinYi LiuZhao Zhang
39
Tachyon Usage
40
Under Filesystem Choices(Big Data, Cloud, HPC, Enterprise)
41
Use Case: Baidu
• Framework: SparkSQL• Under Storage: Baidu’s File System• Storage Media: MEM + HDD• 100+ nodes deployment• 1PB+ managed space• 30x Performance Improvement
More Details: www.meetup.com/Tachyon42
Use Case: a SAAS Company
• Framework: Impala
• Under Storage: S3
• Storage Media: MEM + SSD
• 15x Performance Improvement
43
Use Case: an Oil Company
• Framework: Spark
• Under Storage: GlusterFS
• Storage Media: MEM only
• Analyzing data in traditional storage
44
Use Case: a SAAS Company
• Framework: Spark
• Under Storage: S3
• Storage Media: SSD only
• Elastic Tachyon deployment
45
Outline
• Overview–Motivation– Tachyon Architecture– Using Tachyon
• Open Source– Status– Production Use Cases
• Roadmap
46
New Features
• Lineage in Storage (alpha)• Tiered Storage (alpha)
47
New Features
• Lineage in Storage (alpha)• Tiered Storage (alpha)• Data Serving• Support for New Hardware• …• Your New Feature!
48
49
Tachyon’s Goal?
Distributed Memory-Centric Storage:
Better Assist Other Components
Welcome Collaboration!
50
JIRA New Contributor Tasks
Apache
Spark
Apache MR
Apache
HBaseH2O Apach
e FlinkImpal
a
S3Gluste
rFSHDFS Swift NFS Ceph ……
……
• Website: http://tachyon-project.org
• Github: https://github.com/amplab/tachyon
• Meetup: http://www.meetup.com/Tachyon
• Training program: coming soon
• Tachyon Nexus is hiring
• News Letter Subscription: http://goo.gl/mwB2sX
• Email: [email protected]