1
Design of a Cloud Management Layer for High-Performance File Transfer
2
Outline
• Introduction
• Background
• System Design
• Implementation
• Experiments
3
Introduction
• Motivation
– File transfer with load sharing and fault tolerance.
Clients
Cloud Management Layer
Cloud Storage
4
Introduction
User (Client), Management Layer, Cluster Storage Server
• Upload Files: clients upload files to the cloud, and the system guarantees that uploaded files are not lost.
• Download Files
5
Introduction (Cont.)
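• Surveillance Application
– Recording size: 0.2 to 0.5 GB per hour per device
– Storage: 2 TB
– Host with 30 devices
– Storage per device: 2000 GB (2 TB) / 30 ≈ 67 GB
– Recording time at 0.3 GB per hour: 67 GB / 0.3 GB ≈ 223 hours
– Retention: 223 hours / 24 hours ≈ 9.3 days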
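The retention estimate above can be reproduced with a few lines of Python. This is only a sketch of the slide's arithmetic: the function name `retention_days` is hypothetical, and 0.3 GB per hour is the slide's working value inside the stated 0.2 to 0.5 GB range.

```python
# Figures taken from the slide; 0.3 GB/hour is the working value the slide
# uses inside its stated 0.2-0.5 GB/hour range.
TOTAL_STORAGE_GB = 2000   # 2 TB
NUM_DEVICES = 30
RATE_GB_PER_HOUR = 0.3


def retention_days(total_gb: float, devices: int, gb_per_hour: float) -> float:
    """Days of footage one device can keep before its share of storage is full."""
    per_device_gb = total_gb / devices          # about 67 GB per device
    hours = per_device_gb / gb_per_hour         # hours of recording that fit
    return hours / 24                           # convert hours to days


if __name__ == "__main__":
    days = retention_days(TOTAL_STORAGE_GB, NUM_DEVICES, RATE_GB_PER_HOUR)
    print(f"{days:.1f} days")
```

Calling the same function with 0.2 or 0.5 GB per hour gives the retention at the two ends of the stated range.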
6
Introduction (Cont.)
• Platforms and APIs Used
Hadoop
• Hadoop Distributed File System (HDFS)
– File storage
• HBase
– User authentication
Socket
• File data
– Transmissions
7
Outline
• Introduction
• Background
• System Design
• Implementation
• Experiments
8
Background (Cont.)
What is Cloud Computing?
• Scalable computing and storage resources.
Cloud Service Models
• Software as a Service (SaaS)
– Google Docs, Google Talk, Dropbox
• Platform as a Service (PaaS)
– Google App Engine, Windows Azure, Hadoop
• Infrastructure as a Service (IaaS)
– AWS EC2, IBM SmartCloud, HiNet hicloud CaaS
9
Background (Cont.)
Hadoop
• Core components (underneath the cloud application)
– HBase: database
– MapReduce: parallel processing
– Hadoop Distributed File System (HDFS): file system
• The other components of Hadoop
– Pig: dataflow language and parallel execution framework
– Hive: data warehouse infrastructure
– ZooKeeper: distributed coordination service
– Chukwa: system for collecting management data
– Avro: data serialization system
10
Background (Cont.)
What is “Hadoop Distributed File System (HDFS)”?
• Clients
• Namenode
• Cluster Storage Server: Datanode PCs
• Transmission
11
Background (Cont.)
• HBase offers high availability, high performance, and high scalability.
Table: 't1'
• Row keys: r1, r2
• Column families: 'f1', 'f2', ..., 'fn'
• Column qualifiers: 'c1', 'c2', 'c3', 'c4'
• Values: r1 holds v1, v2; r2 holds v3, v4
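One way to picture this data model is as nested maps: row key, then column family, then column qualifier, then value. The sketch below uses plain Python dictionaries and is not the HBase API. The placement of v1 to v4 under particular families and qualifiers is an assumption, since the slide does not say which column holds which value, and `get_cell` is a hypothetical helper.

```python
# Plain-dict picture of table 't1'. Which family/qualifier holds each value
# is an ASSUMPTION made for illustration; the slide does not specify it.
table_t1 = {
    "r1": {"f1": {"c1": "v1", "c2": "v2"}},
    "r2": {"f2": {"c3": "v3", "c4": "v4"}},
}


def get_cell(table, row_key, family, qualifier, default=None):
    """Look up one cell by (row key, column family, column qualifier)."""
    return table.get(row_key, {}).get(family, {}).get(qualifier, default)


if __name__ == "__main__":
    print(get_cell(table_t1, "r1", "f1", "c2"))
    print(get_cell(table_t1, "r2", "f1", "c1"))  # absent cells are simply missing
```

Rows do not need to fill every column, which is why a missing cell returns the default instead of raising an error.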
12
Outline
• Introduction
• Background
• System Design
• Implementation
• Experiments
• Conclusions
• Reference
13
System Design (Cont.)
• Components
– Manager Selection
– Authentication
– Synchronization
– Cloud-Based File Transfer
• Architecture
– Client
– Management Layer: Node1, Node2, Node3, ..., Noden
– Hadoop: HBase, HDFS
14
System Design (Cont.)
• Manager Selection
1. Connect to the Management Layer and perform load balancing
2. Collect memory load information from each node
3. Select the node with the lowest memory load as the connection port (Nodei)
4. Connect with Nodei
5. Authentication
• Flow: starts at the Client; the connection attempt has Success and Failure branches.
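The selection rule (pick the node with the lowest memory load) can be sketched as follows. The function names, the dictionary of load values, and the `try_connect` callback are hypothetical. The fallback to the next least-loaded node on failure is also an assumption, since the slide shows a Failure branch without saying where it leads.

```python
from typing import Callable, Dict, Optional


def select_manager(memory_load: Dict[str, float]) -> Optional[str]:
    """Return the node with the lowest reported memory load, or None if none reported."""
    if not memory_load:
        return None
    return min(memory_load, key=memory_load.get)


def connect_with_fallback(
    memory_load: Dict[str, float],
    try_connect: Callable[[str], bool],
) -> Optional[str]:
    """Try nodes from least to most loaded until one accepts the connection.

    ASSUMPTION: retrying the next node is one possible reading of the
    Failure branch; the slide does not describe the actual behavior.
    """
    for node in sorted(memory_load, key=memory_load.get):
        if try_connect(node):
            return node
    return None


if __name__ == "__main__":
    loads = {"Node1": 0.70, "Node2": 0.30, "Node3": 0.50}
    print(select_manager(loads))
```

Keeping the load values in a dictionary keyed by node name makes the selection a single `min` call, and the same ordering can be reused for the retry loop.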
15
System Design (Cont.)
• Authentication
1. Key in account / password
2. Authentication (HBase verification)
3. Synchronization
• Flow: entered from Manager Selection; the verification has Success and Failure branches.
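A minimal sketch of the account and password check follows. An in-memory dictionary stands in for the HBase user table, so no HBase API is shown. Storing salted PBKDF2 hashes rather than plain passwords is an assumption added for illustration; the slides do not describe how credentials are stored. All function names are hypothetical.

```python
import hashlib
import hmac
import os


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a salted hash with PBKDF2-HMAC-SHA256 (storage scheme is an assumption)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def make_user_record(password: str) -> dict:
    """Build the record that would be stored for one account."""
    salt = os.urandom(16)
    return {"salt": salt, "hash": hash_password(password, salt)}


def authenticate(user_table: dict, account: str, password: str) -> bool:
    """Return True only if the account exists and the password matches."""
    record = user_table.get(account)   # stand-in for a lookup in the user table
    if record is None:
        return False
    candidate = hash_password(password, record["salt"])
    return hmac.compare_digest(candidate, record["hash"])


if __name__ == "__main__":
    users = {"alice": make_user_record("s3cret")}
    print(authenticate(users, "alice", "s3cret"))
    print(authenticate(users, "alice", "wrong"))
```

A True result corresponds to the Success branch and a False result to the Failure branch.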
16
System Design (Cont.)
• Synchronization
1. Synchronize HDFS files on the client side
2. Check file existence
3. Cloud-based file transfer
4. Background monitoring
• Flow: entered from Authentication; the existence check has Yes and No branches.
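The existence check in step 2 can be sketched as a comparison between the file names stored in HDFS and the file names already on the client. Both helpers below are hypothetical, and the list of remote names is assumed to have been fetched already; how the system obtains it is not shown in the slides.

```python
import os
from typing import Iterable, List


def missing_on_client(remote_names: Iterable[str], local_names: Iterable[str]) -> List[str]:
    """Return the remote file names that do not exist on the client yet."""
    local = set(local_names)
    return sorted(name for name in set(remote_names) if name not in local)


def local_file_names(local_dir: str) -> List[str]:
    """List the regular files in the client's synchronization directory."""
    return [
        name
        for name in os.listdir(local_dir)
        if os.path.isfile(os.path.join(local_dir, name))
    ]


if __name__ == "__main__":
    remote = ["a.txt", "b.txt", "c.txt"]   # hypothetical HDFS listing
    print(missing_on_client(remote, ["b.txt"]))
```

Each name returned would then be handed to the cloud-based file transfer in step 3.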
17
System Design (Cont.)
• Synchronization
1. Synchronize HDFS files on the client side
2. Check file existence
3. Cloud-based file transfer
4. Background monitoring
• Background monitoring triggers a cloud-based file transfer operation:
– Insertion: inserting files into the servers
– Deletion: deleting files from the servers
– Update: updating files on the servers
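Background monitoring can be sketched as comparing two snapshots of the client directory and classifying the differences as insertions, deletions, or updates. Using modification times to detect updates and polling the directory are assumptions; the slides do not say how changes are detected. The names `take_snapshot` and `diff_snapshots` are hypothetical.

```python
import os
from typing import Dict, List, Tuple

Snapshot = Dict[str, float]  # file name -> last-modified time


def take_snapshot(directory: str) -> Snapshot:
    """Record the last-modified time of every regular file in the directory."""
    snapshot: Snapshot = {}
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            snapshot[name] = os.path.getmtime(path)
    return snapshot


def diff_snapshots(old: Snapshot, new: Snapshot) -> Tuple[List[str], List[str], List[str]]:
    """Return (inserted, deleted, updated) file names between two snapshots."""
    inserted = sorted(set(new) - set(old))
    deleted = sorted(set(old) - set(new))
    updated = sorted(n for n in set(old) & set(new) if old[n] != new[n])
    return inserted, deleted, updated


if __name__ == "__main__":
    before = {"a.txt": 1.0, "b.txt": 2.0}
    after = {"b.txt": 3.0, "c.txt": 4.0}
    print(diff_snapshots(before, after))
```

A monitoring loop would call `take_snapshot` periodically and pass each pair of consecutive snapshots to `diff_snapshots`, issuing one transfer operation per reported file.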
18
System Design (Cont.)
• Cloud-Based File Transfer
1. Send a request using a socket
2. Parse the packet (to extract the command and path)
3. Execute the command (Write / Send / Delete)
– Write a file
– Send a file
– Delete a file
• Flow: entered from Synchronization & Monitoring.
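Steps 2 and 3 can be sketched as a parser plus a dispatch table. The packet layout used here (a first line of the form `COMMAND path`, followed by an optional payload) is purely an assumption, since the slides do not define the wire format. The handler functions passed in are hypothetical placeholders for the write, send, and delete operations.

```python
from typing import Callable, Dict, Tuple

VALID_COMMANDS = {"WRITE", "SEND", "DELETE"}


def parse_request(packet: bytes) -> Tuple[str, str]:
    """Extract (command, path) from the first line of a request packet.

    ASSUMED layout: b"COMMAND path\\n<optional payload>".
    """
    header = packet.split(b"\n", 1)[0].decode("utf-8")
    command, _, path = header.partition(" ")
    command = command.upper()
    if command not in VALID_COMMANDS or not path:
        raise ValueError(f"malformed request: {header!r}")
    return command, path


def handle_request(packet: bytes, handlers: Dict[str, Callable[[str], str]]) -> str:
    """Parse the packet and run the handler registered for its command."""
    command, path = parse_request(packet)
    return handlers[command](path)


if __name__ == "__main__":
    handlers = {
        "WRITE": lambda p: f"wrote {p}",
        "SEND": lambda p: f"sent {p}",
        "DELETE": lambda p: f"deleted {p}",
    }
    print(handle_request(b"WRITE /user/a.txt\npayload", handlers))
```

A dispatch table keeps the parsing separate from the three operations, so a new command only needs a new entry rather than another branch.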
22
Outline
• Introduction
• Background
• System Design
• Experiments
• Conclusions
• Reference
23
Experiments
• Manager Selection
– Random allocation
– Load-balanced allocation
• Number of lowest-load nodes selected: 5
• Update interval: 10 seconds
• Maximum simultaneous connections: 10
• Experimental Environment 1
– Servers: 50, 100, 150, 200, 250
– Clients: 2000
• Experimental Environment 2
– Servers: 50
– Clients: 1000, 2000, 3000, 4000, 5000
24
Experiments (Cont.)
• Experiment 1
– Servers: 50, 100, 150, 200, 250
– Clients: 2000
25
Experiments (Cont.)
• Experiment 2
– Servers: 50
– Clients: 1000, 2000, 3000, 4000, 5000