Double Middleware-based Mobile Data Service
Jinsuo Zhang
CISE Department, University of Florida
April 2002
Outline
• Introduction
• Double Middleware-based Architecture
• Automatic Data Hoarding
• Data Management & Consistency Control
• Heterogeneity Support
• Mobile Network Adaptation
• Summary & Future Work
History of Data Management
[Figure: evolution from Monitor to File System, Network File System, and Mobile File Service, annotated with Hierarchy, Name Transparency, Location Transparency, and an open question on Name Transparency.]
Current Computation Model
[Figure: devices connected through the Internet.]
Challenges
• Data Availability
– Automatic Hoarding: What, When, Where, Who, Why, How
• Data Consistency
• Heterogeneous
– Communication
– Data Content
• Mobile Network Adaptation
Motivation
• Data Access From Anywhere at Any Time
– Data access in disconnection, weak connection and strong connection mode
• Device Independence
– laptop, PDA, …
• Heterogeneity Support
– Communication
– Data content
Contribution
• A New Architecture & Implementation
• Filtering Mechanism & a New Hybrid Priority-based Algorithm
• XML-based Protocol
• Asynchronous Consistency Model
• Network Log Optimization
• Incremental-based Weak Connection Adaptation
• Simulation-based & Live Experiment Validation
Trace Information
Trace     Collecting Period   Footprint (Number of Files)
Trace 1   63 days             8k
Trace 2   62 days             8k
Trace 3   232 days            37k
Trace 4   132 days            100k
Trace 5   61 days             4k
Double Middleware-based Architecture
[Figure: mobile hosts running Linux (AbiWord, vi) and Windows (MS Word, Visual Studio), each with an M-MEM, connect over the Internet to the F-MEM with its Meta-Data Server and Data Server.]
Role of Data
• Mobile Data: Managed by both Host FS and MFS
• Normal Data: Managed only by Host FS
• Role changes: Publish/Import, Retire
Data Propagation
[Figure: mobile hosts MH 1, MH 2, MH 3, MH 4, … exchange data through the MDSS via Publish, Import, and Synchronization.]
Mobile Client Model
[Figure: in user space, the Application and the M-MEM (MFS Utility: Analyzer, Cache Manager, Data Synchronizer, …); in kernel space, the Logic Layer of File System, the MFS Extension, and the Physical Layer of File System.]
M-MEM Responsibility
• Observe File Access Pattern
• Decide Active Files On-the-Fly
• Publish/Import files
• Cache Management
• Data Synchronization
• Heterogeneity Support
• Mobile Network Adaptation
M-MEM Workflow
[Figure: components are the MD Agent, Filtering, Event Queue, Analyzer, Cache Manager, Synchronizer, Optimizer, Scheduler, Network Adaptor, and Messenger. Edge labels: File Event, Write, Non-Write Event, Piggybacked Message, HTTP/SMTP/POP3.]
F-MEM Architecture
[Figure: components are the Input Event Queue, Input Decoder, XML Protocol Parser, Authentication, Request Interpreter, Output Packager, and Output Event Queue inside the F-MEM, together with the Data Server and Meta-Data Server. Edge labels: Request, Reply.]
System Summary
Dimension                        Our solution
Mobile Data Spectrum             Any File in Supported OS
Data Selection                   Automatic
Application Transparency         Transparent
Conservative/Optimistic          Optimistic
Client/Server or Peer-to-Peer    Hybrid
Immediate/Delayed Propagation    Delayed
Push/Pull Model                  Hybrid Push/Pull
Replication Granularity          Per-File
Replacement Policy               Hybrid Priority
Updated Data Shipping            Incremental Update
File Selection Workflow
[Figure: on the Mobile Computer, Hooks in OS Kernel deliver file access events to the Filter, which passes filtered events to the Analyzer. Also shown: the Hoarding List, the User's Profile in Server, and the Hoard Instruction.]
Hooks on Operating System
• Linux
– Modified Linux Kernel
– Introduce a Pseudo char-Driver as Bridge between Kernel & Analyzer.
• Windows (95, 98, ME, NT & 2000)
– Introduce a Filter Device Driver to Intercept File Access Event (DDK)
– Forward Event to User-space Analyzer
• CE.Net by Ajay
• Real World Problem
– Translation
– Kernel-Application Communication
Location Distribution
[Figure: share (0% to 100%) of System/Software Package Files versus Other Files for Traces 1 to 5 and the average.]
Program Access Distribution
[Figure: x-axis: Number of Programs (ordered by number of accesses), 1 to 10; y-axis: Percentage of File Access (%), 0 to 100; series: Traces 1 to 5 and the average.]
Filtering Mechanism
• Filter Types:
– Path Based Filter
  • /tmp, /etc, /usr/bin, /usr/lib, /dev, $HOME/.pine, …
  • /WinNT, /Program files
– Program Based Filter
  • find, daemon, service task, virus tool, ……
– Extension Name Based Filter
  • *.bak, *.tmp, *.old, *.swp, …
– File Type Based Filter
  • Pipe, device
– Time Based Filter
  • Backup, virus scan
– Derive Based Filter
  • *.c -> *.o, *.tex -> *.dvi, *.ps
– Meta-info Based Filter
  • Size, Date, Permission, Ownership ……
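The filter types above can be sketched as a chain of predicates applied to each file access event. This is a minimal illustration only: the `FileEvent` record, the rule tables, and the `is_filtered_out` function are hypothetical names, and the rule values are taken loosely from the examples on the slide rather than from the actual M-MEM implementation. Time based and meta-info based filters are left out for brevity.

```python
import fnmatch
import posixpath
from dataclasses import dataclass

# Hypothetical rule tables, loosely based on the slide's examples.
EXCLUDED_PATH_PREFIXES = ("/tmp", "/etc", "/usr/bin", "/usr/lib", "/dev")
EXCLUDED_PROGRAMS = {"find"}
EXCLUDED_PATTERNS = ("*.bak", "*.tmp", "*.old", "*.swp")
EXCLUDED_FILE_TYPES = {"pipe", "device"}
# Derived extension -> source extension (e.g. foo.o is derived from foo.c).
DERIVED_EXTENSIONS = {".o": ".c", ".dvi": ".tex"}


@dataclass
class FileEvent:
    path: str                    # absolute path of the accessed file
    program: str                 # name of the accessing program
    file_type: str = "regular"   # "regular", "pipe", "device", ...


def is_filtered_out(event, known_sources=frozenset()):
    """Return True if the event should be dropped before analysis."""
    path = posixpath.normpath(event.path)
    # Path based filter: anything at or below an excluded directory.
    if any(path == p or path.startswith(p + "/") for p in EXCLUDED_PATH_PREFIXES):
        return True
    # Program based filter: accesses made by scanning tools and the like.
    if event.program in EXCLUDED_PROGRAMS:
        return True
    # Extension name based filter.
    name = posixpath.basename(path)
    if any(fnmatch.fnmatchcase(name, pat) for pat in EXCLUDED_PATTERNS):
        return True
    # File type based filter.
    if event.file_type in EXCLUDED_FILE_TYPES:
        return True
    # Derive based filter: skip a file that can be rebuilt from a known source.
    stem, ext = posixpath.splitext(path)
    source_ext = DERIVED_EXTENSIONS.get(ext)
    if source_ext is not None and stem + source_ext in known_sources:
        return True
    return False
```

For example, `is_filtered_out(FileEvent("/home/u/a.o", "gcc"), {"/home/u/a.c"})` drops the object file because its source is already known, while the same event without a known source is kept.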
Simulation Methodology
[Figure: Trace Interpreter and M-MEM Utilities in User Space; MFS Extension and Operating System in Kernel Space; an X marks the disabled part.]
Effectiveness of Filters
[Figure: share (0% to 100%) of Non-Filtered versus Filtered Out for Traces 1 to 5 and the average. Left bar: Unique Files. Right bar: Access.]
Daily Working Set for Traces 1 to 5
[Figures: one chart per trace. x-axis: Day; y-axis: Number of Unique Files; series: w/o Filter, w/ Filter.]
Observations (1 of 2)
[Figure: Percentage of Unique Files per time bucket (<= 1 min, <= 10 min, <= 1 hour, <= 1 day, > 1 day) for Traces 1 to 5 and the average. Left bar: Non-filtered, Right bar: Filtered.]
Observations (2 of 2)
[Figure: Percentage of Unique Files per access frequency (Freq=1, Freq=2, Freq=3, Freq=4, Freq>=5) for Traces 1 to 5 and the average. Left bar: Non-filtered, Right bar: Filtered.]
Hybrid Priority-based Algorithm
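$$F(t, f, a) = \alpha_1 F_1(t) + \alpha_2 F_2(f) + \alpha_3 F_3(a)$$
• $F_1(t) = H_0 - \lambda \, (t - t_{\text{last}})$, where $t$ is the current time, $t_{\text{last}}$ the last access time, and $\lambda$ the aging parameter
• $F_2(f) = FA \cdot f$, where $f$ is the access frequency
• $F_3(a) = \min(H_0,\ AA \cdot a)$, where $a$ is the active period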
Algorithm Analysis
• Property 1: If $F(t, f_1, a_1) > F(t, f_2, a_2)$, then for any $\tau > t$, $F(\tau, f_1, a_1) > F(\tau, f_2, a_2)$.
Property 1 means if two files are ordered by the hybrid priority and no file is touched since ordering, then the order is always kept.
• Property 2: The file list ordered by the hybrid priority need not be updated between any two file access intervals.
• Property 3: If one file in the file list is accessed, the order after this file in the file list still holds.
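The priority function and Property 1 can be checked with a short sketch. All constants here (`H0`, `AGING`, `FA`, `AA`, and the three weights) are illustrative assumptions, and the `FileStats` record is a hypothetical container; none of these values come from the slides.

```python
from dataclasses import dataclass

# Illustrative constants only; the slides do not give concrete values.
H0 = 100.0                   # base value of F1 and cap of F3
AGING = 0.01                 # aging parameter: priority lost per second of idleness
FA = 2.0                     # weight per access in F2
AA = 0.5                     # weight per unit of active period in F3
W1, W2, W3 = 1.0, 1.0, 1.0   # the three mixing coefficients


@dataclass
class FileStats:
    last_access: float    # time of the last access, in seconds
    freq: int             # number of accesses so far
    active_period: float  # length of the active period


def hybrid_priority(now, s):
    """Weighted sum of the recency, frequency, and active-period terms."""
    f1 = H0 - AGING * (now - s.last_access)
    f2 = FA * s.freq
    f3 = min(H0, AA * s.active_period)
    return W1 * f1 + W2 * f2 + W3 * f3


a = FileStats(last_access=100.0, freq=3, active_period=40.0)
b = FileStats(last_access=500.0, freq=1, active_period=10.0)
gap_early = hybrid_priority(600.0, a) - hybrid_priority(600.0, b)
gap_late = hybrid_priority(10000.0, a) - hybrid_priority(10000.0, b)
```

Because the recency term decays linearly at the same rate for every file, the gap between two untouched files does not depend on the current time, so `gap_early` and `gap_late` agree and the relative order is preserved, which is what Property 1 states.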
Hybrid Priority-based Algorithm
1. Records history information, reset when newly inserted.
2. Dynamic just-in-time re-computing
3. Binary insertion, complexity O(log(n))
[Figure: example file list with slots 1 to 10, ordered from bigger HP at the header to smaller HP at the tail; it marks the header, the tail for an existing file, and the tail for a new file.]
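A minimal sketch of the binary insertion step, assuming the file list is kept ordered from highest to lowest hybrid priority. The function name `insert_by_priority`, the tuple layout of the entries, and the simplified `demo_priority` function are all hypothetical. Priorities are computed just in time, only for the entries that the search probes.

```python
def insert_by_priority(ordered, entry, now, priority):
    """Insert entry into `ordered` (highest priority first) by binary search.

    `priority(now, entry)` is evaluated only for the O(log n) probed entries.
    Returns the index at which the entry was inserted.
    """
    target = priority(now, entry)
    lo, hi = 0, len(ordered)
    while lo < hi:
        mid = (lo + hi) // 2
        if priority(now, ordered[mid]) >= target:
            lo = mid + 1   # entry belongs after position mid
        else:
            hi = mid       # entry belongs at or before position mid
    ordered.insert(lo, entry)
    return lo


def demo_priority(now, entry):
    """Simplified stand-in priority: recency plus frequency (assumed weights)."""
    _name, last_access, freq = entry
    return 100.0 - 0.01 * (now - last_access) + 2.0 * freq


file_list = []
for entry in [("a", 0.0, 5), ("b", 50.0, 1), ("c", 90.0, 3)]:
    insert_by_priority(file_list, entry, now=100.0, priority=demo_priority)
names = [name for name, _, _ in file_list]
```

The search takes O(log n) comparisons; with a Python list the insertion itself still shifts elements, so a linked or tree-based structure would be needed to keep the whole update logarithmic.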
Hit Ratio for Traces 1, 3, and 4
[Figures: one chart per trace. x-axis: Cache Size (number of files), 20 to 60; y-axis: Hit Ratio (%); series: OPT, HP, LRU.]
Data Management
• Distributed Management
– Naming: URI
– Version Control
– Timestamp
• Data Spectrum
• Role Change
• Automatic Selection, Publish, Importing, Consistency Maintenance
Data Representation in MDSS
[Figure: the Meta-Data Server (Apache Xindice) holds, for each Mobile User, a Mobile Profile listing URIs (URI1, URI2, …) and, for each URI, a Replica Descriptor with URI and Version; the Data Server holds the data items (Data 1, Data 2, …).]
Consistency Model
1. Update Detection
2. Update Propagation
3. Buffer Notification
4. Piggybacked Notification of Data Staleness
5. Refresh Request
6. Delta Data
[Figure: the numbered steps are exchanged between mobile hosts and the MDSS.]
Comm. Between M-MEM and F-MEM
[Figure: the same stack on both sides of the Network: XML Command, XML-based Protocol, Mobile Message Adaptor (producing an Optimized XML Command), and Message Transport Agent (HTTP/SMTP/POP3).]
• Message-based, Asynchronous, Durable, Reliable
• XML-based
• File granularity, incremental based
XML Communication Protocol
• File Management
– Publish
– Import
– Retire
– Delete
– Rename
• User Management
– PublicKeyReq
– Login
– Logout
– Profile
• Consistency Maintenance
– Update
– Refresh
A Publish Example
<?xml version="1.0"?>
<MobileDataMessage>
  <OPERATOR>PUBLISH</OPERATOR>
  <FileInfo>
    <URI> mfs://mymachine/mypath/myname.dat </URI>
    <MODIFYTIME> 12:34:56 01/02/2000 </MODIFYTIME>
    <SIZE> 12345 </SIZE>
    <ACCESS>
      <OWNER> somebody </OWNER>
      <GROUP> somegroup </GROUP>
      <ACCESSATTRIBUTE>rwxrwxrwx</ACCESSATTRIBUTE>
    </ACCESS>
    <VERSION> 1 </VERSION>
  </FileInfo>
  <AGENTINFO>
    <ID>M-MEM 1.0</ID>
    <HOSTOS> Linux </HOSTOS>
  </AGENTINFO>
</MobileDataMessage>
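A message with this layout can be produced and read back with a standard XML library. The sketch below uses Python's `xml.etree.ElementTree`; the helper names `build_publish_message` and `parse_message` are hypothetical, and the element names simply follow the publish example. It is an illustration of the message format, not the M-MEM code, which the summary says is written in C/C++ with Libxml.

```python
import xml.etree.ElementTree as ET


def build_publish_message(uri, modify_time, size, owner, group, mode,
                          version, agent_id, host_os):
    """Build a PUBLISH message using the element names of the example."""
    root = ET.Element("MobileDataMessage")
    ET.SubElement(root, "OPERATOR").text = "PUBLISH"
    info = ET.SubElement(root, "FileInfo")
    ET.SubElement(info, "URI").text = uri
    ET.SubElement(info, "MODIFYTIME").text = modify_time
    ET.SubElement(info, "SIZE").text = str(size)
    access = ET.SubElement(info, "ACCESS")
    ET.SubElement(access, "OWNER").text = owner
    ET.SubElement(access, "GROUP").text = group
    ET.SubElement(access, "ACCESSATTRIBUTE").text = mode
    ET.SubElement(info, "VERSION").text = str(version)
    agent = ET.SubElement(root, "AGENTINFO")
    ET.SubElement(agent, "ID").text = agent_id
    ET.SubElement(agent, "HOSTOS").text = host_os
    return ET.tostring(root, encoding="unicode")


def parse_message(xml_text):
    """Extract the fields a receiver would need to dispatch the request."""
    root = ET.fromstring(xml_text)
    info = root.find("FileInfo")
    return {
        "operator": root.findtext("OPERATOR").strip(),
        "uri": info.findtext("URI").strip(),
        "size": int(info.findtext("SIZE")),
        "version": int(info.findtext("VERSION")),
        "owner": info.findtext("ACCESS/OWNER").strip(),
        "host_os": root.findtext("AGENTINFO/HOSTOS").strip(),
    }


message = build_publish_message(
    "mfs://mymachine/mypath/myname.dat", "12:34:56 01/02/2000", 12345,
    "somebody", "somegroup", "rwxrwxrwx", 1, "M-MEM 1.0", "Linux")
fields = parse_message(message)
```

The parser strips surrounding whitespace, so it accepts both the compact output of `build_publish_message` and the padded values shown in the example.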
Mobile Network Adaptation
• Hoarding
• Filtering Mechanism
• Working Set Locality
• Log Optimization
• Incremental Update/Hoarding
Locality for Trace 3 and Trace 4
[Figures: one chart per trace. x-axis: Day; y-axis: Relative Daily Working Set Overlap vs. Previous Day (%), 0 to 100.]
Trembling Phenomenon
File Access Sequence: ABXYABUVAB
• Publish window size = 4: Publish A, Publish B, Publish X, Publish Y, Retire X, Publish U, Retire Y, Publish V
• Publish window size = 2: Publish A, Publish B, Retire A, Publish X, Retire B, Publish Y, Retire X, Publish A, Retire Y, Publish B, Retire A, Publish U, Retire B, Publish V, Retire U, Publish A, Retire V, Publish B
• Publish A, Publish B
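The two request logs for the access sequence ABXYABUVAB can be reproduced with a small replay. This sketch assumes the publish window evicts its least recently used file; that replacement rule is an assumption made for illustration (the system itself uses the hybrid priority), and `publish_log` is a hypothetical helper name.

```python
from collections import OrderedDict


def publish_log(accesses, window_size):
    """Replay accesses against an LRU publish window and return the request log."""
    window = OrderedDict()   # least recently used file first
    log = []
    for name in accesses:
        if name in window:
            window.move_to_end(name)   # already published: just refresh recency
            continue
        if len(window) >= window_size:
            victim, _ = window.popitem(last=False)
            log.append(f"Retire {victim}")
        window[name] = True
        log.append(f"Publish {name}")
    return log


small = publish_log("ABXYABUVAB", 2)   # 10 publishes and 8 retires
large = publish_log("ABXYABUVAB", 4)   # 6 publishes and 2 retires
```

With a window of 2 files, A and B are retired and republished every time another pair of files is touched, which is the trembling effect; a window of 4 keeps them published throughout.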
Trembling for Trace 3 and Trace 4
[Figures: one chart per trace. x-axis: Publish Window Size (number of files), 20 to 700; y-axis: Number of Publish Requests.]
Inter Access Interval
[Figure: x-axis: File Access Interval (Seconds), 0 to 180; y-axis: Accumulative Percentage of Inter-reference Interval (%), 0 to 100; one curve per trace (Trace 1 to Trace 5).]
Log Optimization for Trace 3 and Trace 4
[Figures: one chart per trace. x-axis: Publish Window Size (number of files), 20 to 100; y-axis: Number of Publish Requests; series: No Delay, Delay 3 min.]
Incremental Update/Hoarding Design Method
[Figure: on the Mobile Host, the Client FS and the M-MEM with its Versioned File Manager; across the Network, at the Server End, the F-MEM with its Versioned File Manager.]
1. Version control
2. Versioned File Archive
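The idea of shipping only the difference between two versions of a file can be illustrated with a line-based delta. The sketch below uses Python's `difflib.SequenceMatcher` as a stand-in differencing method; the delta format (copy ranges plus literal lines) and the helper names `make_delta` and `apply_delta` are assumptions for illustration, not the mechanism used by the versioned file managers.

```python
from difflib import SequenceMatcher


def make_delta(old_lines, new_lines):
    """Describe new_lines as copy ranges from old_lines plus literal new lines."""
    matcher = SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))             # receiver already has these
        elif tag in ("replace", "insert"):
            delta.append(("data", new_lines[j1:j2]))   # must be shipped
        # "delete" needs no entry: deleted lines are simply never copied
    return delta


def apply_delta(old_lines, delta):
    """Rebuild the new version from the old version and a delta."""
    out = []
    for op in delta:
        if op[0] == "copy":
            out.extend(old_lines[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


old = ["alpha\n", "beta\n", "gamma\n"]
new = ["alpha\n", "BETA\n", "gamma\n", "delta\n"]
delta = make_delta(old, new)
rebuilt = apply_delta(old, delta)
shipped_lines = sum(len(op[1]) for op in delta if op[0] == "data")
```

Only the changed and added lines travel over the network, while unchanged ranges are referenced by position; this requires both ends to hold the same base version, which is why version control and a versioned file archive appear on both sides of the design.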
Experiment Environment
[Figure: an M-MEM + NistNET host and an F-MEM + NistNET host connected through a D-Link Router.]
Workload Introduction
            RedHat 7.1 Distribution                      RedHat 7.2 Distribution
Workload    Version  Size (Bytes)  Number of files       Version  Size (Bytes)  Number of files
Apache      1.3.19   8.9M          752                   1.3.20   9.3M          773
Bash        2.04     8.2M          731                   2.05     8.5M          761
Glade       0.5.9    8.9M          549                   0.6.2    10.8M         598
Groff       1.16.1   7.7M          668                   1.17.2   7.9M          654
GNU Spell   32.6     6.3M          446                   33.7     7.4M          507
Network Traffic
[Figure: y-axis: Percentage of Network Traffic Relative to Value Shipping, 0 to 35; bars for Apache, Bash, Glade, Groff, Spell, and the average.]
Reintegration Time for Apache, Bash, Glade, Groff, and GNU Spell
[Figures: one chart per workload. x-axis: Bandwidth, 4KB to 10KB (1KB = 1024 Bytes); y-axis: ReIntegration Time (Seconds); series: Value Shipping, Incremental.]
Summary
• A New Architecture & Implementation
– Linux & Windows
– Xindice Native XML DB
– Libxml XML Parser
– C/C++/STL, ~20K Lines
• Filtering Mechanism & a New Hybrid Priority-based Algorithm
• XML-based Protocol
• Asynchronous Consistency Model
• Network Log Optimization
• Incremental-based Weak Connection Adaptation
• Simulation-based & Live Experiment Validation
Questions?