Offloading Multimedia Proxies using Network Processors
A presentation by Øyvind Hvamstad, 19. Nov. 2004
19. nov 2004 Offloading Multimedia Proxies using Network Processors
Domain overview
- Distributed media on demand (MoD)
- Standardized protocols (RTSP/RTP)
  - Stream setup with RTSP
  - Stream transport with RTP
- Utilizing proxies with network processing units.
[Figure: domain overview diagram; surviving labels: "OUR FOCUS", "IXP1200"]
Multimedia stream characteristics
- High data rates
  - Requires much bandwidth and storage space, depending on codec, quality and length.
- Soft real-time requirements
  - Perceived quality is sensitive to jitter.
- Access patterns
  - Zipf distribution (10% of the content is requested 90% of the time)
  - Newly published material tends to be popular.
  - Consumed from start to end or quickly aborted.
  - "Write-once-read-many"
Example: a one-hour MPEG-2 movie at an average bit rate of 3.5 Mbps takes up about 1.6 GB of storage. DivX can reduce this by a factor of 6.5, bringing the size down to about 246 MB.
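The storage figures in the example follow from bit rate times duration. A minimal sketch of that arithmetic, assuming decimal units (1 MB = 10^6 bytes); the function names are illustrative and not from the presentation:

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed to store a stream with the given average bit rate.
 * Both arguments are plain inputs; nothing here is codec-specific. */
uint64_t stream_size_bytes(uint64_t bits_per_second, uint64_t seconds)
{
    return bits_per_second * seconds / 8u;
}

/* Size after reducing by a factor given in tenths (65 means 6.5x). */
uint64_t reduced_size_bytes(uint64_t bytes, unsigned factor_tenths)
{
    assert(factor_tenths > 0);
    return bytes * 10u / factor_tenths;
}
```

With 3,500,000 bit/s and 3,600 s this gives 1,575,000,000 bytes, which rounds to the 1.6 GB quoted. Dividing the unrounded size by 6.5 gives roughly 242 MB; the 246 MB on the slide corresponds to dividing the rounded 1.6 GB.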
The MoD Proxy
- Deployed in client vicinity to:
  - Reduce client startup latency
  - Reduce server load
  - Reduce network load
- Must face the challenges of:
  - Many concurrent clients
  - Possible high aggregate network load
  - CPU-intensive tasks
MoD proxy tasks
- Caching
- Protocol translation
- Transcoding
- Re-encoding
- Encryption
- General functions
  - Access control
  - QoS mechanisms
  - Traffic engineering
- Forwarding data and requests (OUR FOCUS)
Exactly what?
- Offload application layer packet forwarding.
  - Free cycles on the host CPU for other tasks.
- Show improvements compared to a traditional architecture.
  - Reduced latency
- Provide a basis for future work in the area.
  - Extensions
    - Caching
    - Zero copy
Design
[Figure: proxy design, divided into a control plane and a data plane. Components: RTSP client, RTSP server, RTP server, RTP client, Cache control, Session Mgmt., Cacher. Labeled operations: insert(), remove(), lookup(), signal(), fetch(), write_through (async), fast_forward(). Flows: from server, to/from server, to client, to/from client.]
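The design figure labels the session management operations insert(), remove() and lookup(). A minimal sketch of such a table, assuming it is keyed by the UDP port on which the proxy receives an RTP stream and stores the client address to forward to; the struct fields, the table size and the probing scheme are all assumptions made for illustration, not the presenter's data structure:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TABLE_SIZE 64u            /* assumed capacity */

enum slot_state { SLOT_EMPTY = 0, SLOT_USED, SLOT_DELETED };

struct session {
    enum slot_state state;
    uint16_t proxy_port;          /* key: port the proxy receives RTP on */
    uint32_t client_ip;           /* where the stream is forwarded to */
    uint16_t client_port;
};

struct session_table {
    struct session slots[TABLE_SIZE];
};

void table_init(struct session_table *t)
{
    assert(t != NULL);
    memset(t, 0, sizeof *t);      /* every slot becomes SLOT_EMPTY */
}

/* RTP conventionally uses even ports, so drop the low bit before reducing. */
static unsigned first_slot(uint16_t port)
{
    return (unsigned)(port >> 1) % TABLE_SIZE;
}

struct session *table_lookup(struct session_table *t, uint16_t port)
{
    unsigned start = first_slot(port);
    for (unsigned i = 0; i < TABLE_SIZE; i++) {
        struct session *s = &t->slots[(start + i) % TABLE_SIZE];
        if (s->state == SLOT_EMPTY)
            return NULL;          /* end of the probe chain */
        if (s->state == SLOT_USED && s->proxy_port == port)
            return s;
    }
    return NULL;
}

int table_insert(struct session_table *t, uint16_t port,
                 uint32_t client_ip, uint16_t client_port)
{
    struct session *target = table_lookup(t, port);  /* update if present */
    unsigned start = first_slot(port);
    for (unsigned i = 0; target == NULL && i < TABLE_SIZE; i++) {
        struct session *s = &t->slots[(start + i) % TABLE_SIZE];
        if (s->state != SLOT_USED)
            target = s;           /* first empty or deleted slot */
    }
    if (target == NULL)
        return -1;                /* table is full */
    target->state = SLOT_USED;
    target->proxy_port = port;
    target->client_ip = client_ip;
    target->client_port = client_port;
    return 0;
}

int table_remove(struct session_table *t, uint16_t port)
{
    struct session *s = table_lookup(t, port);
    if (s == NULL)
        return -1;
    s->state = SLOT_DELETED;      /* tombstone keeps later entries reachable */
    return 0;
}
```

In this sketch the control plane would call table_insert() and table_remove() as RTSP sessions are set up and torn down, while the data plane only calls table_lookup() per packet.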
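Prototype implementation
[Figure: prototype on the IXP1200, split into a control plane (StrongARM, Linux run-time) and a data plane (microengines µe 0 and µe 1, IXA run-time). Components: RTSP Proxy, Ingress coreACE, Ingress microACE, Classifier/RTP-fwd coreACE, Classifier/RTP-fwd microACE, Egress coreACE, Egress microACE, Stack ACE. Some components are marked as Intel ACEs.]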
Experiments
- Measure the processing overhead during RTP forwarding.
- Cycle precision
  - Probes at different locations in the code.
  - Minimal probe overhead.
[Figure: test setup; surviving labels: Switch, Dell GX260, Darwin streaming server, rclient.py, Proxy (eth0, IXP1200). On the IXP1200, two probes are placed along the Ingress microACE, Process microACE, Egress microACE pipeline.]
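Cycle-precise probing amounts to reading a free-running cycle counter at two points and accumulating the difference. A minimal sketch, assuming a 32-bit counter whose value is passed in by the caller; how the counter is actually read on the IXP1200 is not shown in the slides, so it is left out, and all names here are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

struct probe {
    uint32_t start;     /* counter value captured by probe_begin */
    uint64_t total;     /* accumulated cycles over all samples */
    uint32_t samples;   /* number of begin/end pairs recorded */
};

void probe_begin(struct probe *p, uint32_t now)
{
    p->start = now;
}

void probe_end(struct probe *p, uint32_t now)
{
    /* Unsigned subtraction stays correct if the counter wrapped once. */
    p->total += (uint32_t)(now - p->start);
    p->samples++;
}

uint64_t probe_mean_cycles(const struct probe *p)
{
    assert(p->samples > 0);
    return p->total / p->samples;
}
```

Keeping each probe to a store on entry and a subtract-and-add on exit is one way to keep the probe overhead small relative to the code being measured.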
Results
- Offloading effect
  - 100% of all network traffic is processed by the StrongARM and the microengines.
- Prototype performance
  - Processes every RTP packet using about a tenth of the cycles compared to a traditional architecture. (Delay reduced from ~80 µs to ~8 µs @ 232 MHz)
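The delay and cycle figures are related through the clock rate. A small sketch of the conversion, assuming both delays are expressed against the 232 MHz clock quoted above (the function name is illustrative):

```c
#include <assert.h>
#include <stdint.h>

/* Cycles elapsed during a delay: microseconds times cycles per microsecond. */
uint64_t delay_to_cycles(uint64_t microseconds, uint64_t clock_mhz)
{
    assert(clock_mhz > 0);
    return microseconds * clock_mhz;
}
```

Under that assumption, ~80 µs corresponds to about 18,560 cycles and ~8 µs to about 1,856 cycles per RTP packet, which is the factor of ten stated in the results.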
Extensions
- Write-through caching
  - Make a multicaster by copying packets.
  - Use a lazy-copy strategy to reduce copy operations per packet.
  - Send payload copy to the host.
- Zero-copy path
  - Batched transfer to the host.
  - Scatter-gather DMA to assemble packet payloads in host memory.
  - Large disk requests.
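One possible reading of the lazy-copy strategy is copy-on-write: each outgoing packet gets its own header fields but shares one payload buffer, which is duplicated only if somebody needs to modify it. The sketch below illustrates that reading in portable C; the types, names and the 1500-byte bound are assumptions, and it does not model the IXP1200 buffer memory:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_PAYLOAD 1500u          /* assumed upper bound on payload size */

struct payload {
    unsigned refs;                 /* packets currently sharing this buffer */
    size_t len;
    unsigned char data[MAX_PAYLOAD];
};

struct packet {
    uint32_t dst_ip;               /* per-receiver header fields */
    uint16_t dst_port;
    struct payload *body;          /* shared until a writer needs its own */
};

/* Wrap fresh data in a packet; returns 0 on success, -1 on failure. */
int packet_create(struct packet *p, const void *data, size_t len,
                  uint32_t dst_ip, uint16_t dst_port)
{
    struct payload *b;
    if (len > MAX_PAYLOAD || (b = malloc(sizeof *b)) == NULL)
        return -1;
    b->refs = 1;
    b->len = len;
    memcpy(b->data, data, len);
    p->dst_ip = dst_ip;
    p->dst_port = dst_port;
    p->body = b;
    return 0;
}

/* "Copy" for another receiver: new header, same payload buffer. */
struct packet packet_share(const struct packet *src,
                           uint32_t dst_ip, uint16_t dst_port)
{
    struct packet p = { dst_ip, dst_port, src->body };
    src->body->refs++;
    return p;
}

/* Give the packet a private payload; copies only if it is shared. */
int packet_make_writable(struct packet *p)
{
    struct payload *b;
    if (p->body->refs == 1)
        return 0;
    if ((b = malloc(sizeof *b)) == NULL)
        return -1;
    b->refs = 1;
    b->len = p->body->len;
    memcpy(b->data, p->body->data, b->len);
    p->body->refs--;
    p->body = b;
    return 0;
}

void packet_release(struct packet *p)
{
    assert(p->body != NULL);
    if (--p->body->refs == 0)
        free(p->body);
    p->body = NULL;
}
```

In a multicaster built this way, forwarding to N receivers costs N header writes and no payload copies, unless a receiver-specific payload change forces packet_make_writable() to copy.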
Conclusion
- Prototype relevance
  - Data will always flow through the proxy.
  - The forwarder processes an RTP packet efficiently compared to a traditional architecture.
  - It is thus an orthogonal way to improve MoD proxies.
- Low resource utilization leaves room for extensions.
Other applications
- The idea might be more applicable in other, more real-time areas.
- Online games
  - Proxy holds game state.
  - Might also need to forward real-time data while playing.
    - Live voice communication
    - Other urgent game data
- Video conferences
  - Node that handles overlay multicasting.
  - Real-time data forwarded with low latency.
Linear modulo operator

Intel Assembler macro:

    #macro Mod[out_z, in_x, in_y]
    .local xm
        alu[xm, --, B, in_x]
    Loop#:
        alu[out_z, xm, -, in_y]
        br<0[End#]
        alu[xm, --, B, out_z]
        alu[--, xm, -, in_y]
        br>=0[Loop#]
    End#:
        alu[out_z, --, B, xm]
    .endlocal
    #endm

ANSI C function:

    int mod(int x, int y) {
        int z;
    top:
        z = x - y;
        if (z < 0)
            z = x;          /* x is already smaller than y */
        x = z;
        if ((x - y) >= 0)
            goto top;       /* at least y is left: subtract again */
        return z;
    }
Basic hash function

Intel Assembler macro:

    #macro Hash[out_z, in_x, in_seed]
    .local x start
        immed[start, START]
        alu[x, in_x, -, start]
        alu_shf[x, --, B, x, >>1]
        Mod[out_z, x, in_seed]
    .endlocal
    #endm

ANSI C function:

    int hash(int x, int seed) {
        x = x - START;      /* offset from the START constant */
        x = x / 2;
        return x % seed;
    }
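The Mod and Hash macros can be cross-checked on a host machine before being ported to the microengines. The sketch below restates them as plain C with the preconditions made explicit. The value of START is not given in the slides, so the one used here is a placeholder, and the subtraction loop assumes a non-negative x and a positive y:

```c
#include <assert.h>

#define START 6970   /* hypothetical value; the slides do not define START */

/* Remainder by repeated subtraction, mirroring the Mod macro.
 * The number of iterations grows linearly with x / y. */
int mod(int x, int y)
{
    assert(x >= 0 && y > 0);
    while (x - y >= 0)
        x = x - y;
    return x;
}

/* Offset from START, halve, then reduce modulo seed, mirroring Hash. */
int hash(int x, int seed)
{
    assert(x >= START);
    x = x - START;
    x = x >> 1;
    return mod(x, seed);
}
```

Comparing mod(x, y) against the built-in x % y over a range of inputs is a cheap way to confirm that the subtraction loop and the branch conditions were translated correctly.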
Incremental checksumming

Intel Assembler macro:

    #macro IncrementCksum[out_newsum, oldsum, old, new]
    .local sum tmp mask
        immed[mask, 0xffff]
        alu[sum, --, ~B, oldsum]
        alu[sum, sum, -, old]
        alu[sum, sum, +, new]
        alu[tmp, sum, AND, mask]
        alu_shf[sum, tmp, +, sum, >>16]
        alu[tmp, sum, AND, mask]
        alu_shf[sum, tmp, +, sum, >>16]
        alu[sum, --, ~B, sum]
        alu[out_newsum, sum, AND, mask]
    .endlocal
    #endm

ANSI C equivalent:

    sum = ~oldsum;
    sum = sum - old;
    sum = sum + new;
    tmp = sum & 0xffff;
    sum = tmp + (sum >> 16);    /* fold the carries into the low 16 bits */
    tmp = sum & 0xffff;
    sum = tmp + (sum >> 16);    /* fold again in case the first fold carried */
    sum = ~sum;
    newsum = sum & 0xffff;
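An incremental update can be validated by comparing it with a full recomputation over the modified data. The sketch below does this for 16-bit words. Note one deliberate difference from the macro above: it adds the one's complement of the old word (the form given as equation 3 in RFC 1624) instead of subtracting the old word, so that a 32-bit accumulator can neither underflow nor overflow. The function names are illustrative:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into 16 bits with end-around carry. */
uint16_t fold16(uint32_t sum)
{
    sum = (sum & 0xffffu) + (sum >> 16);
    sum = (sum & 0xffffu) + (sum >> 16);   /* the first fold may carry */
    return (uint16_t)sum;
}

/* Full one's complement checksum over n 16-bit words. */
uint16_t full_cksum(const uint16_t *words, size_t n)
{
    uint32_t sum = 0;
    assert(n <= 65536u);                   /* keeps the accumulator in range */
    for (size_t i = 0; i < n; i++)
        sum += words[i];
    return (uint16_t)~fold16(sum);
}

/* Checksum after one 16-bit word changes from old_word to new_word. */
uint16_t increment_cksum(uint16_t oldsum, uint16_t old_word,
                         uint16_t new_word)
{
    uint32_t sum = (uint16_t)~oldsum;      /* back to the uncomplemented sum */
    sum += (uint16_t)~old_word;            /* remove the old word */
    sum += new_word;                       /* add the new word */
    return (uint16_t)~fold16(sum);
}
```

For a forwarder that rewrites a few header fields per packet, the incremental form touches only the changed words, whereas the full checksum has to read every word again.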