
Offloading Multimedia Proxies using Network Processors A presentation by Øyvind Hvamstad 19. Nov. 2004


19. nov 2004 Offloading Multimedia Proxies using Network Processors


Domain overview
Distributed media on demand (MoD)
Standardized protocols (RTSP/RTP)
    Stream setup with RTSP
    Stream transport with RTP
Utilizing proxies with network processing units (IXP1200).
OUR FOCUS


Multimedia stream characteristics
High data-rates
    Requires much bandwidth and storage space.
    Depending on codec, quality and length.
Soft real-time requirements
    Perceived quality is sensitive to jitter.
Access patterns
    Zipf distribution (10% of the content is requested 90% of the time)
    Newly published material tends to be popular.
    Consumed from start to end or quickly aborted.
    "write-once-read-many"

Example: A one hour long MPEG-2 movie at an average bit-rate of 3.5 Mbps takes up 1.6 GB of storage. DivX can reduce this by a factor of 6.5, thus reducing the size to 246 MB.
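
The storage figure in the MPEG-2 example can be checked with a few lines of C. This is only a sketch of the arithmetic: the function name stream_size_mb is made up for illustration, and decimal units (1 MB = 8 megabits) are assumed.

```c
/* Storage in megabytes for a stream with the given average bit-rate
 * (megabits per second) and duration (seconds).
 * Assumption: decimal units, so 1 MB corresponds to 8 megabits. */
double stream_size_mb(double mbps, double seconds)
{
    return mbps * seconds / 8.0;
}
```

Calling stream_size_mb(3.5, 3600.0) gives 1575 MB, which rounds to the 1.6 GB on the slide. Dividing 1575 MB by 6.5 gives roughly 242 MB; the 246 MB on the slide follows from dividing the rounded 1.6 GB instead.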


The MoD Proxy
Deployed in client vicinity to:
    Reduce client startup latency
    Reduce server load
    Reduce network load
Must face the challenges of:
    Many concurrent clients
    Possible high aggregate network load
    CPU intensive tasks


MoD proxy tasks
Caching
Protocol translation
Transcoding
Re-encoding
Encryption
General functions
    Access control
    QoS mechanisms
    Traffic engineering
OUR FOCUS: Forwarding data and requests


Exactly what?
Offload application layer packet forwarding.
    Free cycles on the host CPU for other tasks.
Show improvements compared to a traditional architecture.
    Reduced latency
Provide a basis for future work in the area.
    Extensions
        Caching
        Zero copy


Design
[Figure: proxy design, split into a control plane and a data plane. Components: RTSP client, RTSP server, RTP server, RTP client, Cache control, Session Mgmt., Cacher. Labeled operations: write_through (async), fetch(), fast_forward(), insert(), remove(), lookup(), signal(). Flows: from server, to/from server, to client, to/from client.]
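
The operation names in the design figure (insert(), remove(), lookup(), fast_forward()) suggest a session table that the control plane fills in and the data plane consults for each packet. The following is a minimal sketch of that idea, not the prototype's code: the struct layout, the table size, keying on a destination port, and the linear scan are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_SESSIONS 64  /* hypothetical table size */

/* One forwarding entry: packets arriving for in_port go out to out_port. */
struct session {
    uint16_t in_port;
    uint16_t out_port;
    int      in_use;
};

struct session_table {
    struct session slot[MAX_SESSIONS];
};

/* Data plane: find the entry for a port, or NULL if there is none. */
const struct session *session_lookup(const struct session_table *t,
                                     uint16_t in_port)
{
    for (size_t i = 0; i < MAX_SESSIONS; i++)
        if (t->slot[i].in_use && t->slot[i].in_port == in_port)
            return &t->slot[i];
    return NULL;
}

/* Control plane: add an entry once stream setup has completed.
 * Returns 0 on success, -1 if the table is full. */
int session_insert(struct session_table *t, uint16_t in_port,
                   uint16_t out_port)
{
    for (size_t i = 0; i < MAX_SESSIONS; i++) {
        if (!t->slot[i].in_use) {
            t->slot[i] = (struct session){ in_port, out_port, 1 };
            return 0;
        }
    }
    return -1;
}

/* Control plane: drop an entry at teardown.
 * Returns 0 on success, -1 if no such entry exists. */
int session_remove(struct session_table *t, uint16_t in_port)
{
    for (size_t i = 0; i < MAX_SESSIONS; i++) {
        if (t->slot[i].in_use && t->slot[i].in_port == in_port) {
            t->slot[i].in_use = 0;
            return 0;
        }
    }
    return -1;
}

/* Data plane: return the port to forward to, or 0 when the packet has no
 * session and must be handed to the slow path instead. */
uint16_t fast_forward(const struct session_table *t, uint16_t dst_port)
{
    const struct session *s = session_lookup(t, dst_port);
    return s ? s->out_port : 0;
}
```

The point of the split is that only fast_forward() and session_lookup() sit on the per-packet path; insert and remove run once per session.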


Prototype implementation
[Figure: prototype built from Intel ACEs. The RTSP proxy runs on the Linux run-time and the IXA run-time. Control plane / data plane split across the StrongARM and the microengines (µe 0, µe 1). ACEs shown: Ingress coreACE, Ingress microACE, Classifier/RTP-fwd microACE, Egress microACE, Classifier/RTP-fwd coreACE, Egress coreACE, Stack ACE.]


Experiments
Measure the processing overhead during RTP-forwarding.
    Cycle precision
    Probes at different locations in the code.
    Minimal probe overhead.
[Figure: test setup. Switch, Dell GX260, Darwin streaming server, rclient.py, Proxy (eth0), IXP1200 with Ingress microACE, Process microACE and Egress microACE, with probes placed around the processing stage.]


Results
Offloading effect
    100% of all network traffic processed by the StrongARM and the microengines.
Prototype performance
    Processes every RTP packet using about a tenth of the cycles compared to a traditional architecture. (Delay reduced from ~80 µs to ~8 µs at 232 MHz)
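
To relate the quoted delays to cycle counts, a small sketch in C follows. It assumes that both delays are expressed against the same 232 MHz clock, which the slide does not state explicitly; the function name cycles_for is made up for illustration.

```c
#include <stdint.h>

/* Clock cycles that elapse in `micros` microseconds at `mhz` MHz.
 * One MHz is one cycle per microsecond, so the product is the count. */
uint64_t cycles_for(uint64_t micros, uint64_t mhz)
{
    return micros * mhz;
}
```

Under that assumption, cycles_for(80, 232) is 18560 cycles and cycles_for(8, 232) is 1856 cycles, which matches the "about a tenth" on the slide. Both delays are approximate, so the cycle counts are too.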


Extensions
Write-through caching
    Make a multicaster by copying packets
    Use a lazy-copy strategy to reduce copy operations per packet.
    Send payload copy to the host.
Zero-copy-path
    Batched transfer to the host.
    Scatter-gather DMA to assemble packet payloads in host memory.
    Large disk-requests.
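
One way to read the lazy-copy idea is as reference-counted payload sharing: duplicating a packet copies only the small per-destination header, and the payload is copied only if a holder needs to change it. The sketch below is plain host-side C written under that assumption; it is not microengine code, and every type and function name in it is hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Payload shared between all copies of a packet. */
struct payload {
    unsigned refs;   /* number of packets pointing at this payload */
    size_t   len;
    uint8_t  data[]; /* payload bytes */
};

/* Per-destination part: small and cheap to copy. */
struct packet {
    uint32_t dst_addr;
    uint16_t dst_port;
    struct payload *body;
};

/* Allocate a payload holding a copy of src. Returns NULL on failure. */
struct payload *payload_new(const void *src, size_t len)
{
    struct payload *p = malloc(sizeof *p + len);
    if (p == NULL)
        return NULL;
    p->refs = 1;
    p->len = len;
    memcpy(p->data, src, len);
    return p;
}

/* Lazy copy: duplicate the header, share the payload. */
struct packet packet_clone(const struct packet *orig, uint32_t dst_addr,
                           uint16_t dst_port)
{
    struct packet copy = *orig;
    copy.dst_addr = dst_addr;
    copy.dst_port = dst_port;
    copy.body->refs++;
    return copy;
}

/* Pay for a real copy only when this packet must modify its payload.
 * Returns 0 on success, -1 if allocation fails. */
int packet_make_writable(struct packet *pkt)
{
    struct payload *shared = pkt->body;
    if (shared->refs == 1)
        return 0; /* sole owner already */
    struct payload *own = payload_new(shared->data, shared->len);
    if (own == NULL)
        return -1;
    shared->refs--;
    pkt->body = own;
    return 0;
}

/* Drop this packet's reference; free the payload with the last one. */
void packet_release(struct packet *pkt)
{
    if (pkt->body != NULL && --pkt->body->refs == 0)
        free(pkt->body);
    pkt->body = NULL;
}
```

With this layout, sending one stream to N clients costs N header copies and no payload copies, as long as no receiver needs a modified payload.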


Conclusion
Prototype relevance
    Data will always flow through the proxy
    The forwarder processes an RTP packet efficiently compared to a traditional architecture.
    Is thus an orthogonal way to improve MoD proxies.
Low resource utilization leaves room for extensions.


Other applications
The idea might be more applicable in other, more real-time areas.
Online games
    Proxy holds game state
    Might also need to forward real-time data while playing.
        Live voice communication
        Other urgent game data
Video conferences
    Node that handles overlay multicasting.
    Real-time data forwarded with low latency.


Linear modulo operator

Intel Assembler macro:

    #macro Mod[out_z, in_x, in_y]
    .local xm
        alu[xm, --, B, in_x]
    Loop#:
        alu[out_z, xm, -, in_y]
        br<0[End#]
        alu[xm, --, B, out_z]
        alu[--, xm, -, in_y]
        br>=0[Loop#]
    End#:
        alu[out_z, --, B, xm]
    .endlocal
    #endm

ANSI C function:

    /* Remainder by repeated subtraction; assumes x >= 0 and y > 0. */
    int mod(int x, int y)
    {
        int z;
    top:
        z = x - y;          /* trial subtraction */
        if (z < 0)
            return x;       /* x is already smaller than y */
        x = z;
        if ((x - y) >= 0)
            goto top;
        return x;
    }


Basic hash function

Intel Assembler macro:

    #macro Hash[out_z, in_x, in_seed]
    .local x start
        immed[start, START]
        alu[x, in_x, -, start]
        alu_shf[x, --, B, x, >>1]
        Mod[out_z, x, in_seed]
    .endlocal
    #endm

ANSI C function:

    /* START is a constant defined elsewhere. */
    int hash(int x, int seed)
    {
        x = x - START;
        x = x / 2;
        return x % seed;
    }
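
The slide does not say what the hash input is. One plausible reading, and it is only an assumption, is that x is a UDP port number: RTP conventionally uses even ports with RTCP on the next odd port, so subtracting a base and halving would map consecutive RTP ports to consecutive table indices. The sketch below illustrates that reading; the value 5000 for START and the bucket count used in the usage note are hypothetical.

```c
#define START 5000  /* hypothetical base of the port range */

/* Remainder by repeated subtraction, in the spirit of the Mod macro;
 * assumes x >= 0 and y > 0. */
int mod(int x, int y)
{
    while (x - y >= 0)
        x -= y;
    return x;
}

/* Map a value at or above START to a bucket index in [0, seed). */
int hash(int x, int seed)
{
    x = x - START;
    x = x / 2;      /* an even/odd pair shares one index */
    return mod(x, seed);
}
```

With 8 buckets, ports 5000, 5002 and 5004 map to indices 0, 1 and 2, port 5001 shares index 0 with port 5000, and port 5016 wraps around to index 0.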


Incremental checksumming

Intel Assembler macro:

    #macro IncrementCksum[out_newsum, oldsum, old, new]
    .local sum tmp mask
        immed[mask, 0xffff]
        alu[sum, --, ~B, oldsum]
        alu[sum, sum, -, old]
        alu[sum, sum, +, new]
        alu[tmp, sum, AND, mask]
        alu_shf[sum, tmp, +, sum, >>16]
        alu[tmp, sum, AND, mask]
        alu_shf[sum, tmp, +, sum, >>16]
        alu[sum, --, ~B, sum]
        alu[out_newsum, sum, AND, mask]
    .endlocal
    #endm

C equivalent:

    sum = ~oldsum;
    sum = sum - old;
    sum = sum + new;
    tmp = sum & 0xffff;
    sum = tmp + (sum >> 16);    /* fold carries, first pass */
    tmp = sum & 0xffff;
    sum = tmp + (sum >> 16);    /* fold carries, second pass */
    sum = ~sum;
    newsum = sum & 0xffff;
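
For comparison, here is a sketch of incremental checksum update in the form given by RFC 1624 (equation 3), HC' = ~(~HC + ~m + m'), together with a full recomputation that can be used to check it. This is a swapped-in formulation, not a transcription of the macro: it adds the complement of the old word instead of subtracting the old word. The function names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into 16 bits with end-around carry. */
static uint16_t fold16(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Full Internet checksum over n 16-bit words (reference for testing). */
uint16_t cksum_full(const uint16_t *words, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += words[i];
    return (uint16_t)~fold16(sum);
}

/* RFC 1624, eqn. 3: update a checksum after one 16-bit word changes
 * from old_word to new_word. */
uint16_t cksum_update(uint16_t oldsum, uint16_t old_word, uint16_t new_word)
{
    uint32_t sum = (uint16_t)~oldsum;
    sum += (uint16_t)~old_word;
    sum += new_word;
    return (uint16_t)~fold16(sum);
}
```

A typical use in a forwarder is rewriting one header field, such as a port or a TTL, and patching the checksum with cksum_update() instead of summing the whole header again.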


Just can't get enough, huh?
/hom/~oyvindh/thesis.pdf
CVS repository
    :pserver:hic.no/cvs
    Module: thesis
    User: anonymous
    Pwd: <empty>
    Should be up in a few days
    Have just moved to a new location