Spring 2006CS 685 Network Algorithmics1 Principles in Practice CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 1

Principles in Practice

CS 685 Network AlgorithmicsSpring 2006


Scheduling for ATM VC's

scheduler

transmit

Background: • Asynchronous Transfer Mode:

virtual-circuit service• Individual circuits may be

flow-controlled for various reasons– bandwidth allocation– end-to-end flow control

• At each output port, each v.c. has its own queue of packets waiting for transmission

• A scheduler selects which packet to transmit next, according to per-v.c. state



scheduler

transmit

Background: • To be eligible for selection a

v.c. must have:– at least one "credit" (i.e. it

has not yet used up its allocation)

– at least one packet queued

• "Credits" may be accumulated by:– passage of time (for BW

allocation)– transmission from upstream

(for end-to-end flow control)



scheduler

transmit

Problem: • Too expensive to find next

ready queue via linear search, if there are 1000's of virtual circuits

• How can we avoid spending time looking at many queues that are not ready?

Hint: P12 "add state for speed"



scheduler

Solution (version 0): • Maintain a queue of ready

v.c.'s• When a flow becomes ready,

add it via tail pointer– Flow becomes ready when

• #pkts > 0, #credits 0 1• #credits > 0, #pkts 0 1

• When a flow is serviced (i.e. a packet is transmitted), remove it via head pointer– If still ready after xmission,

place it back in queue via tail

Head

Tail



scheduler

Problem: • Credits and packets arrive

asynchronously• How to ensure v.c. is not

added to the list multiple times?

• Ensure that events that may cause transition between "ready" states are "atomic"– But locking is expensive!

Head

Tail



scheduler

Solution 1:• Add a ready/not ready bit to

each v.c.'s state:– bit set means the v.c. is

already in the queue– clear bit before adding

to/removing from the queue

Head

Tail



scheduler

Problem:• Extend the scheme to allow

some v.c.'s to get "more opportunities to send", based on administratively-assigned "weights"– E.g. a v.c.'s with a weight of 2

would get twice as many opportunities as a v.c. with a weight of 1

Possible Solution:• Multiply credits by weight

when they arrive

Head

Tail


Passive Monitor Using Bridge Hardware

Background:• Ethernet bridge: box with connections to

multiple Ethernets• Keeps track of which interface a 48-bit MAC

address lives on• Forwards incoming packets out the right i/f for

the destination address• Contains special "CAM" hardware:

– Keeps track of up to 64000 (addr, i/f #) pairs– Given addr, returns i/f # in just 1.4 sec


Bridge Hardware Lookup

20:AB:0C:71:6F:A8 I/F 4

02:00:20:6A:F3:19 I/F 1

up to 64000entries

Address: 00:06:20:12:CD:BB

Address: 00:06:20:12:CD:BB I/F 6

00:06:20:12:CD:BB I/F 6



Problem:• Want to build a monitor to count the number of packets

sent between source, destination MAC address pairs• Keep a counter per (S,D) pair, where S, D are 48-bit

MAC addresses• Need to exploit the existing bridge hardware (P4c) to do

the lookup in less than 64 sec (minimum inter-packet time on 10Mbps Ethernet)

• There are about 1000 possible sources and 1000 destinations, but only around 20000 pairs active at any time



Naive Solution:• Map each 48-bit address to a smaller one (say 10 bits)

using one lookup each• Store 2-D array indexed by 10-bit quantities• Bad: array size 1000000Better:• Map each 48-bit address to a smaller one (say 24 bits)

using one lookup each• Concatenate 24-bit IDs to get a single 48-bit ID for the

connection• Retrieve counter using another hardware lookup with

that ID


Passive Monitor Solution

Src = 12:34:56:78:9A:BC

SrcID = 99:98:1B

Dest = 12:34:56:78:9A:BC

DestID = 01:22:FE

CounterID = 99:98:1B:01:22:FE

Counter = 322


Refinement I

• Set of active source-destination pairs changes over time. How to handle this without storing state for every pair that ever communicated since power-on?

• Solution:– Run a "sweeper" process in the background, which

does the following:• Mark every counter periodically

– Mark is cleared when counter is incremented• Once per period, dump marked counters to secondary

storage and reclaim them (make them available or use with new pairs)


Challenge

Can we solve the problem using only two lookups with bridge hardware, without requiring extra memory?


Tries With Node Compression

Background:• Trie: tree data structure

– Each node is an array of M = 2c elements (M < 32)– Each entry is either a key (value) or a pointer to

another trie node, or empty– Empty space is wasted

Key:

X

Key:

Q

Key:

X

5 out of 24 entries used apply P1



To search the trie:– Input string broken into c-bit "chunks"– First chunk used as index into root node's array– If the indexed entry is:

• null: search terminates (not found)• Key: search terminates (found, result = Key)• Pointer to another node: search continues there with next

chunk K

ey:

X

Key:

Q

Key:

X



Problem: how to compress the tree to remove wasted space without slowing search more than a small factor?

Can we replace the node array with a linked list?Not good: takes up to factor of M longer to search

Key:

X

Key:

Q

Key:

X



Idea:– Use a bitmap (P14) to indicate which entries are missing– Replace M-element array with array containing only non-

null entries– To search with c-bit index k (indices start at 1):

• Convert c-bit index k to smaller index– If bit k of bitmap is 0 0, else #of 1's in first k bits

Key:

X

Key:

Q

Key:

X

10

01

00

10

00

01

00

00

00

00

01

00

Original index: 4New index: 2



Efficiency:– Can count # 1's in bitmap using special hardware (or

s/w)– Requires two memory accesses:

• one for bitmap• one to read array entry

– Slowdown: factor of 2+

Key:

X

Key:

Q

Key:

X

10

01

00

10

00

01

00

00

00

00

01

00

Original index: 4New index: 2


Challenge I

Challenge: Can we use table lookup to speed up counting # of bits set in bitmap in software (P2a, P14)?

Solution (for M=8, c=3):– There are 256 possible bitmap values– Keep a table indexed by [bitmap value, position], containing

the (precomputed) number of 1 bits up to that position (3 bits per table entry)

– Total: 256 x 8 x 3 = 6144 bits probably doable in hardware

...

00000000 0 0 0 0 0 0 0 0

0 1 0 0 2 3 0 001001100... ...

...


Challenge II

Challenge: What if the bitmap is really huge? (e.g. c=16)– Table in previous solution would have 264K rows!

How can we speed up counting bits for such a large bitmap, with at most one additional memory access?

Solution: Apply P12 "add state for speed" and P12a "compute incrementally"– Logically divide bitmap into chunks (say 32 bits each)– For each node, keep an array B with one element per chunk

• For 64K bitmap in 32-bit chunks: B has 2K elements of 16 bits each = 1K 32-bit words (still much smaller than 64K entries!)

• B[j] contains # of 1 bits in positions 1 through 32(j-1)• To count 1's up through position k: B[k] + #1's in chunk k

– Count 1's in chunk k individually


Packet Classification

Background:• TCP/IP "flows" identified by some set of fields in

packet headers, including:– Source, destination IP address– Source, destination port number– Protocol (e.g. TCP or UDP)– Higher-level (application) information

• Packet "filter" = specification of fields that define a set of packets "of interest" to some receiver– Example (simple):

Source IP = X, Dest IP = *Protocol = UDPSource port = *, Dest port = Z



Background:• Consider a "router" that serves a group of

receivers– Receivers specify packets they want to receive by

giving filters– Router takes each incoming packet, matches it

against all filters, and forwards to each receiver that has some matching filter

• Example:– To receive NBC's video transmission from Winter

Olympics: specify source address of NBC host in Turin, UDP protocol, and a specific destination port



Problem:• How to compare each packet's headers against

many (possibly thousands) of filter specifications?Observation:• Caching (P11a) is not a very good solution

– Caching is good for storing pairs of the form (x, (x)) in order to avoid re-computing (or looking up) (x)

• Store is keyed by x, which is typically small (48 bits or less)– Here the key could be

How could we solve this if header contained a "flow identifier" field F in the network layer header?

Note: IPv6 has such a field



Problem: How to assign/use "flow ID" field?Solution:• Let the sender assign it!

– E.g. keep a local counter, increment for each flow

• Keep (in addition to the regular list of filters) a "fast" mapping from (source address, flow ID) to sets of receivers

• When a packet arrives, first search the "fast" list– If present, dispatch to indicated set of receivers– Otherwise search the "slow" list to determine set G of

receivers– Then add (source address, flow ID, G) to fast list


Classification: Tricky Bits

• If a source crashes and restarts flows with different flow IDs, chaos could result– Solution:

• Time out "Fast list" entries periodically• Ensure rebooting host does not re-use old entries (e.g.

start with last used flowID + 1)• Each "slow list" entry has a pointer to its "fast list" entry• When a packet fails to match any "fast list" entry, but

does match a "slow list" entry, existing fast list filter should be invalidated

• When a receiver adds a new filter, "fast list" entries may need to be updated– How to do this?

Documents

Spring 2006CS 685 Network Algorithmics1 Principles in Practice CS 685 Network Algorithmics Spring 2006