John DeHart
http://www.arl.wustl.edu/projects/techX
ONL NP Router Block Design Review:
Lookup (Part of the PLC Block)
2 - John DeHart - 04/19/23
Revision History
4/12/07 (JDD): Started
ONL NP Router
[Figure: ONL NP Router block diagram. Blocks: Rx (2 ME), Mux (1 ME), Parse, Lookup, Copy (3 MEs), QM (1 ME), HdrFmt (1 ME), Tx (1 ME), Stats (1 ME), FreeList Mgr (1 ME), Plugins 0-4, xScale, TCAM with Assoc. Data ZBT-SRAM, and SRAM. Blocks are connected by NN rings, SRAM rings, and scratch rings (64KW each). Legend: New; Needs A Lot Of Mod.; Needs Some Mod.; Mostly Unchanged.]
Contents
  Overview
  Design
  Latency Analysis
  Code Locations
  Test Procedures
  Implementation Status
Overview
Initialization
» Control Plane initializes TCAM and Route and Filter DBs
Runtime Updates
» Control Plane updates to Route and Filter DBs
Design: in upcoming slides
Processing: in upcoming slides
Lookup will be written in C
» There are many things about writing IXP code in "C" that I need to learn. Here are some of them:
    Performing multiple memory operations in parallel and waiting on a set of signals (if needed for performance reasons)
    Performing timestamp waits
    Calling IDT microcode macros
Lookup: Design -- Databases
Three Databases:
» Route Lookup:
    Unicast: sorted by DAddr prefix length
    Multicast: exact match on DAddr and prefix of SAddr
» Primary Filter
    Filters should be sorted in the DB with higher priority filters first
» Auxiliary Filter
    Filters should be sorted in the DB with higher priority filters first
Priority between Primary Filter and Route Lookup
» A priority will be stored with each Primary Filter
» A priority will be assigned to RLs (all routes have the same priority)
» PF priority and RL priority are compared after the result is retrieved. One of them will be selected based on this priority comparison.
Auxiliary Filters:
» If matched, cause a copy of the packet to be sent out according to the Aux Filter's result.
Lookup: Design -- Results
Use SRAM Bank 0 (2 MB per NPU) for Results
» B0 Byte Address Range: 0x000000 – 0x1FFFFF (21 bits)
» B0 Word Address Range: 0x000000 – 0x1FFFFC (19 significant bits, 2 trailing 0's)
Store result in two parts:
» 32-bit Associated Data SRAM result, giving the address of the actual Result:
    TCAM Control Bits (3b)
        Done: 1b
        Hit: 1b
        MHit: 1b
    Priority: 8b
        Present for Primary Filters; for RL and Aux Filters should be 0
    SRAM B0 Word Address: 21b
        2 spare bits if needed for anything else
» 3 Words (<= 96 bits) of Result in SRAM Bank0
Use Multi-Database Lookup (MDL) Indirect for searching all 3 DBs
» Order of fields in Key is important.
Each thread will need one TCAM context
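The 32-bit Associated Data word described above could be packed and unpacked roughly as follows. This is only a sketch: the slide gives the field widths (3 control bits, 8-bit priority, 21-bit address) but not the bit positions, so the layout used here (D, H, MH in bits 31..29, priority in bits 28..21, address in bits 20..0) and the names `ad_result_t`, `ad_pack`, and `ad_unpack` are assumptions made for illustration.

```c
#include <stdint.h>

/* ASSUMED layout (bit positions are not stated on the slide):
 *   bit 31: Done, bit 30: Hit, bit 29: Multi-Hit,
 *   bits 28..21: Priority, bits 20..0: SRAM Bank 0 address.
 * The address is word aligned, so its low 2 bits are the "spare" bits. */
#define AD_ADDR_MASK 0x1FFFFCu

typedef struct {
    unsigned done;      /* TCAM Done bit */
    unsigned hit;       /* TCAM Hit bit */
    unsigned mhit;      /* TCAM Multi-Hit bit */
    uint8_t  priority;  /* meaningful for Primary Filters only */
    uint32_t sram_addr; /* byte address of the 3-word result in Bank 0 */
} ad_result_t;

/* Split a 32-bit Associated Data word into its fields. */
static ad_result_t ad_unpack(uint32_t w)
{
    ad_result_t r;
    r.done      = (w >> 31) & 1u;
    r.hit       = (w >> 30) & 1u;
    r.mhit      = (w >> 29) & 1u;
    r.priority  = (uint8_t)((w >> 21) & 0xFFu);
    r.sram_addr = w & AD_ADDR_MASK;   /* drop the 2 spare low bits */
    return r;
}

/* Build a 32-bit Associated Data word (what control software would store). */
static uint32_t ad_pack(unsigned done, unsigned hit, unsigned mhit,
                        uint8_t prio, uint32_t addr)
{
    return ((uint32_t)(done & 1u) << 31) |
           ((uint32_t)(hit  & 1u) << 30) |
           ((uint32_t)(mhit & 1u) << 29) |
           ((uint32_t)prio << 21) |
           (addr & AD_ADDR_MASK);
}
```

Masking the address with `0x1FFFFC` keeps the two trailing zero bits free, matching the note that they are spare.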
Lookup Processing
write KEY to TCAM
use timestamp delay to wait appropriate time
    make delay long enough that we are as sure as possible that we will
    have to read the 1st word of the Results MB only once
while !DoneBit    // DONE Bit BUG Fix requires reading just first word
    read 1 word from Results Mailbox and check DoneBit
done
read words 2 and 3 from Results Mailbox
if (PrimaryFilter AND RouteLookup results HIT) {
    PrimaryResult.Valid = TRUE
    compare priorities
    store higher priority result as PrimaryResult (read result from SRAM Bank0)
} else if (PrimaryFilter results HIT) {
    PrimaryResult.Valid = TRUE
    PrimaryResult.* = PrimaryFilter.* (read result from SRAM Bank0)
} else if (RouteLookup results HIT) {
    PrimaryResult.Valid = TRUE
    PrimaryResult.* = RouteLookup.* (read result from SRAM Bank0)
} else
    PrimaryResult.Valid = FALSE
if (AuxiliaryFilter result HIT) {
    AuxiliaryResult.Valid = TRUE
    AuxiliaryResult.* = AuxiliaryFilter.* (read result from SRAM Bank0)
} else
    AuxiliaryResult.Valid = FALSE
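The selection step of the pseudo-code above might look like this in C. It is a host-side sketch, not IXP microengine code: the types `db_result_t` and `selected_t` and the function names are hypothetical, the RL priority is passed in as a single value because all routes share one priority, and two points the slides leave open are assumed here: a numerically smaller value means higher priority, and a tie goes to the Primary Filter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Per-database outcome taken from the TCAM Associated Data word. */
typedef struct {
    bool     hit;        /* Hit bit for this database */
    uint8_t  priority;   /* only meaningful for the Primary Filter */
    uint32_t sram_addr;  /* where the full result lives in SRAM Bank 0 */
} db_result_t;

typedef struct {
    bool     valid;      /* PrimaryResult.Valid / AuxiliaryResult.Valid */
    uint32_t sram_addr;  /* address to read the chosen result from */
} selected_t;

/* Choose between Primary Filter and Route Lookup results.
 * ASSUMPTIONS: smaller number = higher priority; ties go to the PF. */
static selected_t select_primary(db_result_t pf, db_result_t rl,
                                 uint8_t rl_priority)
{
    selected_t out = { false, 0 };
    if (pf.hit && rl.hit) {
        out.valid = true;
        out.sram_addr = (pf.priority <= rl_priority) ? pf.sram_addr
                                                     : rl.sram_addr;
    } else if (pf.hit) {
        out.valid = true;
        out.sram_addr = pf.sram_addr;
    } else if (rl.hit) {
        out.valid = true;
        out.sram_addr = rl.sram_addr;
    }
    return out;
}

/* The Auxiliary Filter is independent: a hit always yields a copy. */
static selected_t select_aux(db_result_t af)
{
    selected_t out = { af.hit, af.hit ? af.sram_addr : 0 };
    return out;
}
```

Returning only the SRAM address defers the 3-word read until the winner is known, so only one of the PF and RL results has to be fetched.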
Lookup Key and Results Formats
[Figure: key and result field layouts, listed here as fields with widths. The bit order within each word is not recoverable from the extracted text.]

140 Bit Key:
    P (3b), PTag (5b), IP DAddr (32b), IP SAddr (32b), Proto (8b), DPort (16b), SPort (16b), TCPFlags (12b), Exceptions (16b)
    Brackets in the figure mark the portion used by RL and the portion used by PF and AF.

32 Bit Result in TCAM Assoc. Data SRAM:
    PF: Prio (8b), D (1b), H (1b), MH (1b), Address (21b)
    AF: Res (8b), D (1b), H (1b), MH (1b), Address (21b)
    RL: Res (8b), D (1b), H (1b), MH (1b), Address (21b)
    TCAM Ctrl Bits: D: Done, H: HIT, MH: Multi-Hit

96 Bit Result in QDR SRAM Bank0:
    PF: V (4b), UCastMCast (12b), QID (16b), Stats Index (16b), then NH_MAC (48b) or NH_IP (32b) + Res (16b)
    AF: V (4b), SB (2b), Res (2b), UniCast (8b), QID (16b), Stats Index (16b), then NH_MAC (48b) or NH_IP (32b) + Res (16b)
    RL: V (4b), UCastMCast (12b), QID (16b), Stats Index (16b), then NH_MAC (48b) or NH_IP (32b) + Res (16b)
    V (4b): EntryValid (1b), NHIP Valid (1b), NHMAC Valid (1b), IPMC Valid (1b)
    UCastMCast (12b):
        If IP MC Valid = 1: Multicast Copy Vector (11b), PPS (1b)
        If IP MC Valid = 0: D (1b), PPS (1b), UCast Out Port (3b), UCast Out Plugin (3b), Reserved (4b)
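To make the two interpretations of the 12-bit UCastMCast field concrete, here is a small decoding sketch. The field widths come from the slide; the bit positions do not, so this assumes the fields are packed most-significant-first in the order listed (unicast: D, PPS, Out Port, Out Plugin, Reserved; multicast: Copy Vector, PPS). The type `fwd_fields_t` and the function `decode_fwd` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool     is_mcast;     /* copy of the IP MC Valid bit */
    bool     drop;         /* unicast Drop bit, or all-zero copy vector */
    bool     pps;          /* Plugin/Port Selection bit */
    uint8_t  out_port;     /* unicast only, 3 bits */
    uint8_t  out_plugin;   /* unicast only, 3 bits */
    uint16_t copy_vector;  /* multicast only, 11 bits */
} fwd_fields_t;

/* Decode the 12-bit field. ASSUMED packing, MSB first:
 *   unicast:   D[11] PPS[10] OutPort[9:7] OutPlugin[6:4] Reserved[3:0]
 *   multicast: CopyVector[11:1] PPS[0] */
static fwd_fields_t decode_fwd(uint16_t f12, bool ip_mc_valid)
{
    fwd_fields_t r = {0};
    f12 &= 0x0FFFu;                 /* keep only the 12 field bits */
    r.is_mcast = ip_mc_valid;
    if (ip_mc_valid) {
        r.copy_vector = (uint16_t)(f12 >> 1);
        r.pps  = (f12 & 1u) != 0;
        r.drop = (r.copy_vector == 0);  /* all-zero vector drops the packet */
    } else {
        r.drop       = ((f12 >> 11) & 1u) != 0;
        r.pps        = ((f12 >> 10) & 1u) != 0;
        r.out_port   = (uint8_t)((f12 >> 7) & 7u);
        r.out_plugin = (uint8_t)((f12 >> 4) & 7u);
    }
    return r;
}
```

For example, under this assumed packing the unicast value `0x530` decodes to PPS = 1, Out Port = 2, Out Plugin = 3, with the Drop bit clear.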
Exception Bits in Lookup Key
140 Bit Key:
    P (3b), PTag (5b), IP DAddr (32b), IP SAddr (32b), Proto (8b), DPort (16b), SPort (16b), TCPFlags (12b), Exceptions (16b)
    Brackets in the figure mark the portion used by RL and the portion used by PF and AF.
Exceptions (16b): Non-IP (1b), ARP (1b), IP Opt (1b), TTL (1b), Reserved (12b)
Exception Bits:
» TTL: TTL has expired. It was 0 or 1 on the arriving packet
» IP Opt: IP packet contained Options
» ARP: Ethertype field in Ethernet header was ARP
» Non-IP: Ethertype field in Ethernet header was NOT IP
NOTE: An ARP packet will have the ARP bit and the Non-IP bit set
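The exception bit definitions above can be sketched as a small function over already-parsed header fields. The Ethertype values (0x0800 for IPv4, 0x0806 for ARP) and the rule that an IPv4 header length above 5 words means options are present are standard; the bit positions inside the 16-bit Exceptions field are an assumption (Non-IP in bit 15, then ARP, IP Opt, TTL, following the order in the figure), as is the function name `exception_bits`.

```c
#include <stdint.h>

#define ETHERTYPE_IP   0x0800u
#define ETHERTYPE_ARP  0x0806u

/* ASSUMED bit positions within the 16-bit Exceptions field. */
#define EXC_NON_IP  ((uint16_t)0x8000u)
#define EXC_ARP     ((uint16_t)0x4000u)
#define EXC_IP_OPT  ((uint16_t)0x2000u)
#define EXC_TTL     ((uint16_t)0x1000u)

/* Compute the Exceptions field from parsed header values.
 * ihl_words is the IPv4 header length in 32-bit words (5 = no options). */
static uint16_t exception_bits(uint16_t ethertype, uint8_t ihl_words,
                               uint8_t ttl)
{
    uint16_t exc = 0;
    if (ethertype != ETHERTYPE_IP) {
        exc |= EXC_NON_IP;
        if (ethertype == ETHERTYPE_ARP)
            exc |= EXC_ARP;       /* ARP sets both ARP and Non-IP */
        return exc;               /* IP header fields do not apply */
    }
    if (ihl_words > 5)
        exc |= EXC_IP_OPT;        /* header longer than 20 bytes: options */
    if (ttl <= 1)
        exc |= EXC_TTL;           /* TTL was 0 or 1 on arrival */
    return exc;
}
```

Because the Exceptions field is part of the filter key, a filter that matches on these bits can direct such packets elsewhere, which is the stated purpose of the field.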
Performance
What is our performance target?
» To hit 5 Gb/s rate:
    Minimum Ethernet frame: 76B (64B frame + 12B InterFrame Spacing)
    5 Gb/sec * 1B/8b * packet/76B = 8.22 Mpkt/sec
» IXP ME processing:
    1.4 GHz clock rate
    1.4 Gcycle/sec * 1 sec/8.22 Mpkt = 170.3 cycles per packet
    compute budget: (MEs * 170)
        1 ME: 170 cycles
        2 ME: 340 cycles
        3 ME: 510 cycles
        4 ME: 680 cycles
    latency budget: (threads * 170)
        1 ME: 8 threads: 1360 cycles
        2 ME: 16 threads: 2720 cycles
        3 ME: 24 threads: 4080 cycles
        4 ME: 32 threads: 5440 cycles
slide taken from ONL_NProuter.ppt
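The budget arithmetic above can be reproduced with a few helper functions. The function names are made up for this sketch; the inputs (5 Gb/s, 76 bytes per minimum packet, 1.4 GHz, 8 threads per ME) are the slide's. Note that the unrounded quotient is about 170.24 cycles per packet; the slide's 170.3 comes from dividing by the rounded 8.22 Mpkt/sec, and both round to the 170 used in the tables.

```c
/* Packets per second at a given line rate, for a fixed per-packet
 * footprint on the wire (frame plus inter-frame spacing). */
static double pkts_per_sec(double line_rate_bps, unsigned wire_bytes)
{
    return line_rate_bps / 8.0 / (double)wire_bytes;
}

/* ME clock cycles available per packet at that packet rate. */
static double cycles_per_pkt(double clock_hz, double pps)
{
    return clock_hz / pps;
}

/* Compute budget: one packet time per ME working on the block. */
static unsigned compute_budget(unsigned mes, unsigned cycles)
{
    return mes * cycles;
}

/* Latency budget: one packet time per hardware thread, since each
 * thread can hold a packet while the others keep up with arrivals. */
static unsigned latency_budget(unsigned mes, unsigned threads_per_me,
                               unsigned cycles)
{
    return mes * threads_per_me * cycles;
}
```

For instance, `compute_budget(3, 170)` gives 510 and `latency_budget(3, 8, 170)` gives 4080, matching the 3 ME rows above.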
Lookup Block Diagram
[Figure: lookup processing flow, annotated with memory access latency]
Setup Lookup Key
Write Lookup Key to TCAM: SRAM Write: 5W, ctx_swap
TimeStamp Delay
Read 1W Result from AD: SRAM Read: 1W, 150 cycles, ctx_swap
Check Done Bit
Read 2W Result from AD: SRAM Read: 2W, 150 cycles, ctx_swap
Read 2 Full Results from QDR: SRAM Read: 3W, 150 cycles, ctx_swap; SRAM Read: 3W, 150 cycles
Setup Results for Copy
315 cycles (the remaining, non-read portion of the total)
TOTAL (No optimization): 915 cycles
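The total above is consistent with four SRAM reads at 150 cycles each plus the 315-cycle remainder (4 * 150 + 315 = 915). A short sketch of that sum, with a check against the latency budget from the Performance slide (8 threads per ME, 170 cycles per packet), is shown below. The function names are hypothetical, and treating each extra Done-bit poll as one more 150-cycle read is an assumption.

```c
enum { SRAM_READ_CYCLES = 150 };  /* per-read latency used on the slide */

/* Unoptimized lookup latency: every SRAM read is serialized. */
static unsigned lookup_latency(unsigned num_reads, unsigned other_cycles)
{
    return num_reads * SRAM_READ_CYCLES + other_cycles;
}

/* Nonzero if the latency fits the budget of mes * threads * cycles. */
static int fits_latency_budget(unsigned latency, unsigned mes,
                               unsigned threads_per_me,
                               unsigned cycles_per_pkt)
{
    return latency <= mes * threads_per_me * cycles_per_pkt;
}
```

With these numbers, `lookup_latency(4, 315)` is 915 cycles, which is under the 1360-cycle latency budget of a single ME; one extra Done-bit poll would raise it to 1065 cycles, still within that budget.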
File locations (in …/ONL_Router/)
Code
» src/applications/ONL_Router/src/plc/ONL/lookup.c
Include Paths
» src/applications/ONL_Router/src/dispatch_loop/ONL/
    dl_source.h and dl_source.c
    dl_source() and dl_sink() functions
» src/IDT_NSE/data_place_IXP2XXX/include
    IDT IIPC defines and macros
» others?
Test and Validation
Implementation Status
  Still in pseudo-code
  Bugs
  Untested
  Optimizations:
Extra Slides
The rest of the slides are either for extra support information or are old and will be deleted when I am convinced they are no longer needed.
Route Lookup
Route Lookup Key (72b)
» Port (3b): Can be a wildcard (for Unicast, probably not for Multicast)
    Value of 111b in Port field can be used to denote a packet that originated from the XScale
    Value of 110b in Port field can be used to denote a packet that originated from a Plugin
    Ports numbered 0-4
» PluginTag (5b): Can be a wildcard (for Unicast, probably not for Multicast)
    Plugins numbered 0-4
» DAddr (32b)
    Prefixed for Unicast
    Exact Match for Multicast
» SAddr (32b)
    Unicast entries always have this and its mask set to 0
    Prefixed for Multicast
Route Lookup: Result (96b)
» Unicast/Multicast Fields (determined by IP_MCast_Valid bit (1: MCast, 0: Unicast)) (13b)
    IP_MCast Valid (1b)
    MulticastFields (12b)
        Plugin/Port Selection Bit (1b):
        – 0: Send pkt to both Port and Plugin. Does it get the MCast CopyVector?
        – 1: Send pkt to all Plugin bits set, include MCast CopyVector in data going to plugins
        MCast CopyVector (11b)
        – One bit for each of the 5 ports and 5 plugins and one bit for the XScale. To drop a MCast, set MCast CopyVector to all 0's
    UnicastFields (8b)
        Drop Bit (1b)
        – 0: handle normally
        – 1: Drop Unicast pkt
        Plugin/Port Selection Bit (1b):
        – 0: Send packet to port indicated by Unicast Output Port field
        – 1: Send packet to plugin indicated by Unicast Output Plugin field. Unicast Output Port, QID, Stats Index, and NH fields also get sent to plugin
        Unicast Output Port (3b): Port or XScale
        – 0: Port0, 1: Port1, 2: Port2, 3: Port3, 4: Port4
        Unicast Output Plugin (3b):
        – 0: Plugin0, 1: Plugin1, 2: Plugin2, 3: Plugin3, 4: Plugin4
        – 5: XScale (treated like a plugin)
» QID (16b)
» Stats Index (16b)
» NH_IP/NH_MAC (48b): At most one of NH_IP or NH_MAC should be valid
» Valid Bits (3b): At most one of the following three bits should be set
    IP_MCast Valid (1b) (also included above)
    NH_IP_Valid (1b)
    NH_MAC_Valid (1b)
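The "at most one of three" rule for the Valid Bits is easy to check in control software before an entry is written. The sketch below assumes a bit assignment that the slides do not specify (IP_MCast Valid in bit 0, NH_MAC_Valid in bit 1, NH_IP_Valid in bit 2, with bit 3 left for an entry-valid flag); the macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* ASSUMED bit assignment within the valid-bits nibble. */
#define V_IP_MCAST  0x1u
#define V_NH_MAC    0x2u
#define V_NH_IP     0x4u
#define V_ENTRY     0x8u   /* entry-valid flag, not part of the rule */

/* True if at most one of IP_MCast / NH_IP / NH_MAC Valid is set. */
static bool valid_bits_ok(uint8_t v)
{
    unsigned three = v & (V_IP_MCAST | V_NH_MAC | V_NH_IP);
    /* x & (x - 1) clears the lowest set bit; the result is zero
     * exactly when x has zero or one bit set. */
    return (three & (three - 1u)) == 0;
}
```

For example, `valid_bits_ok(V_ENTRY | V_NH_IP)` is true, while `valid_bits_ok(V_NH_IP | V_NH_MAC)` is false.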
Primary Filter
Primary Filter Lookup Key (140b)
» Port (3b): Can be a wildcard (for Unicast, probably not for Multicast)
    Value of 111b in Port field to denote coming from the XScale
    Ports numbered 0-4
» PluginTag (5b): Can be a wildcard (for Unicast, probably not for Multicast)
    Plugins numbered 0-4
» DAddr (32b)
» SAddr (32b)
» Protocol (8b)
» DPort (16b)
» SPort (16b)
» TCP Flags (12b)
» Exception Bits (16b): Allow for directing of packets based on defined exceptions
Primary Filter Result (104b)
» Unicast/Multicast Fields (determined by IP_MCast_Valid bit (1: MCast, 0: Unicast)) (13b)
    IP_MCast Valid (1b)
    MulticastFields (12b)
        Plugin/Port Selection Bit (1b):
        – 0: Send pkt to ports and plugins indicated by MCast CopyVector.
        – 1: Send pkt to plugin(s) indicated by MCast CopyVector but not ports, and send Plugin(s) the MulticastFields bits
        MCast CopyVector (11b)
        – One bit for each of the 5 ports and 5 plugins and one bit for the XScale. To drop a MCast, set MCast CopyVector to all 0's
    UnicastFields (8b)
        Drop Bit (1b)
        – 0: handle normally
        – 1: Drop pkt
        Plugin/Port Selection Bit (1b):
        – 0: Send packet to port indicated by Unicast Output Port field
        – 1: Send packet to plugin indicated by Unicast Output Plugin field. Unicast Output Port, QID, Stats Index, and NH fields also get sent to plugin
        Unicast Output Port (3b): Port or XScale
        – 0: Port0, 1: Port1, 2: Port2, 3: Port3, 4: Port4
        Unicast Output Plugin (3b):
        – 0: Plugin0, 1: Plugin1, 2: Plugin2, 3: Plugin3, 4: Plugin4
        – 5: XScale (treated like a plugin)
» QID (16b)
» Stats Index (16b)
» NH IP(32b)/MAC(48b) (48b): At most one of NH_IP or NH_MAC should be valid
» Valid Bits (3b): At most one of the following three bits should be set
    IP_MCast Valid (1b) (also included above)
    NH IP Valid (1b)
    NH MAC Valid (1b)
» Priority (8b)
Auxiliary Filter
Auxiliary Filter Lookup Key (140b)
» Port (3b): Can be a wildcard (for Unicast, probably not for Multicast)
    Value of 111b in Port field to denote coming from the XScale
    Ports numbered 0-4
» PluginTag (5b): Can be a wildcard (for Unicast, probably not for Multicast)
    Plugins numbered 0-4
» DAddr (32b)
» SAddr (32b)
» Protocol (8b)
» DPort (16b)
» SPort (16b)
» TCP Flags (12b)
» Exception Bits (16b)
    Allow for directing of packets based on defined exceptions
    Can be wildcarded.
Auxiliary Filter Lookup Result (93b)
» Unicast Fields (8b): (No Multicast fields)
    Drop Bit (1b) (Should never actually be set by control software, but kept here for symmetry with other Unicast Fields)
    – 0: handle normally
    – 1: Drop pkt
    Plugin/Port Selection Bit (1b):
    – 0: Send packet to port indicated by Unicast Output Port field
    – 1: Send packet to plugin indicated by Unicast Output Plugin field. Unicast Output Port, QID, Stats Index, and NH fields also get sent to plugin
    Unicast Output Port (3b): Port or XScale
    – 0: Port0, 1: Port1, 2: Port2, 3: Port3, 4: Port4
    Unicast Output Plugin (3b):
    – 0: Plugin0, 1: Plugin1, 2: Plugin2, 3: Plugin3, 4: Plugin4
    – 5: XScale
» QID (16b)
» Stats Index (16b)
» NH IP(32b)/MAC(48b) (48b): At most one of NH_IP or NH_MAC should be valid
» Valid Bits (3b): At most one of the following three bits should be set
    NH IP Valid (1b)
    NH MAC Valid (1b)
    IP_MCast Valid (1b): Should always be 0 for AF Result
» Sampling bits (2b): For Aux Filters only
    00: "Sample All"
    01: Use Random Number generator 1
    10: Use Random Number generator 2
    11: Use Random Number generator 3