LoopBuster: Hardware Loop Detection in Fast Mesh Ethernet Networks. Completion Presentation. Uriel Peled and Tal Kol. Guided by Boaz Mizrahi. Advised by Gideon Kaempfer. Digital Systems Laboratory, Faculty of Electrical Engineering, Technion. Winter 2007 – Spring 2009.
LoopBuster
Hardware Loop Detection in Fast Mesh Ethernet Networks
Uriel Peled and Tal Kol
Guided by Boaz Mizrahi
Advised by Gideon Kaempfer
Digital Systems Laboratory
Faculty of Electrical Engineering, Technion
Winter 2007 – Spring 2009
Completion Presentation
Ethernet Drawbacks
Tree Topologies For Loop Prevention
[Diagram: nodes A, B and C]
LoopBuster
Stop Loops Without Tree Topology
[Diagram: nodes A, B and C]
• New Hardware Device: “LoopBuster”
• Improved Switches: Changed Learning
• Mesh Topology: Loops Allowed!
Design Challenges
LoopBuster Device
• Support very high throughputs
  • Ethernet supports 1Gbps and 10Gbps links
  • Implementation must be in hardware
• Use limited amount of on-chip memory
  • Naïve implementation requires 10Mbit for a single 10Gbps interface
• Minimal effect on hosting network
  • Remove looping packets quickly
  • Minimize false positives
  • Rely on existing standards / network equipment
Conceptual Diagram
LoopBuster Device
[Diagram: a stream of packets passing through three filters]
• Low memory filters in decreasing size
• Still effective – packet rate decreases
• Cost: with N filters in a chain, a looping packet makes N+1 mandatory loops
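
The chained-filter idea above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's Verilog design: the class name `FilterChain`, the use of `zlib.crc32` as the packet signature, the modulo indexing, and the rule that a packet is removed only once every filter in the chain has already seen its signature are all assumptions made for the example.

```python
import zlib


class FilterChain:
    """Hypothetical model of a chain of bit-array filters of decreasing size."""

    def __init__(self, size_exponents):
        # Assumption: each entry is log2 of the filter size in bits.
        self.filters = [[False] * (1 << e) for e in size_exponents]

    def process(self, packet: bytes) -> bool:
        """Return True if the packet should be removed as a looping packet."""
        signature = zlib.crc32(packet)  # assumed signature function
        for bits in self.filters:
            index = signature % len(bits)
            if not bits[index]:
                # First time this filter sees the signature: record and forward.
                bits[index] = True
                return False
            # Already seen here: the packet is a suspect, ask the next filter.
        return True  # every filter in the chain had already seen it


chain = FilterChain([13, 12, 10])
passes = [chain.process(b"looping frame") for _ in range(4)]
print(passes)  # [False, False, False, True]
```

In this sketch, with three filters the same frame is forwarded three times and removed on its fourth arrival. Only frames that have already looped reach the later filters, which is one way to read the slide's point that smaller filters remain effective as the packet rate decreases.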
Project Milestones
1. Full Network Software Simulation: pre-hardware implementation
2. Analyze Algorithm Parameters
3. Design LoopBuster Device: macro and micro architectures, Verilog implementation
4. Board Bring-up: board selection, cores, GbE
5. Testing, Validation and Debug
6. Demonstrate Working Prototype
Algorithm Parameters
Performance-Cost Tradeoffs
• Number of filter chains
  • Parallel chains for different MLTTs
  • Different chains for different traffic types
• Number of filters in each chain
  • Number of loops before terminating a packet
• Memory size of each filter in each chain
  • Minimize false positives while saving memory
• Filter scheduling and control patterns
  • Lock / unlock states, reset filter memory
Algorithm Analysis
Empirical Param Selection
• Network Simulation in Software
  • Minimal HW (1 PC), C++, SW Timeline
  • Real-world rates, real-world traffic
• Based on a genetic algorithm
  • Genetic representation: filter size list (13,12,10,10,9,9,8,8,7,7,6)
• Improve a pre-defined fitness function
  • False positives over real traffic + Total memory size
• Two-stage mutation
  • General (add/remove filter, change filter size, switch filters)
  • Specific (num filters, size of largest, create descending chain)
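
The genetic search described on this slide can be illustrated with a short Python sketch of the "general" mutation stage and a fitness function. Everything beyond the slide's wording is an assumption for the example: that each gene is log2 of a filter's size in bits, the `memory_weight` scale, the gene bounds, and the caller-supplied `false_positive_rate` function, which stands in for replaying real traffic through a chain.

```python
import random


def total_memory_bits(chain):
    # Assumption: each gene is log2 of a filter's size in bits.
    return sum(1 << gene for gene in chain)


def fitness(chain, false_positive_rate, memory_weight=1e-6):
    """Lower is better: false positives plus a weighted memory cost.

    `false_positive_rate` is a hypothetical caller-supplied function;
    `memory_weight` is an assumed scale factor.
    """
    return false_positive_rate(chain) + memory_weight * total_memory_bits(chain)


def mutate_general(chain, rng, min_gene=4, max_gene=16):
    """Apply one randomly chosen 'general' mutation and return a new chain."""
    child = list(chain)
    op = rng.choice(["add", "remove", "resize", "swap"])
    if op == "add":
        child.insert(rng.randrange(len(child) + 1), rng.randint(min_gene, max_gene))
    elif op == "remove" and len(child) > 1:
        del child[rng.randrange(len(child))]
    elif op == "resize":
        i = rng.randrange(len(child))
        child[i] = min(max_gene, max(min_gene, child[i] + rng.choice([-1, 1])))
    elif op == "swap" and len(child) > 1:
        i, j = rng.sample(range(len(child)), 2)
        child[i], child[j] = child[j], child[i]
    return child


parent = [13, 12, 10, 10, 9, 9, 8, 8, 7, 7, 6]
child = mutate_general(parent, random.Random(0))
print(total_memory_bits(parent))  # 16192
```

Under the log2 assumption, the size list shown on the slide adds up to 16192 bits of filter memory.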
Software Simulation
[Diagram blocks: Hosting PC, LoopBuster, LBP Switch (three), PCAP Generator, PCAP Proxy (two), Pinger (two); legend: Physical Computer, VMWare Computer]
Algorithm Analysis
Theoretical Param Selection
• Theoretical algorithm performance analysis
  • Probability model for a filter / filter chain
  • C collisions with X packets through an N-bit filter
  • Occupancy problem, numerical solution in C++
[Plot: Memory in Bits (×10^4, 0 to 6) against MLTT (ms, 0 to 3); Empirical and Theoretical curves]
[Plot: Probability (X=3906, N=4096) against Collisions (1300 to 1480); probability axis 0 to 0.02]
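
The occupancy model named above can be solved numerically with a small dynamic program. The sketch below is in Python rather than the project's C++, and it assumes that signatures land uniformly at random on the N bits and that a "collision" means a packet whose bit was already set; the function name is hypothetical.

```python
def collision_distribution(x, n):
    """Return p where p[c] is the probability of exactly c collisions
    after x uniformly random signatures hit an n-bit filter."""
    max_set = min(x, n)
    occupied = [0.0] * (max_set + 1)  # occupied[k] = P(k distinct bits set)
    occupied[0] = 1.0
    for _ in range(x):
        nxt = [0.0] * (max_set + 1)
        for k, p in enumerate(occupied):
            if p == 0.0:
                continue
            nxt[k] += p * k / n  # lands on a bit that is already set
            if k < max_set:
                nxt[k + 1] += p * (n - k) / n  # sets a new bit
        occupied = nxt
    collisions = [0.0] * (x + 1)
    for k, p in enumerate(occupied):
        collisions[x - k] = p  # collisions = packets minus distinct bits set
    return collisions


dist = collision_distribution(50, 64)
mean = sum(c * p for c, p in enumerate(dist))
print(round(mean, 3))
```

Under these assumptions the mean number of collisions is X - N(1 - (1 - 1/N)^X), which for X=3906 and N=4096 comes to roughly 1390, in line with the collision range on the plot's axis.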
Preliminary Design Decision
Algorithm Analysis Conclusions
• Modular Filter Chain Design
  • Traffic sensitive
  • Code / Runtime Configurable Parameters
  • UART controlled lock/unlock/reset patterns
• 2 Clock Domains
  • 125 MHz (GbE), 31 MHz (Processing)
• Packet Pipeline Processing
  • No store and forward (untraditional MAC)
  • Hardware only (no PowerPC)
Architecture
Board Block Diagram
[Diagram blocks: FPGA; two Ethernet RJ45 ports, each with a PHY and a Rocket I/O interface; UART interface (RS232); programmable LVDS clock source (CLK)]
Architecture
General Block Diagram
[Diagram blocks: FPGA; two Ethernet RJ45 ports, each with a PHY and a Rocket I/O interface; Packet Transceiver (in, pt_rx); Packet Transceiver (out, pt_tx); two Packet Removers; LoopBuster Array; LoopBuster Controller; Config; Statistics; Data Pipe; UART interface (RS232)]
Packet Transceiver Block Diagram
[Diagram blocks: Ethernet RJ45; PHY; Rocket I/O interface; Ethernet 1000BASE-X PCS/PMA core; GMII RX and GMII TX; pt_rxdata and pt_txdata]
• 125 MHz clock boundary: Ethernet data in 8-bit units
• 31 MHz clock boundary: Ethernet data in 32-bit units
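
The two clock boundaries carry the same bandwidth at different widths, which a short Python sketch can make concrete. The byte order, the zero padding of the last word, and the function name are assumptions for illustration; the real conversion is done in hardware.

```python
def pack_bytes_to_words(data: bytes):
    """Group a byte stream into 32-bit words, zero-padding the last word.

    Assumptions: the first byte goes in the most significant position,
    and a short final word is padded with zero bytes.
    """
    padded = data + bytes(-len(data) % 4)
    return [int.from_bytes(padded[i:i + 4], "big") for i in range(0, len(padded), 4)]


GMII_CLOCK_HZ = 125_000_000
BYTE_RATE_BPS = GMII_CLOCK_HZ * 8    # 8 bits per 125 MHz cycle
WORD_CLOCK_HZ = BYTE_RATE_BPS // 32  # clock needed to move 32 bits per cycle

print(BYTE_RATE_BPS)  # 1000000000
print(WORD_CLOCK_HZ)  # 31250000
print([hex(w) for w in pack_bytes_to_words(b"\x01\x02\x03\x04\x05")])
# ['0x1020304', '0x5000000']
```

Moving 8 bits per cycle at 125 MHz gives 1 Gbps, and the same rate in 32-bit words needs 31.25 MHz, which is consistent with the roughly 31 MHz processing clock named on the slides.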
LoopBuster Filter
Block Diagram
[Diagram blocks: lb_crc (CRC; data in, signature out); lb_control (ctl; filter control); two lb_single_loopbuster instances (signature in, suspect out); lb_filter_memory (BRAM; 32-bit and 1-bit ports; locked, memory reset); lb_filter_state_machine (cycles_left, current_active, suspect)]
LoopBuster Filter
Implementation
• Supports two concurrent packet paths
• Unique clock domain: 125 MHz
• lb_filter_memory (Memory)
  • Filter BRAM wrapper (2 asynchronous ports)
  • Wide write port for asynchronous reset (FSM)
  • Narrow read/write port for filter memory access
  • Supports lock/unlock states
• lb_filter_state_machine (Filter Logic)
  • Mutual exclusion for memory access
  • Fine-grained locking (cycle requirement per state)
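
As a rough software analogy for the two-port filter memory described above, the following Python sketch models a narrow test-and-set port and a wide reset port. It is not the project's Verilog: the class name, the 32-bit wide-port width, the one-word-per-cycle reset, and the meaning of "locked" (lookups still work but nothing new is recorded) are all assumptions for illustration.

```python
class FilterMemory:
    """Hypothetical software model of a filter bit memory with two ports."""

    WORD_BITS = 32  # assumed width of the wide (reset) port

    def __init__(self, size_bits):
        if size_bits % self.WORD_BITS:
            raise ValueError("size must be a multiple of the word width")
        self.words = [0] * (size_bits // self.WORD_BITS)
        self.locked = False

    def test_and_set(self, index):
        """Narrow port: report whether the bit was set, then set it.

        Assumption: while locked, lookups still work but nothing is recorded.
        """
        word, bit = divmod(index, self.WORD_BITS)
        seen = bool((self.words[word] >> bit) & 1)
        if not seen and not self.locked:
            self.words[word] |= 1 << bit
        return seen

    def reset(self):
        """Wide port: clear one word per cycle and return the cycle count."""
        for i in range(len(self.words)):
            self.words[i] = 0
        return len(self.words)


mem = FilterMemory(8192)
first = mem.test_and_set(100)
second = mem.test_and_set(100)
cycles = mem.reset()
print(first, second, cycles)  # False True 256
```

Clearing through a wide port is what keeps the reset short in this model: an 8192-bit filter takes 256 word writes instead of 8192 single-bit writes.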
Board Selection
Selected Board
• Memec FF1152 Xilinx Virtex-II Pro
  • Existing in lab ($0)
• 2 SFP Modules
  • 1Gbps Eth. RJ45
  • Gidi (~$200)
• PCS/PMA Core
  • Required for SFP
  • Free from Xilinx ($0)
GbE Board Bring-Up
Vertical Development Stages
• Board Functionality Test
  • Download a full working Ethernet example project to test correct operation of UART, SFPs, LEDs, FPGA and cables
• Working DCM
  • Synthesize a working DCM with 125 MHz and 31.5 MHz clock trees
  • Output main control signals to LEDs (ticker, locked)
• Core Linkup Test
  • Configure Xilinx gig_eth_pcs_pma core for 1GE functionality with correct parameters, timing constraints and physical LOCs
  • Output core status signals to detect linkup
• Loopback Tests
  • Perform core loopback test, packet transceiver loopback, full data path loopback
Testing and Validation
• Behavioral Simulations
  • Pre-synthesis Verilog for logic functionality on ModelSim
• Post-Route Timing Simulations
  • Post-synthesis Verilog for timing on Xilinx ISE 9.2
• Automatic Simulation Test-bench
  • Script-based scenario test-benches for core modules (like pt_rx)
  • Automated ModelSim with debug textual log file ($fdisplay)
• On-Board Live Debugging
  • Status signals to LED, R/W of debug registers with UART
  • Custom event-based debug code (output to UART)
• System Integration Testing
  • Stream raw Ethernet traffic through device, Packetyzer sniffer
Planned / Actual Schedule
• Full Network Software Simulation: PLAN 2 months, ACTUAL 2 months
• Analyze Algorithm Parameters: PLAN 1 month, ACTUAL 1 month
• Design LoopBuster Device (micro-macro): PLAN 3 months, ACTUAL 3 months
• Board Bring-up: PLAN 1 month, ACTUAL 3 months
• Testing, Validation and Debug: PLAN 2 months, ACTUAL 4 months
• Demonstrate Working Prototype: PLAN 1 month, ACTUAL 1 month
• TOTAL: PLAN 10 months, ACTUAL 14 months
Achievements and Further Work
Main Project Achievements
• Algorithm analysis and params based on software simulations
• Working LoopBuster prototype in hardware (FPGA)
• Testing and validation environment
• LoopBuster-PC communication and control platform
Further Work
• LBP switch implementation with revised learning algorithm
• Complete network solution demonstration in hardware
• In-depth LoopBuster algorithm analysis and optimization with hardware-based results