Beam Secondary Shower Acquisition System: Igloo2 GBT Implementation tests at 5Gbps
Student Meeting
Jose Luis Sirvent, PhD. Student
09/06/2014
0. The Set-up
0.1 Material used and system status:
• The modified version of GBT_FPGA for Igloo2 is now implemented and working at 2.5 Gbps and 5 Gbps.
• Some constraints have to be re-tuned after every compilation for correct operation at 5 Gbps.
• The GBT firmware has been organized and commented, and now includes an error counter and board auto-detection.
• The console application was modified and restructured to include the error counter.
• Tullio has already provided very valuable information from his measurements.
• Timing details still need to be verified to check whether the LHC clock can be recovered on the front-end system (study the recovered clock phase, the link latency and the reference-frequency tolerance between TX and RX).
0.2 GBT-FPGA Overview in Igloo2 (clock management as in Virtex 6)
[Block diagram: GBT_BANK, very simplified view]
• GBT_TX (Scrambler, Encoder, Gearbox): Data_In (84 bits @ 40 MHz) in, Tx_Word (20 bits @ 240 MHz) out; clocks TX_Frame_CLK (40 MHz) and TX_Word_CLK (240 MHz).
• GBT_MGT (SERDES_0, vendor-specific IP): RefCLK1_P/N (120 MHz), TX_Data_P/N and RX_Data_P/N (4.8 Gbps), Tx_CLK and Rx_CLK (240 MHz).
• GBT_RX (Gearbox, Decoder, Descrambler): Rx_Word (20 bits @ 240 MHz) in, Data_Out (84 bits @ 40 MHz) out; clocks RX_Frame_CLK (40 MHz) and RX_Word_CLK (240 MHz).
• Other blocks shown: TX_PLL, RX_PLL, SERDES_INIT_MASTER, APB_BUS.
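The rates in the diagram are tied together by the 20-bit word width and by the 6:1 ratio between word and frame clocks. The following is a minimal sketch of that arithmetic, not part of the firmware; the function name `gbt_clocks` and the constant names are made up for illustration.

```python
# Sketch: derive the parallel clocks of the link from its serial line rate.
# Names are illustrative; the numbers come from the block diagram.
WORD_BITS = 20        # SERDES word width (Tx_Word / Rx_Word are 20 bits)
WORDS_PER_FRAME = 6   # 240 MHz word clock / 40 MHz frame clock
PAYLOAD_BITS = 84     # Data_In / Data_Out width per frame


def gbt_clocks(line_rate_bps):
    """Return word clock, frame clock and payload rate for a line rate."""
    word_clk_hz = line_rate_bps / WORD_BITS
    frame_clk_hz = word_clk_hz / WORDS_PER_FRAME
    return {
        "word_clk_hz": word_clk_hz,
        "frame_clk_hz": frame_clk_hz,
        "frame_bits": WORD_BITS * WORDS_PER_FRAME,
        "payload_bps": PAYLOAD_BITS * frame_clk_hz,
    }


for rate in (4.8e9, 5.0e9):
    c = gbt_clocks(rate)
    print(f"{rate / 1e9:.1f} Gbps: "
          f"word {c['word_clk_hz'] / 1e6:.2f} MHz, "
          f"frame {c['frame_clk_hz'] / 1e6:.2f} MHz, "
          f"payload {c['payload_bps'] / 1e9:.2f} Gbps")
```

At 4.8 Gbps this gives the 240 MHz and 40 MHz clocks of the diagram; at 5 Gbps it gives the 250 MHz word clock and the roughly 41.6 MHz frame clock used in the tests.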
0.3 Clock relationships:
Understanding some details of the clock recovery
1. Clock Recovery (Phase)
1.1 Word_CLK (SERDES TX_CLK & RX_CLK, 250 MHz @ 5 Gbps)
[Scope capture: yellow = EPCS_TX_CLK (Board 1), blue = EPCS_RX_CLK (Board 2)]
Test performed resetting the boards every time:
• A random phase difference was observed.
• There are 20 possible lock steps.
• The RX_WORD_CLK phase can be known from RX_BITSLIP_NUMBER.
• To study: the possibility of tuning EPCS_RX_CLK internally in the SERDES, as done in Virtex 6.
• Similar results were obtained when doing manual RX reset and CDR lock steps.
1. Clock Recovery (Phase)
3.2 Checking that the Word_CLK phase difference depends on RX_BITSLIP
The tests were done resetting the boards:
• As expected, there is a high correlation between RX_BITSLIP_NUMBER and the phase difference between TX_WORD_CLK (Board #1) and RX_WORD_CLK (Board #2).
• The phase range 0 to 2π corresponds to a displacement of 0 to 19 bits.
• The measurements were taken with the scope cursors, so there is some measurement error, but the trend is clear.
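The relation between bit slip and word-clock phase can be sketched as below. This assumes an ideal linear mapping with zero offset, where each of the 20 steps is one bit period (1/20 of the word-clock period). The measured curve may have an offset or a different sign, and `bitslip_to_phase` is a hypothetical helper, not a firmware function.

```python
import math

LOCK_STEPS = 20  # possible lock positions, one per bit of the 20-bit word


def bitslip_to_phase(bitslip, line_rate_bps=5.0e9):
    """Map a bit-slip count to (phase in rad, delay in s).

    Assumed model: phase = 2*pi * bitslip / 20, delay = bitslip bit periods.
    """
    if not 0 <= bitslip < LOCK_STEPS:
        raise ValueError("bitslip must be in 0..19")
    bit_period_s = 1.0 / line_rate_bps
    phase_rad = 2.0 * math.pi * bitslip / LOCK_STEPS
    delay_s = bitslip * bit_period_s
    return phase_rad, delay_s


for n in (0, 5, 10, 19):
    phase, delay = bitslip_to_phase(n)
    print(f"bitslip {n:2d}: {math.degrees(phase):6.1f} deg, "
          f"{delay * 1e12:6.0f} ps")
```

Under this model one step is 18 degrees, or 200 ps at 5 Gbps, which is the granularity the scope-cursor measurements have to resolve.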
1. Clock Recovery (Phase)
1.2 Frame_CLK (41.6 MHz @ 5 Gbps)
[Scope capture: yellow = FRAME_TX_CLK (Board 1), blue = FRAME_RX_CLK (Board 2)]
These are the clocks after the TX & RX PLLs, derived from EPCS_TX_CLK & EPCS_RX_CLK.
The tests were done resetting the boards:
• As before, the phase variation is random.
• This would be our clock for the acquisition electronics.
• The PLL can lock on any of the 6 rising edges of RX_WORD_CLK.
• The phase variations could be reduced by adjusting the PLL clock phase based on RX_HEADER_FLAG.
• To be seen whether adjusting only Frame_Clk gives the link deterministic latency.
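To get a feeling for how many start-up phases the recovered frame clock can take, the two effects described so far can be combined. The sketch below assumes that they are independent and simply add: the PLL picks one of 6 word-clock edges, and the word clock itself sits on one of 20 bit positions. This additive model is an assumption for illustration and has not been checked against the measurements.

```python
WORD_BITS = 20        # possible word-clock lock steps (one per bit)
WORDS_PER_FRAME = 6   # word-clock rising edges per frame-clock period


def frame_offset_bits(word_edge, bitslip):
    """Frame-clock offset in bit periods (hypothetical additive model)."""
    if not 0 <= word_edge < WORDS_PER_FRAME:
        raise ValueError("word_edge must be in 0..5")
    if not 0 <= bitslip < WORD_BITS:
        raise ValueError("bitslip must be in 0..19")
    return word_edge * WORD_BITS + bitslip


offsets = sorted({frame_offset_bits(edge, slip)
                  for edge in range(WORDS_PER_FRAME)
                  for slip in range(WORD_BITS)})

bit_period_ns = 1e9 / 5.0e9  # bit period at 5 Gbps
print(f"{len(offsets)} distinct offsets, "
      f"step {bit_period_ns:.1f} ns, "
      f"span {(offsets[-1] + 1) * bit_period_ns:.1f} ns")
```

Under this assumption there are 120 possible offsets covering one full frame period, which would explain why the frame-clock phase looks random from one reset to the next.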
2. Link latency (5 Gbps)
2.1 Observing the Match_Flag signal (memory-based gearbox)
[Scope capture: yellow = TX_MATCH_FLAG (Board 1), blue = RX_MATCH_FLAG (Board 2)]
These flags go to 1 when a certain frame is detected.
The tests were done resetting the boards:
• As expected, the latency is not deterministic and shows pseudorandom behaviour.
• The phase differences on the Word & Frame clocks are responsible for this delay variation.
• If we are able to align the clocks properly, the link delay should be “more stable”.
2. Link latency (5 Gbps)
2.1 Observing the Match_Flag signal (register-based gearbox)
[Scope capture: yellow = TX_MATCH_FLAG (Board 1), blue = RX_MATCH_FLAG (Board 2)]
These flags go to 1 when a certain frame is detected.
The tests were done resetting the boards:
• As expected, the latency is not deterministic and shows pseudorandom behaviour.
• The phase differences on the Word & Frame clocks are responsible for this delay variation.
• If we are able to align the clocks properly, the link delay should be “more stable”.
3. RX_Header Flag (5 Gbps)
3.1 Comparing phase with respect to Board 1 TX_Word_CLK (register-based gearbox)
[Scope capture: yellow = EPCS_TX_CLK (Board 1), blue = RX_HEADER_FLAG (Board 2)]
RX_HEADER_FLAG goes to 1 when the received word contains the GBT header.
The tests were done resetting the boards:
• As expected, the latency is not deterministic and shows pseudorandom behaviour.
• The phase difference on the Word clocks is responsible for this delay variation (20 possible phases).
• If we are able to align the word clocks properly, the phase would be constant.
3. RX_Header Flag (5 Gbps)
3.2 Checking that the RX_HEADER phase depends on RX_BITSLIP
The tests were done resetting the boards:
• Same procedure as previously with the WORD_CLKs.
• The RX_HEADER_CLK status is updated on the RX_WORD_CLK rising edges.
• RX_HEADER_CLK is used as a sync signal in the Virtex 6 version to align RX_FRAME_CLK (in that example, RX_WORD_CLK has already been tuned based on BITSLIP_Number).
• Measurements taken more carefully; the trend is clear.
3. Ref. Frequency tolerance
3.0 How to transmit the LHC clock to the front-end electronics
[Block diagram: clock path from the surface FPGA to the tunnel FPGA]
• Surface FPGA: LHC Clock (40.079 MHz) → TX_PLL (x3) → SERDES Ref_Clock (120.237 MHz) → TX_SERDES
• Optical link: GBT @ 4.809 Gbps
• Tunnel FPGA: RX_SERDES (local SERDES REF_CLK 120 MHz) → EPCS_RX_CLK (240.474 MHz, Word_Clock) → RX_PLL (/6) → LHC Clock (40.079 MHz, RX_Frame_Clock)
Open questions:
• Will it be possible to work with such a difference between the REF_CLKs?
• Do we recover the correct EPCS_RX_CLK when the REF_CLKs are different?
• What are the limits on the REF_CLK difference?
• If the TX REF_CLK varies during transmission, will the link suffer errors?
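The frequencies in this clock path can be re-derived step by step. The sketch below follows the chain in the diagram. The x40 factor between the SERDES reference clock and the line rate is not stated on the slide; it is inferred from 120.237 MHz and 4.809 Gbps and should be read as an assumption, as should all the names used.

```python
LHC_CLK_HZ = 40.079e6   # LHC clock at the surface FPGA
TX_PLL_MULT = 3         # TX_PLL x3
SERDES_MULT = 40        # assumed: line rate / reference clock
WORD_BITS = 20          # SERDES word width
RX_PLL_DIV = 6          # RX_PLL /6
LOCAL_REF_HZ = 120e6    # local SERDES REF_CLK in the tunnel FPGA


def clock_chain(lhc_clk_hz=LHC_CLK_HZ):
    """Return (ref clock, line rate, word clock, frame clock)."""
    ref_hz = lhc_clk_hz * TX_PLL_MULT
    line_bps = ref_hz * SERDES_MULT
    word_hz = line_bps / WORD_BITS
    frame_hz = word_hz / RX_PLL_DIV
    return ref_hz, line_bps, word_hz, frame_hz


ref_hz, line_bps, word_hz, frame_hz = clock_chain()
# Mismatch between the transmitted reference and the local one, in ppm.
ref_mismatch_ppm = (ref_hz - LOCAL_REF_HZ) / LOCAL_REF_HZ * 1e6

print(f"ref {ref_hz / 1e6:.3f} MHz, line {line_bps / 1e9:.3f} Gbps, "
      f"word {word_hz / 1e6:.3f} MHz, frame {frame_hz / 1e6:.3f} MHz")
print(f"TX ref vs local ref: {ref_mismatch_ppm:.0f} ppm")
```

The recovered frame clock comes back at the LHC frequency only if the receiver tracks the incoming data rate rather than its local 120 MHz reference, which is what the open questions above are about.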
3. Ref. Frequency tolerance
3.1 Link performance over Ref_Clk differences on TX & RX
[Test set-up: Board 1 with local Ref_CLK at 125 MHz; Board 2 with variable Ref_CLK, 123.5 – 126.5 MHz]
• LHC operations require a clock tolerance of 50 ppm (the LHC clock varies during operation).
• We’ll play with Ref_CLKs around 125 MHz (the local oscillator of the Dev. Kit).
• A safe region was found for GBT @ 2.5 & 5 Gbps:
– Board 1: 125.0 MHz
– Board 2: 125.0 ± 0.5 MHz
– Difference: 0.8 % (8000 ppm) >> 50 ppm (we meet the specs)
• Tullio’s measurements:
– EPCS_DEMO example @ 3.125 Gbps
– Difference: 2.3 MHz, 1.8 % (18400 ppm)
SAME RESULTS FOR 2.5 AND 5Gbps
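The ppm figures quoted for the safe region can be reproduced with a few lines. This is a sketch of the arithmetic only. It assumes a 125 MHz nominal reference for both measurements, which the slide states for the Dev. Kit but not explicitly for the EPCS_DEMO test, and `ppm_offset` is an illustrative helper name.

```python
def ppm_offset(f_hz, f_nominal_hz):
    """Relative frequency offset in parts per million."""
    return (f_hz - f_nominal_hz) / f_nominal_hz * 1e6


NOMINAL_HZ = 125e6          # local oscillator of the Dev. Kit
LHC_TOLERANCE_PPM = 50.0    # required clock tolerance

# Safe region 125.0 +/- 0.5 MHz: offset of one edge and full window width.
safe_edge_ppm = ppm_offset(125.5e6, NOMINAL_HZ)
safe_window_ppm = safe_edge_ppm - ppm_offset(124.5e6, NOMINAL_HZ)

# EPCS_DEMO measurement: 2.3 MHz difference (125 MHz nominal assumed).
epcs_demo_ppm = ppm_offset(NOMINAL_HZ + 2.3e6, NOMINAL_HZ)

margin = safe_edge_ppm / LHC_TOLERANCE_PPM

print(f"safe edge   : +/-{safe_edge_ppm:.0f} ppm")
print(f"safe window : {safe_window_ppm:.0f} ppm")
print(f"EPCS_DEMO   : {epcs_demo_ppm:.0f} ppm")
print(f"margin over LHC tolerance: x{margin:.0f}")
```

Read this way, the 8000 ppm on the slide is the full 1 MHz width of the safe window, while each edge sits 4000 ppm from nominal, still far beyond the 50 ppm requirement.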
4. Clocks Frequency Stability
4.1 Frequency histogram of the TX & RX Word CLKs (250 MHz)
[Histograms: EPCS_TX_CLK (Board 1), EPCS_RX_CLK (Board 2), EPCS_TX_CLK (Board 2), EPCS_RX_CLK (Board 1)]
• Clock stability:
– For both clocks (TX_WORD_CLK & RX_WORD_CLK), a certain frequency dispersion was observed, independently of their Ref_CLK source.
– EPCS_TX_CLK STDEV ~ 6.1 MHz (24400 ppm)
– Frame_TX_CLK STDEV ~ 41 kHz (985 ppm)
– EPCS_RX_CLK STDEV ~ 3.7 MHz (14800 ppm)
– Frame_RX_CLK STDEV ~ 45 kHz (1081 ppm)
• Possible reasons:
– Not really well understood
– Part of the SERDES specifications?
– More care needed on routing/implementation?
• Some questions to think about:
– Impact on system performance?
– Could we tolerate these variations?
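One possible way to read these dispersions is as period jitter. If the scope computes each frequency sample from a single clock period (an assumption; the slides do not say how the histogram was built), a first-order conversion is sigma_T ≈ sigma_f / f². The sketch below applies it to the quoted values. The helper `dispersion` is hypothetical, and the frame-clock nominal is taken as 41.6 MHz as written on the slides.

```python
def dispersion(sigma_f_hz, f_nominal_hz):
    """Return (relative dispersion in ppm, equivalent period jitter in ps).

    The jitter uses the first-order relation sigma_T = sigma_f / f**2,
    valid only if each frequency sample comes from one measured period.
    """
    ppm = sigma_f_hz / f_nominal_hz * 1e6
    sigma_t_ps = sigma_f_hz / f_nominal_hz ** 2 * 1e12
    return ppm, sigma_t_ps


MEASUREMENTS = {
    # name: (standard deviation in Hz, nominal frequency in Hz)
    "EPCS_TX_CLK": (6.1e6, 250e6),
    "EPCS_RX_CLK": (3.7e6, 250e6),
    "Frame_TX_CLK": (41e3, 41.6e6),
    "Frame_RX_CLK": (45e3, 41.6e6),
}

for name, (sigma_f, f_nom) in MEASUREMENTS.items():
    ppm, sigma_t = dispersion(sigma_f, f_nom)
    print(f"{name:13s}: {ppm:8.0f} ppm, ~{sigma_t:5.1f} ps period jitter")
```

If that reading holds, the large ppm values on the word clocks would correspond to period jitter of a few tens of picoseconds, not to a drift of the mean frequency. This would need to be confirmed with a proper jitter measurement.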
4. Clocks Frequency Stability
4.2 Frequency histogram of the TX & RX Frame CLKs (after PLLs, 41.6 MHz)
[Histograms: Frame_TX_CLK (Board 1), Frame_RX_CLK (Board 2), Frame_TX_CLK (Board 2), Frame_RX_CLK (Board 1)]
5. Compilation Resources
5.1 Comparison using Memory & Register Based Gearbox
• The GBT_FPGA code was compiled successfully on Igloo2 with the following resource usage:
[Resource usage tables: Memory Gearbox, Register Gearbox]
• No clock phase alignment has been performed yet for the Igloo2 version.
• Implemented using the two gearbox options for clock domain crossing available in the Xilinx V6 example:
– Memory Gearbox: two-port memories
– Register Gearbox: through registers
• Still TODO: align the FRAME_CLK (we’ll use the information from BITSLIPNUMBER and RX_HEADER_FLAG for this).
6. Constraining the Design
6.1 On Synplify Pro & on Smart-Time editor
[Screenshots]
• Clock constraints in Synplify
• Clock constraints in Smart-Time
• Multicycle path allowed in Smart-Time (between TX_FIFO & SERDES_TX)
7. Synchronizing SERDES, GBT_TX & GBT_RX
7.1 Maintaining CLK/DATA phase relationships
• Buffered EPCS_TX_CLK & EPCS_RX_CLK:
– Buffering helps to reduce the clock skew through global resources (and therefore to meet the clock constraints for GBT_TX & GBT_RX).
– However, the buffered and non-buffered versions have different phases.
– CLK / TX & RX data synchronization is needed (done through registers & delay lines). File: Igloo2_gtx_std.vhdl
[Diagram labels: Adjustable Delay, MultiCycle path (2)]
[Waveforms, TX side: EPCS_TX_CLK, EPCS_TX_CLK_BUFF, GBT_TX_DATA, GBT_TX_DATA_DELAYED]
[Waveforms, RX side: EPCS_RX_CLK, EPCS_RX_CLK_BUFF, EPCS_RX_DATA, EPCS_RX_DATA_DELAYED]
8. Post-Layout Implementation Report
8.1 Reaching constraints
• We have to play a bit with the TX & RX delays to satisfy the Max & Min Delay analysis in Smart-Time.
• NOTE: Every time the design is changed and a new compilation (and place & route) is performed, the reports change, and under certain circumstances the constraints are not met. More effort on design constraining is therefore needed (or we leave it as it is now and take care of it later, when the final design is done).
– Currently experiencing some difficulties in reaching the constraints; new ideas are more than welcome!
– Bigger Igloo2 FPGAs with more resources may meet the specifications more easily (or at least with a bigger margin).
– In spite of the limits on WORD_CLK_RX & WORD_CLK_TX at 241 MHz / 246 MHz, the tests were done at 250 MHz with satisfactory results.
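The gap between the reported frequency limits and the 250 MHz operating point can be expressed as timing slack. The sketch below assumes that the 241 MHz and 246 MHz figures are the maximum frequencies reported by static timing analysis for WORD_CLK_RX and WORD_CLK_TX respectively, in the order they are written. `slack_ns` is an illustrative helper, not a Smart-Time function.

```python
def slack_ns(fmax_hz, f_target_hz):
    """Timing slack in ns: target period minus minimum achievable period.

    A negative value means the path is reported as failing at the target.
    """
    return 1e9 / f_target_hz - 1e9 / fmax_hz


TARGET_HZ = 250e6  # word clock at 5 Gbps
REPORTED_FMAX_HZ = {
    # assumed mapping of the two limits quoted on the slide
    "WORD_CLK_RX": 241e6,
    "WORD_CLK_TX": 246e6,
}

for clk, fmax in REPORTED_FMAX_HZ.items():
    print(f"{clk}: fmax {fmax / 1e6:.0f} MHz, "
          f"slack at 250 MHz = {slack_ns(fmax, TARGET_HZ) * 1e3:.0f} ps")
```

Both clocks come out with a negative slack well below 200 ps. That is consistent with the link working on the bench at room conditions while the worst-case analysis still flags it.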
9. The beauty is inside: GBT_On_Igloo2
[Design schematic views]
9.1 All the implementation (PLLs highlighted)
9.2 GBT_BANK
9.3 GBT_BANK/GBT_TX
9.4 GBT_BANK/GBT_RX
9.5 GBT_BANK/MGT (Global Clock Network highlighted)
10. Code Availability
10.1 First release on Dropbox/SVN
• First release of the STD version available for download (09/06/2014):
  – In DropBox: https://www.dropbox.com/sh/5sjvdkp9wwykpz4/AADrIrhM5BKj_zcraAYGxnURa
  – In SVN (use Tortoise or another SVN client): https://svn.cern.ch/reps/be-bi-bl/electronics/bwsdev/studies/BWS_pCVD Diamond detector Readout Electronics/GBT_On_Igloo2/Firmware/GBT_FPGA_Igloo2/STD
• What is provided:
  – Libero 11.3 project with the GBT_on_Igloo2 code (2014_06_09_GBT_On_Igloo2_M2GL_EVAL_KIT.rar):
    • Features: GBT protocol (STD version) on Igloo2 with UART communication through the USB port. Constraints are not always met, so the timing reports must be analysed carefully whenever changes are made.
    • All the necessary VHDL files are in: GBT_On_Igloo2_M2GL_EVAL_KIT\hdl
    • Programming file (stp) available in: GBT_On_Igloo2_M2GL_EVAL_KIT\designer\GBT_On_Igloo2_M2GL_EVAL_KIT\export
    • In case of trouble, just let me know (jsirvent@cern.ch); there are many things to improve.
  – Console application UART_APP_V3.0 (2014_06_09_UART_APP_V3.0.rar):
    • Features: controls the workflow of the GBT implementation on Igloo2 and checks different signal values and link parameters (RX_BITSLIP_NUMBER, error count…); board auto-detection.
    • Microsoft Visual Studio 2008 project: UART_APP_V3.0\UART_APP.sln
    • Executable file: UART_APP_V3.0\Release\UART_APP.exe
  – Readme file (2014_06_09_Readme.pdf):
    • Features: short guide to implementing the design on the Dev. Kit and running the application.