EPICS Collaboration Meeting, Hsinchu, Taiwan, June, 2011
General FPGA to EPICS IOC Communication Protocol
Yuke Tian, Kiman Ha, Joseph Mead
Brookhaven National Lab
Outline
• Common requirements for FPGA-based equipment
• Resources
  • FPGA
  • EPICS
• General IOC-FPGA communication protocol
  • IOC side
  • FPGA side
• Test results on PSC
• Expected throughput for BPM/Cell controller
Common requirements for FPGA-based equipment
• Why do we need FPGA-based equipment: FPGA vs. ASIC?
For experimental physics equipment, an FPGA is usually the better choice.
FPGA design advantages:
• Faster time-to-market: no layout, masks, or other manufacturing steps are needed
• No upfront NRE (non-recurring expenses) typically associated with an ASIC design
• Simpler design cycle, since software handles much of the routing, placement, and timing
• More predictable project cycle, due to elimination of potential re-spins, wafer capacities, etc.
• Field reprogrammability: a new bitstream can be uploaded remotely

ASIC design advantages:
• Full custom capability, since the device is manufactured to design specs
• Lower unit costs for very high volume designs
• Smaller form factor, since the device is manufactured to design specs
• Higher raw internal clock speeds
[Block diagram: FPGA as a system-on-chip]
• Logic control (V6: up to 100K slices) ↔ external logic (ADC, DAC, etc.)
• DSP (V6: up to 800 DSP48 slices)
• High-speed serial links (V6: up to 72 GTX/GTH transceivers, PCIe; ~0.5 Tbps) ↔ other systems
• Embedded CPU (MicroBlaze: 32-bit RISC, ~1K slices) with standard CPU peripherals (Ethernet, RS-232, etc.) ↔ external world
• Analog (V7: 1 Msps ADC) ↔ analog signals
• Why do we use an embedded CPU in the FPGA (instead of a separate CPU outside the FPGA)?
The embedded CPU and the user-defined logic are on the same FPGA chip and share a fast bus interface. All user registers and block memory are simply part of the CPU memory space, so the CPU can access them without any extra logic.
The embedded CPU already has many I/O peripherals ready to communicate with the outside world, such as an EPICS IOC.
At NSLS-II, all our custom-designed equipment (BPM, cell controller, power supply controller) uses the same Xilinx soft-core CPU (MicroBlaze). We need a general way for this CPU to communicate with the EPICS IOC; the natural choice is the Ethernet port.
Since this is a lightweight embedded CPU system, the protocol must be simple and reliable. In fact, the embedded CPU's main task is to communicate with the EPICS IOC; all other logic control and DSP calculation is done in the rest of the FPGA fabric.
EPICS IOC data traffic falls into three classes:
• Small-size data (ai, ao, di, do, etc.) — BPM: gain, calibration; CC: matrix selection; PSC: setpoints, commands
• BulkRx data (waveform, etc.) — BPM: filter coefficients; CC: reverse response matrix; PSC: booster ramping function
• BulkTx data (waveform, etc.) — BPM: TBT, ADC raw; CC: 10 kHz orbit data; PSC: ADC readbacks
CA clients (physics applications) reach the IOC through Channel Access; the FPGA (MicroBlaze / Xilkernel / lwIP TCP/IP) serves its registers, SRAM, and MPMC DDR2/DDR3 memory over TCP/IP.
How do we design a simple, reliable protocol to transfer data between the IOC and the FPGA quickly?
Resources on FPGA
Embedded soft-core CPU: MicroBlaze
• BPM/CC (Virtex-6): 125 MHz, cache available (32K/32K), plenty of memory (1 GB DDR3, 64-bit wide)
• PSC (Spartan-3A): 50 MHz, cache available (2K/2K), plenty of memory (256 MB DDR2, 8-bit wide)

Network resources:
• BPM/CC: 1 Gbps hard-core TEMAC, 1 Gbps PHY
• PSC: 100 Mbps soft EMAC, 100 Mbps PHY
Xilkernel: the Xilinx kernel. It supports multiple threads, scheduling, semaphores, message queues, and buffer memory.
lwIP TCP/IP stack: a free TCP/IP stack from the lwIP community. It supports multiple TCP connections, a configurable TCP window size, TX/RX checksum offload, jumbo frames, ARP, DHCP, etc.
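For reference, several of the lwIP features listed above are selected at build time in lwipopts.h; the option names below are real lwIP macros, but the values are illustrative examples, not the NSLS-II configuration:

```c
/* lwipopts.h -- illustrative lwIP settings (example values only) */
#define LWIP_SOCKET        1              /* enable the BSD-style socket API */
#define LWIP_DHCP          1              /* enable DHCP */
#define TCP_MSS            1460           /* TCP maximum segment size */
#define TCP_WND            (8 * TCP_MSS)  /* TCP receive window */
#define TCP_SND_BUF        (8 * TCP_MSS)  /* TCP send buffer */
/* With hardware checksum offload (e.g. the Virtex-6 TEMAC), the
   software TCP checksum can be disabled: */
#define CHECKSUM_GEN_TCP   0
#define CHECKSUM_CHECK_TCP 0
```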
How do we design a simple, reliable protocol to transfer data between the IOC and the FPGA quickly? Separate the three data traffic types into three TCP sockets:
• Small data (ai, ao, di, do, etc.) — TCP/IP port 1
• BulkRx data (waveform, subArray, etc.) — TCP/IP port 2
• BulkTx data (waveform, subArray, etc.) — TCP/IP port 3
CA clients (physics applications) connect through Channel Access; on the FPGA (MicroBlaze / Xilkernel / lwIP TCP/IP), each socket serves the registers, SRAM, and MPMC DDR2/DDR3 memory.
Resources from EPICS
There are many device drivers that support TCP/IP communication between an IOC and network equipment. One commonly used driver is asynDriver, which includes a TCP/IP driver (drvAsynIPPort).
General IOC-FPGA communication protocol: IOC side
On the IOC side, three asyn ports connect to the FPGA:
asyn port 1: drvAsynIPPortConfigure("NormalRxTx", "192.168.1.10:7 TCP", 0, 0, 0)
asyn port 2: drvAsynIPPortConfigure("BulkRx", "192.168.1.10:18 TCP", 0, 0, 0)
asyn port 3: drvAsynIPPortConfigure("BulkTx", "192.168.1.10:20 TCP", 0, 0, 0)

Small data (ai, ao, di, do, etc.):
• Out: combine the small data into one MTU using an aSub record.
• In: parse one MTU back into the individual small-data records.
Bulk data (waveform, aSub, etc.) moves through waveform records with asynInt8ArrayOut / asynInt8ArrayIn as a stream of 1-MTU frames at 10 Hz, preceded by an optional ID frame.
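As a sketch, the BulkTx port above could feed a waveform record through asyn's standard array device support; the record name, macro, element count, and timeout below are illustrative, not the actual NSLS-II database:

```
record(waveform, "$(P)BulkData") {
    field(DTYP, "asynInt8ArrayIn")      # asyn 8-bit array input support
    field(INP,  "@asyn(BulkTx,0,1.0)")  # asyn port name, addr, timeout
    field(FTVL, "CHAR")
    field(NELM, "1400")                 # one MTU-sized frame
    field(SCAN, "I/O Intr")
}
```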
General IOC-FPGA communication protocol: FPGA side
asyn port configuration used in this test:
asyn port 1: drvAsynIPPortConfigure("NormalRxTx", "192.168.1.10:7 TCP", 0, 0, 0)
asyn port 2: drvAsynIPPortConfigure("BulkRx", "192.168.1.10:5001 TCP", 0, 0, 0)
asyn port 3: drvAsynIPPortConfigure("BulkTx", "192.168.1.10:5000 TCP", 0, 0, 0)
NormalRxTx thread:
1. Create a socket for the "192.168.1.10:7" port.
2. Listen on the socket.
3. When a new connection arrives, create a thread to process it.
4. In the process thread, read the data, copy it, and write the response data back to the socket.

BulkRx thread:
1. Create a socket for the "192.168.1.10:18" port.
2. Listen on the socket.
3. When a new connection arrives, create a thread to process it.
4. In the process thread, keep reading data out of the socket and copying it into CPU memory space.

BulkTx thread:
1. Create a socket for the "192.168.1.10:20" port.
2. Listen on the socket.
3. When a new connection arrives, create a thread to process it.
4. In the process thread, keep writing from CPU memory space to the socket.
The total C code running on the FPGA is simple socket programming.
Since we developed it from scratch, we understand it clearly and have full control of the kernel and each thread.
Pseudo code:

init_platform();                       // start the Xilinx platform
xilkernel_init();                      // initialize Xilkernel
xmk_add_static_thread(main_thread, 1); // register the main thread
xilkernel_start();                     // start Xilkernel

main_thread():
    lwip_init();  // initialize the lwIP TCP/IP stack
    sys_thread_new("NW_THREAD", network_thread, NULL, THREAD_STACKSIZE, DEFAULT_THREAD_PRIO);  // create the network thread
    return 0;

network_thread():
    IP4_ADDR(&ipaddr, 192,168,1,10); IP4_ADDR(&netmask, 255,255,255,0); IP4_ADDR(&gw, 192,168,1,1);  // set IP, netmask, gateway
    xemac_add(netif, ...); netif_set_default(netif); netif_set_up(netif);  // add the network interface, set it as default, bring it up
    sys_thread_new("xemacif_input_thread", (void (*)(void*))xemacif_input_thread, netif, THREAD_STACKSIZE, DEFAULT_THREAD_PRIO);  // start the packet RX thread
    sys_thread_new("normalRxTx", normalRxTx_thread, 0, THREAD_STACKSIZE, DEFAULT_THREAD_PRIO);
    sys_thread_new("BulkRx", rx_thread, 0, THREAD_STACKSIZE, DEFAULT_THREAD_PRIO);
    sys_thread_new("BulkTx", tx_thread, 0, THREAD_STACKSIZE, DEFAULT_THREAD_PRIO);
    return 0;
normalRxTx_thread():
    sock = lwip_socket(AF_INET, SOCK_STREAM, 0);                    // create the socket
    lwip_bind(sock, (struct sockaddr *)&address, sizeof(address));  // bind the address/port to the socket
    lwip_listen(sock, 5);                                           // listen, with a queue length of 5
    if ((new_sd = lwip_accept(sock, (struct sockaddr *)&remote, (socklen_t *)&size)) > 0)  // when a new connection is established, create a thread to process it
        sys_thread_new("echo_server", process_normalRxTx_request, (void *)new_sd, THREAD_STACKSIZE, DEFAULT_THREAD_PRIO);

process_normalRxTx_thread():
    readNumber = lwip_read(sd, rxTxBuffer, RECV_BUF_SIZE);  // read the received data
    // depending on the ID frame, copy the received data to a different location; this is FPGA memory-map dependent
    switch (rxTxBuffer[0]) { case 1: memcpy(addRX, rxTxBuffer, readNumber); break; case 2: ...; }
    writeNumber = lwip_write(sd, rxTxBuffer, readNumber);   // write the response back to the socket
BulkRx_thread is similar to normalRxTx_thread except that:
-- it binds to a different port;
-- its process thread keeps reading data from the socket with lwip_read and copying it into CPU memory space.
BulkTx_thread is similar to normalRxTx_thread except that:
-- it binds to a different port;
-- its process thread keeps copying data from CPU memory space and writing it to the socket with lwip_write.
Test results on PSC
Only one port (BulkTx) is connected.
Throughput (PSC to IOC): 10 packets × 1400 bytes × 8 bits / 20 ms = 5.6 Mbps
Last packet from PSC.
In psc_test.cmd file:
drvAsynIPPortConfigure ("pscNormalRxTx", "192.168.1.10:7 TCP",0,0,0)
drvAsynIPPortConfigure ("pscRx", "192.168.1.10:18 TCP",0,0,0)
drvAsynIPPortConfigure ("pscTx", "192.168.1.10:20 TCP",0,0,0)
All three ports are connected: BulkTx port (FPGA to IOC), normalRxTx port, and BulkRx port (IOC to FPGA).
All three ports are connected.
Expected throughput for BPM/Cell Controller
With the cache turned on, jumbo frames, and TX/RX checksum offload, the BPM reaches 80 Mbps TX/RX throughput in lwIP socket mode (measured with iperf). Our new protocol is similar to iperf, so we expect comparable throughput between the IOC and the BPM/Cell controller. If so:
We may be able to get turn-by-turn (x/y position) data in real time (no longer on demand):
TBT data: 378 kHz * 4 bytes * 2 (x,y positions) * 8 = 24.2 Mbps
On the IOC side, 24.2 Mbps * 8 BPMs/cell = 193.6 Mbps; our IOC has a true GigE connection.
For the cell controller, we can get 10 kHz orbit data in real time from several cell controllers:
10 kHz orbit data: 10 kHz * 4 bytes * 2 (x,y positions) * 240 * 8 = 153.6 Mbps
We would need a few (e.g. 5) cell controllers, each outputting 20% of the data, to deliver the full 10 kHz orbit data.
Higher throughput to EPICS is always good for physics applications.
More applications (LLRF, etc) can be done through this new protocol.
Suggestions from both EPICS and FPGA experts are welcome.
Summary
FPGA is a common choice in accelerator and experimental physics for implementing control system hardware.
Using an embedded CPU in the FPGA simplifies data delivery to the EPICS IOC from the hardware-level registers (such as ADC/DAC data or memory data).
TCP/IP socket programming provides simple and reliable data communication between the FPGA's embedded CPU and the EPICS IOC.
On the FPGA side, the protocol is easy to implement. On the EPICS IOC side, asynDriver provides a perfect solution.