FPGA IP Verification for Use in Severe Environments 2005 MAPLD International Conference September 2005 Paper #237 Ian Land Ian Bryant


Trends

Smaller geometries allow more functions
Synthesizable HDL makes design reuse practical
Gate-level design is difficult at high density
  Resource-intensive
  Takes a long time
  Increases likelihood of error
Thus, block-level design is needed
Intellectual property (IP) reduces effort and risk, if done right:
  A robust design process is followed, with thorough verification
  IP is proven in many applications, including space and severe environments
  A MIL-STD-1553 example demonstrates this


Robust Design Process

Structured design flow should be phase-gated
  Proposal
    Justification for development and creation of the project plan
  Definition and Planning
    Preliminary datasheet creation defining the core
    Test plan is needed
  Development
    The core is implemented and deliverables are created
  Verification and Validation
    Testing against plan and specification (e.g. MIL-STD-1553, PCI)
  Release
    Release of product for volume sales
Configuration Management, Feedback and Revision
[Figure label: Phase Gate]


MIL-STD-1553 Example

Actel has developed three products
  A full-featured BC, RT, MT
  A ‘simple’ bus controller
  A ‘simple’ remote terminal
Highlight: the simple remote terminal, Core1553BRT
  Originally released in 2002 (first production August 2002)
    12 and 16 MHz versions
  Updated for minor changes in 12/2002
    Loop-back test, version text in code, etc.
  Updated for Verilog translation issue in April 2004
  Updated in 9/2004 and 11/2004 to work with design tool updates
  Revised to include 20 and 24 MHz versions in January 2005
    Manchester encoders/decoders tested as part of full-featured BC, RT, MT
    ProASIC3/E FPGA family support added


MIL-STD-1553 RT Development

Proposal
  Substantial customer demand for MIL-STD-1553 bus interface
  Review of specification and competitive products suggested we could improve market offerings with rad-tolerant 1553 FPGA
Definition
  MIL-STD-1553 specification
  Preliminary datasheet highlighting the features in the proposal
Development
  Developed remote terminal
  Paid careful attention to Manchester encoder/decoder blocks that would be re-used across product family
  Built two testbenches
    Verification: runs full set of tests and mimics validation
    User: runs fewer tests for incorporation into larger system design


RT Development, p.2

Verification and Validation
  Stable, tested code with reviewed test results
  Check corner cases and key parameters
    Make sure parity errors are injected on every bit
    12 and 16 MHz; 12 is the harder case due to clock extraction
  Tested against existing MIL-STD-1553 COTS tester and Certified Development Kit at Test Systems, Inc.
    Completely for 16 MHz and partially for 12 MHz
  Validated Core1553 Evaluation Board
    This is important to use with the verification test bench for future updates
Release gives first-rate integration
  Core builds complete, board release, release note, user guide, data sheet, certification papers
Solution improves integration
  Developed application note, reference design and example designs since 2002
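The "parity errors injected on every bit" check can be sketched as a small behavioral model. This is an illustrative Python model, not the Core1553BRT testbench: all function names are hypothetical, and the only protocol fact assumed is that a MIL-STD-1553 word carries 16 data bits plus one odd-parity bit.

```python
def odd_parity_bit(data16):
    """Return the parity bit that makes the total number of ones odd."""
    ones = bin(data16 & 0xFFFF).count("1")
    return 0 if ones % 2 == 1 else 1


def make_word(data16):
    """Pack 16 data bits and the odd-parity bit into a 17-bit value."""
    return ((data16 & 0xFFFF) << 1) | odd_parity_bit(data16)


def parity_ok(word17):
    """A received word is valid when its 17 bits hold an odd number of ones."""
    return bin(word17 & 0x1FFFF).count("1") % 2 == 1


def inject_on_every_bit(data16):
    """Flip each of the 17 bits in turn; True means the flip was detected."""
    good = make_word(data16)
    return [not parity_ok(good ^ (1 << bit)) for bit in range(17)]


results = inject_on_every_bit(0x1234)
print(all(results), len(results))
```

A single-bit flip always changes the count of ones by one, so every injected error should be flagged; a testbench that injects on only a few bit positions would not exercise this for the whole word.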


Updates for Speed and Space

Added 20 and 24 MHz in early 2005 (v2.2)
  Manchesters validated in full-featured BC, RT, MT core
  Moved CLKSPD generic to 2-bit input port
    Allows single netlist to support four frequencies
  Modified top-level and backend timers
  Updated test benches for 20 and 24 MHz and port maps
  Fixed erroneous SYNCOUT pulses
    Occur with some non-Actel transmitters on the bus
Updating for space in late 2005 (v3.0)
  Protect the core from entering illegal states
  Hardware test for a babbling transmitter
  Re-qualify the core at Test Systems, Inc.


Severe Environment Considerations

Level 3 verification minimum; level 4 validation
  MIL-STD-1553 cores have 3rd-party review at Test Systems, Inc.
  Requires a validation report review: actions and responses
  Have a certification envelope: test VHDL and Verilog versions at different speeds
Have exceptional documentation and support
  Tool flow documented with versions for exact design replication
  Minimize possibility of integration engineer problems
High coverage standards and well-explained variances
  Code coverage target of 100% for RTL
  Consider using error detection and correction for memory
Protect the core from entering illegal states and memory upsets
  The Synplicity default could lock up if upset by an SEU
  Adds redundancy and reduces risk
  Use EDAC for memory
Avoid the possibility of a babbling transmitter
  Can occur if a redundant system fails
Continuously investigate other means to improve quality
  Over-sampling
  The need for incorporating DO-254


MIL-STD-1553B Tool Issues

Limit tools and document for validated cores
  Version 3.0 core will be qualified in hardware with
    Synplicity 8.1 used for synthesis
    Designer 6.2 used for layout
    ModelSim 6.0c Actel OEM used for simulation
So what happens if a customer uses Exemplar, or even Synplicity 7.71?
  The qualification is not repeatable…
    The customer still needs to qualify their system
IP vendors should document what tool versions are used for qualified IP cores to be used in severe environments, for
  Repeatability
  Re-use
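One way to act on documented tool versions is to compare the flow a customer actually ran against the qualified one before trusting the qualification. The sketch below is only an illustration: the dictionary layout, step names and `check_flow` function are assumptions, and only the tool names and versions come from the slide.

```python
# Qualified flow for the version 3.0 core, as listed on the slide.
QUALIFIED = {
    "synthesis": ("Synplicity", "8.1"),
    "layout": ("Designer", "6.2"),
    "simulation": ("ModelSim Actel OEM", "6.0c"),
}


def check_flow(actual):
    """Return the flow steps whose tool or version differs from the qualified flow."""
    return sorted(
        step for step, tool in QUALIFIED.items() if actual.get(step) != tool
    )


# Hypothetical customer flow that uses an older synthesis release.
customer = dict(QUALIFIED, synthesis=("Synplicity", "7.71"))
print(check_flow(customer))
```

Any step reported here marks a point where the vendor's qualification no longer applies directly and the customer's own system qualification has to cover the difference.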


Code Coverage

A way to prove that the test benches actually test all the designed-in functions
Allows verifying that all lines of code are covered
Today’s tools allow
  Statement coverage
  Branch coverage
  Condition coverage
  Expression coverage
  Toggle coverage
BUT
  Does not guarantee that the design actually implements the specification
  A function may be missing from both the core and the testbench


Core1553BRT Code Coverage

Modular core design allows us to create tests to exercise a particular portion of code
Verification testbench reaches >99%
  Non-covered lines are inspected and verified, typically conversion functions or branches in code that are coded purely for safety


Coverage is Actually 100%

Branch coverage does not show 100%, but it is.

The reason is that we use safe coding that checks conditions before acting. These conditions are always true, but the code is better and safer with these statements. Other uncovered branches look like this:

    when INIT =>
      case MUXSEL is
        when "000"  => DSTATE <= WRITE0;   -- RX Mode Code
        when "010"  => DSTATE <= TXSTAT;   -- TX Mode Code
        when "001"  => DSTATE <= WRITE0;   -- RX Data Transfer
        when "011"  => DSTATE <= TXSTAT;   -- TX Data Transfer
        when "100"  => DSTATE <= WRITE0;   -- Bcast RX Mode Code
        when "110"  => DSTATE <= MSGSTAT;  -- Bcast TX Mode Code
                       LATCHSW <= '1';
        when "101"  => DSTATE <= WRITE0;   -- Bcast RX Data Transfer
        when "111"  => DSTATE <= MSGSTAT;  -- Bcast TX Data Transfer
                       LATCHSW <= '1';
        when others =>                     -- never taken for "0"/"1" values
      end case;

The others branch is never taken, because all valid values 0-7 are listed above, but the VHDL language requires the case statement to cover every possible value of a std_logic vector, including "ZZZ". The case could be rewritten as follows, which would give 100% coverage but whose meaning is less obvious:

    when INIT =>
      case MUXSEL is
        when "000"  => DSTATE <= WRITE0;   -- RX Mode Code
        when "010"  => DSTATE <= TXSTAT;   -- TX Mode Code
        when "001"  => DSTATE <= WRITE0;   -- RX Data Transfer
        when "011"  => DSTATE <= TXSTAT;   -- TX Data Transfer
        when "100"  => DSTATE <= WRITE0;   -- Bcast RX Mode Code
        when "110"  => DSTATE <= MSGSTAT;  -- Bcast TX Mode Code
                       LATCHSW <= '1';
        when "101"  => DSTATE <= WRITE0;   -- Bcast RX Data Transfer
        when others => DSTATE <= MSGSTAT;  -- Bcast TX Data Transfer
                       LATCHSW <= '1';
      end case;

There is a trade-off here between coverage and readability
  In the first example it is clear what the "111" condition does; not so in the second
  They synthesize to the same circuit
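The claim that both forms behave identically for real hardware values can be checked by enumerating all eight binary codes. The following is a minimal Python model of the two case statements, not the core's code: each branch is reduced to a pair (next DSTATE, whether LATCHSW is set in that branch), and the function names are hypothetical.

```python
# Branches shared by both versions: MUXSEL code -> (next DSTATE, LATCHSW set?)
COMMON = {
    "000": ("WRITE0", False), "010": ("TXSTAT", False),
    "001": ("WRITE0", False), "011": ("TXSTAT", False),
    "100": ("WRITE0", False), "110": ("MSGSTAT", True),
    "101": ("WRITE0", False),
}


def explicit_case(muxsel):
    """First version: "111" is listed; the empty 'others' branch does nothing."""
    table = dict(COMMON)
    table["111"] = ("MSGSTAT", True)
    return table.get(muxsel)  # None models the empty 'others' branch


def folded_case(muxsel):
    """Second version: 'others' carries the "111" behavior."""
    return COMMON.get(muxsel, ("MSGSTAT", True))


codes = [format(i, "03b") for i in range(8)]
print(all(explicit_case(c) == folded_case(c) for c in codes))
print(explicit_case("ZZZ"), folded_case("ZZZ"))
```

The two versions agree on every binary code, which is consistent with them synthesizing to the same circuit. They differ only for simulation metavalues such as "ZZZ", where the folded version would silently take the broadcast-transmit branch.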


Coverage: From 99% to 100%

Getting the last 1% of coverage is time consuming
  Especially in designs that include lots of error detection and recovery logic
  Often, in attempting to do this, you will by accident force the design into an unexpected state that highlights an issue
Core1553BRT
  In going from 99% to 100% we discovered that, when transmitting and verifying the looped-back data, if the last word of a burst (Data or Status) contained all zeros and a Manchester error was introduced by the transceiver, then we did not detect the error
    We did detect Manchester errors on their own
    We did detect data errors on their own
  Additional tests are now added to the test benches to verify this in all future releases


Safe State Machines

Although space FPGAs incorporate redundancy through triple flip-flops and voting, RTL code also needs to be safe
Commercial FPGA synthesis tools can generate ‘unsafe’ state machines
  Optimized for small area or speed
  One-hot state machines by default
  Some have the option of safe state machines
    Make sure all illegal states are covered
BUT HOW DO YOU PROVE IT IS SAFE?
  For example, beware of hidden illegal conditions in the code, like counters that count to a value and reset
    What happens if the count toggles to a value > the reset condition?
In reality: design redundancy in and test it
  Fix the state encoding
  Synthesis tool independent
  Make test benches to force illegal states
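The counter question above can be made concrete with a small simulation. This is an illustrative Python sketch under assumed numbers (a 4-bit counter that resets at 9, upset to 12); `run_counter` and its parameters are hypothetical and not taken from the core.

```python
def run_counter(start, terminal, width, cycles, safe):
    """Simulate a count-to-terminal-and-reset counter; return the values visited.

    safe=False resets only on equality with the terminal count.
    safe=True resets on any value at or beyond the terminal count.
    """
    mask = (1 << width) - 1
    count = start & mask
    seen = []
    for _ in range(cycles):
        seen.append(count)
        at_end = (count >= terminal) if safe else (count == terminal)
        count = 0 if at_end else (count + 1) & mask
    return seen


# An upset toggles the count to 12, beyond the reset value of 9.
unsafe = run_counter(start=12, terminal=9, width=4, cycles=6, safe=False)
safe = run_counter(start=12, terminal=9, width=4, cycles=6, safe=True)
print(unsafe)
print(safe)
```

With the equality compare the counter walks through 12, 13, 14, 15 before wrapping to 0, whereas the greater-or-equal compare returns to 0 on the next cycle. For a wide counter the equality version can spend a long time in values the design never expected.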


Safe State Machines: Design

Design
  Hard-code states using bit_vectors
    Make sure all 2**N values are specified
  In the case statement
    Do not use an others clause; list all states
    Simulator will warn if you’ve forgotten any states
    Using bit_vector means that you need not worry about the ‘X’ and ‘Z’ branches in the case
  In illegal states
    Clear critical signals
      e.g. Transmit enable
    Send FSM back to IDLE state
    Create an FSM_ERROR output
      One for each state machine
Synthesis
  Make sure state registers are not duplicated; if they are, you may not detect the illegal state
  Make sure any FSM optimization in the synthesis tool is disabled

    -- RT Data word transfers signals
    -- Hard encoded for safe state machines
    signal DSTATE : bit_vector(3 downto 0);
    constant IDLE    : bit_vector(3 downto 0) := "0000";
    …..
    constant ALLDONE : bit_vector(3 downto 0) := "1100";
    constant UNUSED0 : bit_vector(3 downto 0) := "1101";
    constant UNUSED1 : bit_vector(3 downto 0) := "1110";
    constant UNUSED2 : bit_vector(3 downto 0) := "1111";

    attribute syn_preserve  of DSTATE : signal is true;
    attribute syn_encoding  of DSTATE : signal is "original";
    attribute syn_replicate of DSTATE : signal is false;

    case DSTATE is
      ….
      when UNUSED0 | UNUSED1 | UNUSED2 =>
        FSMD_ERROR <= '1';
        DSTATE     <= IDLE;
        -- clear critical controls
        BENDREQ    <= '0';
        ENC_STB    <= '0';
        DBUSY      <= '0';
        CMDDONE    <= '0';
    end case;


Safe State Machines: Testing

How do you prove that the resultant netlist includes the safe state machine?
  Identify the STATE registers in the netlist
  Using the simulator, force the state register to all states
    Reset core after each test to prevent side effects of forcing states
    Verify that the FSM_ERROR output is asserted

    printf("Testing Main State Machine - 16 states, 13-15 Illegal");
    for state in 0 to 15 loop
      resetcore(RSTNOW,CLK16);
      printf(" Testing State %d : Restart by typing : do forcefsm.do 0 %04b",fmt(state)&fmt(state));
      assert FALSE report "Ignore ERROR, restart simulation ^^^^^^" severity ERROR ;
      -- before restarting state machine is forced to the illegal state
      wait for 1 us; -- allow time for tcl script to force error
      check_state(state, (state>=13), status, ERR);
    end loop;
    resetcore(RSTNOW,CLK16);

    ------------------------------------------------------------
    force -deposit sim:/tbench/u12__0/uut1/DSTATE_3/Q $state_bit3 0
    force -deposit sim:/tbench/u12__0/uut1/DSTATE_2/Q $state_bit2 0
    force -deposit sim:/tbench/u12__0/uut1/DSTATE_1/Q $state_bit1 0
    force -deposit sim:/tbench/u12__0/uut1/DSTATE_0/Q $state_bit0 0
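The structure of this test (force each of the 16 codes, then expect the error flag only for codes 13 to 15) can be mirrored in a short behavioral model. The Python below is a sketch under stated assumptions: the two "netlist" functions are hypothetical stand-ins, the legal-state transitions are not modeled because the slide elides them, and only the encoding of the three unused states comes from the deck.

```python
IDLE = 0b0000
UNUSED = {0b1101, 0b1110, 0b1111}  # UNUSED0..UNUSED2 in the slide's encoding


def safe_netlist(forced_state):
    """Model of a netlist that kept the recovery logic: (next_state, fsm_error)."""
    if forced_state in UNUSED:
        return IDLE, 1          # recover to IDLE and flag the error
    return forced_state, 0      # placeholder: legal transitions are not modeled


def optimized_netlist(forced_state):
    """Model of a netlist where synthesis removed the illegal-state branch."""
    return forced_state, 0      # never flags an error, may stay stuck


def force_all_states(netlist):
    """Force every 4-bit code and return the codes whose error flag is wrong."""
    failures = []
    for state in range(16):
        _, fsm_error = netlist(state)
        expected = 1 if state >= 13 else 0
        if fsm_error != expected:
            failures.append(state)
    return failures


print(force_all_states(safe_netlist))
print(force_all_states(optimized_netlist))
```

The point of running the test on the netlist rather than the RTL is the second case: if synthesis optimizes the recovery branch away, the forced illegal states are the only stimulus that exposes it.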


Safe State Machines Results and Memory Protection

Has an effect on gate count and performance compared to normal implementation flows
  Causes a 7% increase in gate count
  Causes a 1% drop in performance
  But still fits in device and meets performance requirements
Memory usage
  Make sure that EDAC memory is used
    Consider scrub rates, etc.
  Avoid memory where possible, because it is more easily upset by radiation


What is a ‘Babbling’ Transmitter?

Requirements
  All RTs are required to monitor their outputs to detect if they are babbling and, if so, stop; this is referred to as a Fail-Safe Timer
  If babbling is detected by the bus controller, it sends a message to the terminal using the other bus to stop the babbling transmitter
How can an RT babble?
  Two errors (failures) have to occur within the terminal:
    1. The logic that controls the enable signal to the transmitter has to fail, and
    2. The terminal's fail-safe timer (maximum of 800.0 microseconds) has to have failed.
  Some designs use a digital counter for the fail-safe timer; a single failure in a clock line could cause a babbling transmitter


Avoid Babbling Transmitter Design

Transmit Timeout
  MIL-STD-1553 requires that a separate circuit monitors the transmissions and stops the transmitter if a babbling transmission is detected, i.e. greater than 33 words transmitted
  Even though the protocol state machines may never theoretically cause this, it is a requirement to include this logic
  Separate circuit that monitors the transmit enables and detects if active for greater than 680 us
    If it triggers, the enable to the external transceiver is disabled and an error condition is generated

    process(CLKSPD)
    begin
      case CLKSPD is
        when "00"   => HWTIMVALUE <= "0100001";  -- 12MHz
        when "01"   => HWTIMVALUE <= "0101011";  -- 16MHz
        when "10"   => HWTIMVALUE <= "0110110";  -- 20MHz
        when others => HWTIMVALUE <= "1000001";  -- 24MHz
      end case;
    end process;

    PTXTTIM: process(CLK,RSTn)
      variable TXT_TIMER : std_logic_vector(14 downto 0);
    begin
      if RSTn='0' then
        TXT_TIMER := ( others => '0');
        TXT_ERROR <= '0';
      elsif CLK'event and CLK='1' then
        TXT_ERROR <= '0';
        if TXT_TXBUSY='1' then
          TXT_TIMER := TXT_TIMER + 1;
        else
          TXT_TIMER := ( others => '0');
        end if;
        if TXT_TIMER(14 downto 8) = HWTIMVALUE then
          TXT_ERROR <= '1';
        end if;
      end if;
    end process;
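The four HWTIMVALUE constants can be checked against the stated limits with a little arithmetic. Because the compare uses only TXT_TIMER(14 downto 8), the timer trips when the count reaches HWTIMVALUE times 256 clock cycles. The Python below is a rough check under that reading; it ignores the one-cycle register delay on TXT_ERROR, and the function name is hypothetical.

```python
# Constants copied from the CLKSPD case statement, keyed by clock in MHz.
HWTIMVALUE = {12: "0100001", 16: "0101011", 20: "0110110", 24: "1000001"}


def timeout_us(clock_mhz):
    """Approximate time in microseconds before TXT_ERROR is raised."""
    # The low 8 timer bits are not compared, so scale the constant by 256.
    cycles = int(HWTIMVALUE[clock_mhz], 2) << 8
    return cycles / clock_mhz


for mhz in sorted(HWTIMVALUE):
    print(mhz, "MHz:", round(timeout_us(mhz), 1), "us")
```

This gives about 704.0, 688.0, 691.2 and 693.3 us for 12, 16, 20 and 24 MHz, so each setting sits above the 680 us detection figure and below the 800 us fail-safe maximum quoted earlier.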


Babbling Transmitter Testing

How do you test this?
  Protocol state machines do not do this in normal operation
Create test mode input: TESTTXTTOUT
  Modifies the protocol state machine
  When high, causes >32 data words to be transmitted
  Test benches set this and verify that the core detects the babbling transmitter
  Allows testing, but does this create an additional failure mechanism?
    May be pulled inactive by an external resistor; if this were to fail, then the core would fail
External input can be disabled
  Can remove logic from core to prevent this error condition
  Synthesis will remove the error injection logic


Another Consideration: Over-Sampling

Some systems can be improved by over-sampling input streams
  Then filtering or voting
1553B already has a well-protected data stream
  Manchester coding
    “00” and “11” patterns are error conditions
  Parity on data words
Core1553BRT
  Samples incoming data at 6X, 8X, 10X or 12X the base 2 MHz rate
  Required for clock extraction and ability to handle 1553B jitter and noise requirements
Additional over-sampling is not implemented at present because
  As is, Core1553BRT passes all requirements of the 1553B RT test
  Would require higher speed clocks
    Higher power consumption
    Larger device
  Would require a major redesign
    Adds additional risk with a major redesign
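The sampling ratios and the Manchester error rule can be illustrated with a few lines of Python. This is a simplified sketch, not the core's decoder: it works on already-recovered half-bit levels, assumes the common convention that a logic 1 is sent as high then low ("10") and a logic 0 as low then high ("01"), and uses hypothetical function names.

```python
def samples_per_half_bit(clock_mhz, half_bit_rate_mhz=2):
    """Number of clock samples in each 2 MHz Manchester half-bit."""
    return clock_mhz // half_bit_rate_mhz


def decode_manchester(half_bits):
    """Decode a string of half-bit levels into (bits, error_positions).

    "10" decodes to 1 and "01" to 0 (assumed polarity); "00" and "11"
    have no mid-bit transition and are reported as errors.
    """
    bits, errors = [], []
    for i in range(0, len(half_bits) - 1, 2):
        pair = half_bits[i:i + 2]
        if pair == "10":
            bits.append(1)
        elif pair == "01":
            bits.append(0)
        else:
            errors.append(i // 2)
    return bits, errors


print([samples_per_half_bit(f) for f in (12, 16, 20, 24)])
print(decode_manchester("10011001"))
print(decode_manchester("1011"))
```

The four supported clocks give the 6X, 8X, 10X and 12X ratios listed above. The decode example shows why the stream is already well protected: any bit cell without a mid-bit transition is rejected before parity is even checked.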


RTCA/DO-254: Design Assurance Guidance for Electronic HW

Advisory Circular 20-152
  Ratified 6/30/05, calls for DO-254 compliance for design assurance levels A, B or C
  DO-254 standard originally developed in 2000
DO-254 is a hardware standard; IP is hardware
  There are many misunderstandings about this standard
  So far, there is no precedent for DO-254 certified IP
  We are focusing on section 10 and considering providing Hardware Design Life Cycle Data for relevant cores
What does it require?
  A DO-254 development flow in addition to the ISO-certified flow
  More documentation
  It forces discipline to follow a test plan and document against that plan
    PHAC and HAS are important elements
  Without this, customers treat our IP as COTS products (section 11)


Lessons Learned

High quality = attention to detail
  You cannot do too much verification for IP in severe environments
  We found a bug increasing code coverage from 98% to 100%
  Have gate reviews backed with data
Document variations from perfect
  For example, if code coverage is 99%, understand why
Experience matters
  Design
  Products
  Customers
There needs to be a way to add objectivity to verification
  Against a tester
  By a third party
  Have another person review the code or perform verification
You can always improve
  Core originally tested at multiple speeds, but not multiple languages
  DO-254 adds additional discipline to the development process


Conclusion

Pre-built and verified IP can reduce risk, if
  A structured, robust development process is followed
    Phase-gate process, even if simplified
  Additional concerns for severe environments are considered
    Safe state machines
    Redundant check for babbling
  Verification and validation is demonstrated
    Code coverage near 100%
    Certification of demonstration board design
  Deliverables and documentation ease use
    Helps integration and design re-use
  Many customers prove the core in a variety of environments
    More than one company can do on its own


Conclusion: Block-based Design Enables Development

[Figure: Spacecraft I/O Board Example. Block diagram labels: ASM51 MCU (8051), Sensor Module, Remote Monitor, Serial Channel, Prog. I/O, Synchronous Serial Channel (SDLC), Asynchronous Serial Channel (UART), Memory Data Bus, Shared Memory (on or off-chip), PCI, 1553 RT, Avionics Control Port, Data Transfer Port, Special Function Register Bus. External connections: PCI bus to instrument panel; 1553 bus to rest of craft.]