

Document No : D-PL-REP-4966-SE    Date : 31 May 2002    Issue : 1    Page : 1/17

PROJECT **ESA\_QCA0201S\_C** 

# TITLE Radiation Pre-Evaluation of Mitigation Techniques for Xilinx Virtex FPGA using Internal Voters

**EUROPEAN SPACE AGENCY CONTRACT REPORT**

The work described in this report was done under ESA contract. Responsibility for the contents resides in the author or organisation that prepared it.

|               | Name                                   | Function          | Date | Signature |
|---------------|----------------------------------------|-------------------|------|-----------|
| Prepared :    | Fredrik Sturesson, Saab Ericsson Space |                   |      |           |
| Approved :    | Stanley Mattsson, Saab Ericsson Space  |                   |      |           |
|               | Reno Harboe Sorensen, ESA/ESTEC        | Technical Officer |      |           |
| Distribution  | Complete : <br>Summary :               |                   |      |           |

| Reg. Office | Linköping Office |
|-------------|------------------|
| Saab Ericsson Space AB<br>S-405 15 Göteborg<br>Sweden<br>Reg. No: 556134-2204 | Saab Ericsson Space AB<br>S-581 88 Linköping<br>Sweden |
| Telephone: +46 31 735 00 00<br>Telefax: +46 31 735 40 00 | Telephone: +46 13 18 64 00<br>Telefax: +46 13 13 16 28 |



# **SUMMARY**

This report presents the results from Cf-252 tests of a new test design implemented in the Xilinx Virtex FPGA XQVR300. The new design, named TMR-Feedback, implements triple-voted registers with voting circuits after each register cell and in the output stage of the FPGA. The techniques used to mitigate Single Event Upsets (SEUs) in Virtex devices, Triple Module Redundancy (TMR) and configuration readback (bitstream repair), have been developed by Xilinx.

The results give error rates on the order of  $5 \cdot 10^{-3}$  cm<sup>2</sup>/device. This is far above what we would expect. The mitigation of the design has clearly not been successful. The mitigation was implemented by Xilinx, and the design has been sent back to Xilinx in an attempt to find out why it does not work as expected.

Further testing with heavy ions is not recommended until a corrected design can be implemented.

# DOCUMENT CHANGE RECORD

Changes between issues are marked with a left-bar.

| Issue | Date        | Paragraphs affected | Change information |
|-------|-------------|---------------------|--------------------|
| 1     | 31 May 2002 | All                 | New document       |

**TABLE OF CONTENTS**

| Section | Title                             |
|---------|-----------------------------------|
| 1.      | INTRODUCTION                      |
| 2.      | Virtex XQVR300 DETAILS            |
| 3.      | TEST SAMPLES                      |
| 3.1     | DUT Design                        |
| 3.2     | TMR-Feedback Design               |
| 4.      | TEST EQUIPMENT                    |
| 4.1     | General                           |
| 4.2     | Test Boards                       |
| 5.      | SEU TEST TECHNIQUES               |
| 5.1     | SEU Error Separation              |
| 5.1.1   | Register Error Types              |
| 5.1.2   | Configuration induced Error types |
| 5.1.3   | SEU in Device Control Registers   |
| 5.2     | Other Test Considerations         |
| 6.      | SEE TEST RESULTS                  |
| 7.      | CONCLUSION                        |
| 8.      | REFERENCES                        |


# 1. INTRODUCTION

This report presents the results from Cf-252 tests of a new test design implemented in the Xilinx *Virtex* FPGA XQVR300. The new design, named *TMR-Feedback*, implements triple-voted registers with voting circuits after each register cell and in the output stage of the FPGA. The techniques used to mitigate Single Event Upsets (SEUs) in *Virtex* devices, Triple Module Redundancy (TMR) and configuration readback (bitstream repair), have been developed by Xilinx.

Saab has previously SEU tested one non-mitigated design and one design with voting circuits in the output stage of the FPGA [1]. The results for the mitigated design were promising. The aim of this new test design is to test the effectiveness of mitigation techniques using internal voters. In an application using, for example, counters or state machines, voting and correction of these elements may be necessary to ensure that no build-up of SEUs harms the redundancy of the design.


# 2. Virtex XQVR300 DETAILS

The Virtex FPGA is an SRAM-based device fabricated on thin-epitaxial silicon wafers using the commercial mask set and the Xilinx 0.25 µm CMOS process with 5 metal layers. SEU risks dominate in the use of this technology for most applications. In particular, the reprogrammable nature of the device presents a new sensitivity due to the configuration bitstream. The function of the device is determined when the bitstream is downloaded to the device, and changing the bitstream changes the design's function. While this provides the benefits of adaptability, it is also an upset risk: a device configuration upset may result in a functional upset. User logic can also upset in the same fashion as seen in fixed logic devices. These two upset domains are referred to as configuration upsets and user-logic upsets.

Two features of the *Virtex* architecture can help overcome upset problems. The first is that the configuration bitstream can be read back from the part while in operation, allowing continuous monitoring for upsets and their subsequent correction.

Secondly, the high density and rich architecture allow resource redundancy to be economically implemented in order to filter out SEU effects.


# 3. TEST SAMPLES

All tests were performed on prototype devices delivered by Xilinx in a 240-pin plastic flat package. The samples used for heavy ion testing were delidded by etching the plastic from the top side down to the chip. The chip markings are shown below.

|                  | Chip Markings                                                        | Device Marking            |
|------------------|----------------------------------------------------------------------|---------------------------|
| SN #32<br>SN #34 | XILINX<br>C-Logo 1998 M-Logo<br>XNK06A<br>(All markings not visible) | Top and bottom side blank |

The following process description is taken from a total dose report presented by Xilinx at MAPLD 2000 [2]:

| Parameter          | Value                                                                 |
|--------------------|-----------------------------------------------------------------------|
| Material:          | <1-0-0> 20 Ohm-cm p-type epitaxial layer on highly doped substrate    |
| Gate Oxide:        | SiO<sub>2</sub>, nominal 45/65 Å                                      |
| Gate Width:        | 0.25/0.35 µm defined                                                  |
| Isolation:         | Shallow Trench, 7,500 Å nominal                                       |
| Foundry:           | UMC Group                                                             |
| Operating voltage: | 2.5 V                                                                 |



*Figure 3.1 Overview of Virtex XQVR300 chip. Chip size is 11.2 x 11.2 mm.*


### 3.1 DUT Design

One new design, named *TMR-Feedback*, has been developed and tested. It is a more complex version of the *TMR* design used in earlier heavy ion tests performed by Saab Ericsson Space [1]. The *TMR* design was implemented with 14 pipelined shift registers, each 144 bits long, and a self-test circuit. All 14 shift register modules were tripled and majority voted in the output stage of the FPGA (Fig. 3.1). In the new *TMR-Feedback* design, triple voting circuits have also been implemented after each register cell inside the shift registers.



Figure 3.1 Principal Drawing of the TMR DUT Design.

### 3.2 TMR-Feedback Design

The *TMR-Feedback* design implements 17 pipelined shift registers, each 68 bits long, and a small *SelfTest* circuit in the Device Under Test (DUT).

The *feedback* part of the name comes from the fact that after each voting stage the signal is fed back into the registers. Individual register bits in the shift registers are built from D-type *CLB* flip-flop modules [3] with feedback signals. The *LoadEn* signal (see Fig. 3.2) decides whether data are shifted through the shift registers or the registers are updated with the voted feedback signal. To achieve SEU hardness, each register bit is tripled with tripled voting circuits (Fig. 3.3). Five of the shift registers use *BUFT* resources as voting circuits and 12 shift registers use *LUT* resources as voting circuits. The voting circuits are implemented following the instructions in Xilinx's application note XAPP197 [4].



*Figure 3.2 A triple redundant register (i) with feedback signal. With SEU-free configuration data, this would be a fully redundant solution.*



Figure 3.3 The triple redundant register (i) with feedback signal that has been used in the TMR-feedback design. The SEU sensitivity of the configuration data forces us to triple all signals, the voter and the Mux. The voting circuit has been implemented with either BUFT or LUT elements in the Virtex architecture.
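The behaviour of one tripled register bit with voted feedback can be illustrated with a small software model. This is only a sketch and not the Xilinx implementation: the class name `TmrFeedbackBit`, the `upset` helper and the polarity of `load_en` (1 shifts new data in, 0 reloads the voted value) are assumptions made for the example.

```python
def majority(a, b, c):
    """2-of-3 majority vote on single bits."""
    return (a & b) | (a & c) | (b & c)


class TmrFeedbackBit:
    """One register bit held in three redundant flip-flops, each with its own voter."""

    def __init__(self, value=0):
        self.ff = [value, value, value]

    def voted(self):
        # Three voters, one per redundant domain; each sees all three flip-flops.
        return [majority(*self.ff) for _ in range(3)]

    def clock(self, shift_in, load_en):
        # Assumed polarity: load_en = 1 shifts new data in,
        # load_en = 0 reloads each flip-flop with its voted feedback value.
        votes = self.voted()
        self.ff = [shift_in[i] if load_en else votes[i] for i in range(3)]

    def upset(self, domain):
        # Model an SEU by flipping one of the three flip-flops.
        self.ff[domain] ^= 1


bit = TmrFeedbackBit(1)
bit.upset(0)                     # SEU in one redundant flip-flop
bit.clock([0, 0, 0], load_en=0)  # hold cycle: voted feedback repairs the bit
print(bit.ff)                    # [1, 1, 1]
```

The model assumes error-free voters and multiplexers. In the real device these are themselves defined by configuration bits that can be upset, which is why the design triples them as well.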


The principle of the self-test circuit, shown in Fig. 3.4, is that data are compared with themselves and any mismatch is reported on an output (Error flag). The data are a 6-bit word taking two different paths in the design before comparison. One path goes through 6 I/O modules of the device and then back to the comparator. The other path goes directly to the comparator. The data are generated by a feedback flip-flop register from an external clock signal. This gives toggling data with half the frequency of the clock signal. The *SelfTest* module is identical to the one in the *TMR* design in the earlier test [1].

Altogether, 58 % of the available *CLB* flip-flop resources in the Virtex device have been used. Over 95 % of the available *CLB LUT* and *BUFT* resources have been used.

The *TMR-Feedback* design uses the Triple Module Redundancy design techniques described by Xilinx in application note XAPP197 [4]. Compared to the *TMR* design, this new design also includes internal voting circuits. The aim of this design is to verify whether the internal voting circuits are SEU hard.



Figure 3.4 Principal Drawing of SelfTest Module
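The self-test principle can be sketched as a short simulation. The function name `self_test` and the `io_fault_mask` parameter are hypothetical, and the model assumes that all six bits of the word carry the same toggling value, which the report does not state.

```python
WORD_MASK = 0b111111  # 6-bit data word


def self_test(cycles, io_fault_mask=0):
    """Return the Error flag for each clock cycle.

    io_fault_mask is a hypothetical way to model a disturbed I/O path:
    every bit set in the mask is inverted on the path through the I/O modules.
    """
    toggle = 0
    flags = []
    for _ in range(cycles):
        toggle ^= 1                          # feedback flip-flop: half the clock frequency
        word = WORD_MASK if toggle else 0    # assumption: all six bits toggle together
        via_io = word ^ io_fault_mask        # path through the six I/O modules
        direct = word                        # path straight to the comparator
        flags.append(int(via_io != direct))  # mismatch raises the Error flag
    return flags


print(self_test(4))                          # [0, 0, 0, 0]
print(self_test(3, io_fault_mask=0b000100))  # [1, 1, 1]
```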


# 4. TEST EQUIPMENT

At Saab Ericsson Space, special test equipment has been developed for SEU and total dose testing of FPGAs. In total dose tests, the DUTs are individually biased, and supply current measurements and functional tests can be performed continuously.

### 4.1 General

The general concept is to load data into the DUT, pause for a pre-set time and thereafter read the data and check for errors. New data are loaded into the DUT at the same time as the old data are read out. All this is repeated continuously during irradiation. With a long pause time the DUT is tested in static condition, and by setting the pause time to zero the DUT is tested in dynamic condition.

A flow chart of the test sequence is given in Fig. 4.1. Any detected errors are stored in FIFOs, and the DUT is loaded with new data again. The cycle is then repeated. Failing read or write operations on the DUT are used to determine whether it is still functional. The clock speed is variable up to 5 MHz. Error data are serially transferred from the FIFOs to a PC where the data are analysed. For each DUT, errors can be traced down to logic module, logic value and position.



Figure 4.1 Flow chart of the test sequence.
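The load, pause, read and compare cycle can be written down as a short loop. This is a software sketch of the sequence only, not the controller board logic. The `dut_shift` interface, the `FakeDut` class and the error record layout are assumptions introduced for the example.

```python
import time
from collections import deque


def run_test_cycles(dut_shift, patterns, pause_s=0.0, fifo=None):
    """Run load/pause/read cycles and collect mismatches.

    dut_shift(new_word) is a hypothetical interface: it loads new_word into
    the DUT and returns the word that was held before.
    """
    fifo = deque() if fifo is None else fifo
    expected = None
    for cycle, pattern in enumerate(patterns):
        read_back = dut_shift(pattern)  # new data in while old data is read out
        if expected is not None and read_back != expected:
            fifo.append({"cycle": cycle, "expected": expected, "read": read_back})
        expected = pattern
        time.sleep(pause_s)             # pause_s = 0 gives the dynamic test condition
    return fifo


class FakeDut:
    """Stand-in for a DUT that flips bit 0 of its content on one chosen cycle."""

    def __init__(self, flip_on_cycle=None):
        self.held = 0
        self.count = 0
        self.flip_on_cycle = flip_on_cycle

    def shift(self, new_word):
        old = self.held
        if self.count == self.flip_on_cycle:
            old ^= 0b1                  # modelled upset
        self.held = new_word
        self.count += 1
        return old


dut = FakeDut(flip_on_cycle=2)
errors = run_test_cycles(dut.shift, [0b1010, 0b0101, 0b1111, 0b0000])
print(len(errors))  # 1
```

The first read-out is not compared because nothing has been loaded before it.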


### 4.2 Test Boards

The test system consists of two boards: one Controller board, managing the test sequence and the serial interface to the PC, and one DUT board housing two Devices Under Test (DUT). A principal drawing is given in Fig. 4.2.

The Controller board tests one DUT at a time using a "virtual golden chip" test method. The principle of the measuring technique is to compare each output from the DUT with the correct data stored in SRAMs. The general concept of the error detection and test sequence is shown in Fig. 4.1. The DUT is continually cycled while the outputs of selected ring counters are compared with the "golden chip". When an error is detected (when outputs do not match), the state of all outputs and the position in the cycle of the failing ring counter are temporarily stored in FIFOs. Data in the FIFOs are continually sent to a PC through an RS232 serial interface. After each test run the data are analysed and stored in a database by the controlling PC.

The Controller board also controls the power supply for the DUTs and sends status signals to a Data Logger connected to the board.



Figure 4.2 Principal drawing of DUT board and Controller Board




*Figure 4.3 Schematic drawing of DUT board with configuration interface for the Virtex device.* 

The configuration controller chip on the DUT board controls the PROMs and the configuration ports of the DUT. A program command can be sent to the DUT, which clears its configuration memory and starts an automatic re-configuration of the DUT from the PROMs. During the test of the DUT, the configuration controller continuously scrubs the DUT configuration memory with new configuration data from the PROMs.

All data from the PROMs to the DUT are transferred through the parallel SelectMAP interface, which supports the partial configuration feature, making it possible to continuously scrub the device with new configuration data during operation. The Controller board also controls the power supply for the DUT by relays and sends status signals to a Data Logger connected to the board.
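The effect of scrubbing can be illustrated with a toy model in which the configuration memory is a list of frames that is overwritten with the PROM contents. The function name `scrub` and the frame representation are assumptions. The comparison that reports corrected frames is only there for illustration, since the configuration controller described above simply rewrites the data.

```python
def scrub(config_memory, golden_frames):
    """Overwrite every configuration frame with the PROM copy.

    Returns the indices of frames that differed before the rewrite.
    The comparison is for illustration only; the hardware described in the
    text rewrites the frames without checking them first.
    """
    corrected = []
    for index, golden in enumerate(golden_frames):
        if config_memory[index] != golden:
            corrected.append(index)
        config_memory[index] = golden
    return corrected


# Hypothetical three-frame configuration with one upset bit in frame 1.
golden = [b"\x00\x0f", b"\xaa\x55", b"\xff\x00"]
memory = list(golden)
memory[1] = b"\xaa\x54"

print(scrub(memory, golden))  # [1]
print(scrub(memory, golden))  # []
```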


# 5. SEU TEST TECHNIQUES

### 5.1 SEU Error Separation

Errors detected at the DUT outputs can originate from SEUs in the registers (user logic flip-flops) of the device, from SEUs in the configuration data causing functional errors in parts of the device, or from SEUs in the control registers of the device causing global functional errors. The analysed data errors are therefore separated into three domains: SEU in registers, SEU in configuration data, and SEU in device control registers.

#### 5.1.1 Register Error Types

SEU in the user logic registers are corrected with new data loaded into the registers in connection with each read cycle. The data are analysed for single bit errors and categorised into the following error types:

- FF(0-1) Read '1' from flip-flop registers when '0' is expected.
- FF(1-0) Read '0' from flip-flop registers when '1' is expected.
- *FF* Total sum of all FF errors (above) read from the shift registers.
- DataSwap This error type showed up as two bit errors in adjacent registers. First a '0' was read where a '1' was expected, and in the next register a '1' was read where a '0' was expected. It was only observed in this order. The error did not persist into the next test cycle. This error type accounts for 25% of all user logic register errors. No explanation has been found for this error type.
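The error types above can be separated with a simple pass over the expected and read bit sequences. This is a sketch of one possible classification, not the analysis software used in the test. The function name is hypothetical, and treating a DataSwap pair as one event that is not counted in the *FF* total is an assumption.

```python
def classify_register_errors(expected, read):
    """Count register error types between two equal-length bit sequences.

    Both sequences are assumed to be in shift register order.
    """
    counts = {"FF(0-1)": 0, "FF(1-0)": 0, "DataSwap": 0}
    n = len(expected)
    i = 0
    while i < n:
        if expected[i] == read[i]:
            i += 1
            continue
        # DataSwap: '0' read where '1' expected, then '1' read where '0' expected.
        if (i + 1 < n and expected[i] == 1 and read[i] == 0
                and expected[i + 1] == 0 and read[i + 1] == 1):
            counts["DataSwap"] += 1
            i += 2
            continue
        if expected[i] == 0:
            counts["FF(0-1)"] += 1   # read '1' when '0' expected
        else:
            counts["FF(1-0)"] += 1   # read '0' when '1' expected
        i += 1
    # Assumption: FF is the sum of the two single-bit types only.
    counts["FF"] = counts["FF(0-1)"] + counts["FF(1-0)"]
    return counts


expected = [1, 0, 1, 1, 0, 0, 1, 0]
read     = [0, 1, 1, 1, 1, 0, 0, 0]
print(classify_register_errors(expected, read))
# {'FF(0-1)': 1, 'FF(1-0)': 1, 'DataSwap': 1, 'FF': 2}
```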

#### 5.1.2 Configuration induced Error types

SEU in the configuration data will remain until the configuration data are corrected with new configuration data. Errors that are caused by SEU in the configuration are quantified by observing the following signatures in the test data:

- Routing An SEU in the configuration logic (routing bits and lookup tables) may cause errors in the configured function of the operational device. This gives errors from the shift registers that remain until the next time the device is scrubbed with new configuration data.
- Persistent A persistent error is a permanent error that cannot be corrected with new configuration data. The device needs to be reset and completely reinitialised. This is the result of an SEU in the "weak keeper" circuits used in the Virtex architecture when logical constants are implied in the configured design, such as unused clock enable signals for registers.
- SelfTest SelfTest errors are of the same type as the routing type, but instead of interrupting a shift register they interrupt the function of the SelfTest module.


#### 5.1.3 SEU in Device Control Registers

- SEFI The function of the whole device is interrupted in one hit and all shift register data are lost. The device requires a reset and complete reconfiguration for correction. Xilinx believes this to be caused by an SEU in the POR register among the control registers of the architecture.

### 5.2 Other Test Considerations

The test system is optimised for SEU testing of the registers. Data are clocked in and out, and the DUT is then paused for a pre-set time, giving the radiation time to upset the registers before the data are read out. Every upset in the tested registers (*FF*, *DataSwap*) will be detected.

An SEU in the configuration data causing a functional error is corrected when new configuration data are written to the DUT. To detect all of these errors, the DUT would have to be tested continuously. Since the DUT is paused in our tests, we will not see all of these errors (*Routing*, *SelfTest*). The results have not been corrected for this.


# 6. SEE TEST RESULTS

Altogether, 14 test runs with the Cf-252 source have been performed. All tests show very high error rates. Both devices have been tested and show similar results.

Due to the high error rates, no error separation has been performed, and the error data have only been briefly analysed. Except for "persistent" errors, all types of errors have been observed, both in the shift registers using BUFT voting circuits and in those using LUT voting circuits.

The error rate varied considerably over time and between test runs. The phenomenon is well illustrated in Fig. 6.1. During three periods of that test run the error rate was about  $5 \cdot 10^{-3}$  cm<sup>2</sup>/device, while the average error rate for the whole test run was  $3 \cdot 10^{-4}$  cm<sup>2</sup>/device.

The cross-section for any type of error, recorded over all test runs, lies between  $1 \cdot 10^{-6}$  and  $5 \cdot 10^{-3}$  cm<sup>2</sup>/device. The higher error rates are of the order expected for a non-mitigated design.

Cf 252 test Sn#34 RunNo 6 (28.1) 2001-11-13



*Figure 6.1* Accumulated number of errors vs. accumulated fluence during one test run with Cf-252. The flux was 40 ions/cm<sup>2</sup>/s. The POR marks indicate when the device was reset with the PROG signal.
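The error rates in this section are expressed as cross-sections, that is, the number of observed errors divided by the accumulated fluence. The sketch below shows the arithmetic. The flux of 40 ions/cm<sup>2</sup>/s is taken from the figure caption, while the exposure time and error count are hypothetical values chosen only to illustrate the calculation and are not measured data.

```python
def fluence(flux_per_cm2_s, exposure_s):
    """Accumulated fluence in ions/cm^2 for a constant flux."""
    return flux_per_cm2_s * exposure_s


def cross_section(n_errors, fluence_per_cm2):
    """Device cross-section in cm^2/device: errors per unit fluence."""
    if fluence_per_cm2 <= 0:
        raise ValueError("fluence must be positive")
    return n_errors / fluence_per_cm2


# Flux from the figure caption; exposure time and error count are hypothetical.
phi = fluence(40, 1000)          # 40 000 ions/cm^2
sigma = cross_section(12, phi)   # 12 hypothetical errors
print(f"{sigma:.1e} cm^2/device")  # 3.0e-04 cm^2/device
```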


# 7. CONCLUSION

With a successful mitigation method we would expect only SEFI upsets. From earlier heavy ion tests we know that the cross-section for SEFI upsets is very small,  $1 \cdot 10^{-5}$  cm<sup>2</sup>/device. The results in this report give error rates on the order of  $5 \cdot 10^{-3}$  cm<sup>2</sup>/device. This is far above what we would expect from SEFI upsets. The mitigation of the design has clearly not been successful. The mitigation was implemented by Xilinx, and the design has been sent back to Xilinx in an attempt to find out why it does not work as expected.

Further testing with heavy ions is not recommended until a corrected design can be implemented.


# 8. REFERENCES

- [1] ESA\_QCA0109TS\_C, Radiation Pre-Evaluation of Xilinx FPGA XQVR300, Saab Ericsson Space (D-P-REP-1091-SE), Aug 2001.
- [2] Joe Fabula, Howard Bogrow, *Total Ionizing Dose Performance of SRAM-based FPGAs and supporting PROMs*, MAPLD 2000, October 2000.
- [3] *Virtex*<sup>™</sup> *2.5 V Field Programmable Gate Arrays*, Product Specification, DS003-2 (v2.6), July 19, 2001.
- [4] Xilinx Application Note XAPP197(v1.0), Triple Module Redundancy Design Techniques for Virtex FPGAs, Nov 2001.