

| Field            | Value            |
|------------------|------------------|
| Document ID      | D-P-REP-01272-SE |
| Date released    | 2004-11-03       |
| Issue            | 1                |
| Classification   | Open             |
| Doc. status      | Released         |
| Alt. document ID | ESA\_QCA0415S\_C |
| Project          | ESA\_QCA0415S\_C |

# Application-like Radiation Test of XTMR and FTMR Mitigation Techniques for Xilinx Virtex-II FPGA

### EUROPEAN SPACE AGENCY CONTRACT REPORT

The work described in this report was done under ESA Contract. Responsibility for the contents resides in the author or organisation that prepared it.

Issued by: Fredrik Sturesson

Approved by: Stanley Mattsson

Reno Harboe Sørensen, ESA/ESTEC

Function: TOC-QCA Technical Officer

Date: 2004-11-03


### Saab Ericsson Space AB

Postal address: SE-405 15 Göteborg, Sweden. Telephone: +46 (0)31 735 00 00. Telefax: +46 (0)31 735 40 00.

Registered number: 556134-2204. VAT No: SE556134220401.

**DOCUMENT CHANGE RECORD** 


### SUMMARY

Changes between issues are marked with an outside bar.

| Issue | Date | Paragraphs affected | Change information |
|-------|------|---------------------|--------------------|
| 1     |      | All                 | New document       |


### TABLE OF CONTENTS

| Section | Title                |
|---------|----------------------|
| 1       | SCOPE                |
| 2       | INTRODUCTION         |
| 2.1     | Abbreviations        |
| 3       | TEST SYSTEM          |
| 3.1     | DUT Board            |
| 3.1.1   | Service FPGA         |
| 3.2     | SEE/TID test system  |
| 3.3     | SEE Error Separation |
| 3.4     | Irradiation Facility |
| 3.5     | Test Samples         |
| 4       | DUT DESIGN           |
| 4.1     | FFT-module           |
| 4.2     | PSR-module           |
| 4.3     | Design variants      |
| 4.3.1   | Non-TMR              |
| 4.3.2   | FTMR                 |
| 4.3.3   | XTMR                 |
| 4.3.3.1 | XTMR_V1              |
| 4.3.3.2 | XTMR_V2              |
| 4.4     | Resources usage      |
| 5       | RESULTS              |
| 5.1     | XTMR Methodology     |
| 6       | DISCUSSIONS          |
| 7       | CONCLUSION           |
| 8       | REFERENCES           |
| A       | APPENDIX             |

### 1 SCOPE

This report presents the results from heavy ion and proton testing of the Xilinx Virtex II FPGA using the FTMR and XTMR methodologies for Triple Modular Redundancy. The tests have been performed on an application-like design with continuous scrubbing of configuration data.

### 2 INTRODUCTION

SRAM-based FPGAs such as the Xilinx Virtex II depend on mitigation techniques to overcome upsets in configuration and user logic. Upsets in configuration logic can be corrected either by read back of the configuration logic followed by partial re-configuration, or by continuously scrubbing new configuration data into the device. Some mitigation must also be implemented at user logic level to mask the temporary errors that exist before an upset in configuration logic has been corrected. Upsets in user logic flip-flops also need to be mitigated. A key task of any mitigation technique is to prevent errors from accumulating within the design.

### 2.1 Abbreviations

| Abbreviation | Meaning                                                              |
|--------------|----------------------------------------------------------------------|
| Care bits    | Estimated number of configuration bits that a design is dependent on |
| FFT          | Fast Fourier Transform                                               |
| FTMR         | Functional Triple Modular Redundancy                                 |
| MER          | Mitigation Error Rate                                                |
| PSR          | Pseudo Randomiser                                                    |
| SE           | Saab Ericsson Space AB                                               |
| SEFI         | Single Event Functional Interrupt                                    |
| SEE          | Single Event Effect                                                  |
| SEU          | Single Event Upset                                                   |
| SRAM         | Static Random Access Memory                                          |
| TID          | Total Ionising Dose                                                  |
| TMR          | Triple Modular Redundancy                                            |
| XTMR         | TMR methodology developed by Xilinx for Xilinx FPGAs                 |


### 3 TEST SYSTEM

The test system that has been used consists of one DUT board with a *Service FPGA*, LVDS adapters and one PC with I/O boards (*Figure 3-1*).



Figure 3-1 Principal Drawing of Test System for Xilinx Virtex II.

### 3.1 DUT Board


Xilinx's Virtex II prototype board, HW-AFX-FG676, has been used as DUT board. The board houses one test socket for the DUT, one *Service FPGA*, a configuration PROM and power distribution.

Connectors for test signals to the DUT and a UART interface have been added to the prototype board. The content of the *Service FPGA* has been modified to manage remote control from the PC and the scrubbing of the DUT.

### 3.1.1 Service FPGA

All scrubbing of the DUT is handled from the *Service FPGA* with an IP core written by Xilinx (**Figure 3-2**). The algorithm continuously scrubs the configuration data of the DUT and detects SelectMAP-type and POR-type SEFIs [7].

The scrubbing of the DUT is performed through the SelectMAP interface at 10 MHz.

Peripheral functions have been added to allow remote control of the *Service FPGA* from the PC and readout of status signals.



- Configure target FPGA with configuration data stored in the configuration PROM(s).
- Read back configuration programming data from target FPGA and calculate 16 bit CRC. Store CRC value as "Config-CRC".
- Perform a Write/Read check on the internal Frame Address Register of target FPGA.
- Scrub (background refresh) configuration data of target FPGA.
- Read back configuration programming data from target FPGA and calculate 16 bit CRC. Store CRC value as "Rdbk-CRC".
- Compare "Rdbk-CRC" with "Config-CRC".
- If the CRC values mismatch a second time, assert SEFI\_ERROR and RECONFIGURE.

Figure 3-2 CRC error detection and correction algorithm within the Service FPGA.
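The control flow listed in Figure 3-2 can be sketched in software as follows. This is only an illustrative model, not the Xilinx IP core: `TargetModel`, `scrub_cycle` and the `scrub_works` flag are hypothetical names, the CRC polynomial (CRC-CCITT via Python's `binascii.crc_hqx`) is an assumption because the report does not state it, "a second time" is read as two consecutive mismatches, and the Frame Address Register check is left out.

```python
import binascii


def crc16(data: bytes) -> int:
    # 16-bit CRC-CCITT from the standard library. The polynomial used by the
    # Service FPGA is not given in the report, so this choice is an assumption.
    return binascii.crc_hqx(data, 0xFFFF)


class TargetModel:
    """Hypothetical software stand-in for the target FPGA configuration memory."""

    def __init__(self, bitstream: bytes):
        self.prom = bytes(bitstream)        # golden data in the configuration PROM
        self.config = bytearray(bitstream)  # live configuration memory
        self.scrub_works = True             # False models a SEFI that blocks scrubbing

    def configure(self) -> None:
        # Full reconfiguration restores the memory and the configuration logic.
        self.config = bytearray(self.prom)
        self.scrub_works = True

    def scrub(self) -> None:
        # Background refresh: rewrites configuration data if the interface works.
        if self.scrub_works:
            self.config = bytearray(self.prom)

    def readback(self) -> bytes:
        return bytes(self.config)


def scrub_cycle(target: TargetModel, config_crc: int, mismatches: int):
    """Run one scrub/readback cycle and return (mismatch_count, sefi_error)."""
    target.scrub()
    rdbk_crc = crc16(target.readback())     # "Rdbk-CRC"
    if rdbk_crc == config_crc:
        return 0, False
    mismatches += 1
    if mismatches >= 2:                     # second mismatch: SEFI_ERROR, RECONFIGURE
        target.configure()
        return 0, True
    return mismatches, False


target = TargetModel(bytes(range(256)))
target.configure()
config_crc = crc16(target.readback())       # stored as "Config-CRC"

target.config[10] ^= 0x01                   # ordinary upset, repaired by the scrub
count, sefi = scrub_cycle(target, config_crc, 0)
```

In this model an ordinary configuration upset is repaired by the scrub before the readback, so only a fault that survives scrubbing leads to a reconfiguration.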


### 3.2 SEE/TID test system

In an earlier contract with ESA/ESTEC [5], SE developed an SEE/TID test system for digital logic. This test system has been upgraded. The test principle is the same, but the number of I/O channels and the speed have been increased. The test engine has been moved from the test chamber to a PC with an off-the-shelf PCI digital I/O board. The PCI board generates and reads all test signals, which are transferred into the test chamber as LVDS signals.


The SEE/TID test system uses a "virtual golden chip" test method (*Figure 3-3*). The *Data Pattern Generator* continuously streams input test vectors into the DUT. Up to 32 test signals and one synchronous test clock can be used with a data depth of up to 128 MSamples. The *Comparator* synchronously reads the outputs from the DUT. Up to 64 signals can be read. The principle of the measuring technique is to compare each output from the DUT with the correct data stored in computer memory (*Virtual Golden Chip*). When an error is detected (when data read from the DUT does not match the *Virtual Golden Chip*), the state of all outputs, the clock and the test cycle are stored on hard disk. The data are analyzed after each test run.

Main parameters for the new SEE/TID test system:

- Test can run with data flow of up to 10 Mbyte/s.
- Clock speed from 0,5 to 80 MHz.
- Up to 96 test signals
- The system is module based. It will be easy to upgrade and expand the test system in the future.



Figure 3-3 Data flow of SEE/TID Test System
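The comparison against the *Virtual Golden Chip* can be illustrated with a short sketch. The function name `compare_with_golden` and the tuple layout of the log are hypothetical; the real test engine runs on the PCI I/O board and stores more state per error.

```python
from typing import Iterable, List, Tuple


def compare_with_golden(
    dut_outputs: Iterable[int],
    golden_outputs: Iterable[int],
    width: int = 64,
) -> List[Tuple[int, int, int]]:
    """Return (cycle, dut_word, error_mask) for every test cycle that mismatches.

    Each word packs the DUT output signals of one test clock cycle
    (up to 64 signals in the system described above).
    """
    mask = (1 << width) - 1
    errors = []
    for cycle, (dut, gold) in enumerate(zip(dut_outputs, golden_outputs)):
        diff = (dut ^ gold) & mask          # set bits mark the failing outputs
        if diff:
            errors.append((cycle, dut & mask, diff))
    return errors


golden = [0x0, 0x5, 0xA, 0xF]
observed = [0x0, 0x5, 0xB, 0xF]             # output bit 0 flipped in cycle 2
log = compare_with_golden(observed, golden)
print(log)  # [(2, 11, 1)]
```

Keeping the error mask together with the cycle number is what allows the later separation of errors by module and by duration.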

### 3.3 SEE Error Separation

Errors detected at the DUT outputs could originate from SEUs in registers (user logic flip-flops), from SEUs in the configuration data causing functional errors in parts of the device, or from SEUs in control registers causing Single Event Functional Interrupts. The mitigation techniques shall take care of all these errors, except when a SEFI occurs. The analysed error data are separated into three different domains: *Data Error*, *Stuck Error* and *SEFI*.

| Error type  | Definition |
|-------------|------------|
| Data Error  | Erroneous data is read from any output of a module and the error persists for less than one scrub cycle. The error has been corrected by the next configuration bit stream. |
| Stuck Error | Erroneous data is read from any output of a module and the error persists for more than one scrub cycle. The error has not been corrected by the next configuration bit stream. |
| SEFI        | Erroneous data is read from most outputs of the device for more than one scrub cycle. The error has not been corrected by the next configuration bit stream. The function is recovered by resetting the device. Most SEFI types are detected by the *Service FPGA*; in these cases the *Service FPGA* resets the DUT autonomously and the test continues. |

Data Error and Stuck Error are separately counted for each module of the DUT.
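A minimal sketch of this separation is given below, assuming each logged error has already been reduced to a duration and to the fraction of device outputs affected. The function name, the default scrub period (taken from the 4,88 scrubs/s quoted in the figure captions of the results section) and the 50% threshold for "most outputs" are assumptions made for illustration only.

```python
def classify_error(
    duration_s: float,
    affected_fraction: float,
    scrub_period_s: float = 1.0 / 4.88,   # 4,88 scrubs/s as quoted in the results
    most_outputs: float = 0.5,            # assumed threshold for "most outputs"
) -> str:
    """Map one observed error to Data Error, Stuck Error or SEFI."""
    if duration_s < scrub_period_s:
        # Repaired by the next configuration bit stream.
        return "Data Error"
    if affected_fraction > most_outputs:
        # Long-lasting and device-wide: recovered only by a reset.
        return "SEFI"
    # Long-lasting but confined to part of the device.
    return "Stuck Error"


print(classify_error(0.05, 0.10))  # Data Error
print(classify_error(1.00, 0.10))  # Stuck Error
print(classify_error(1.00, 0.90))  # SEFI
```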

### 3.4 Irradiation Facility

Radiation tests have been performed at two facilities: RADEF at the accelerator laboratory of the University of Jyväskylä (JYFL) in Finland and HIF at Université Catholique de Louvain (UCL) in Belgium.

HIF was used for the first test campaign and RADEF for the second. In the tests at HIF, 40-Ar (LET = 14.1 MeV/(mg/cm2)), 84-Kr (LET = 34 MeV/(mg/cm2)) and 132-Xe (LET = 55.9 MeV/(mg/cm2)) were used. At RADEF, irradiation tests were performed with Si ions of LET = 11 MeV/(mg/cm2) and with 60 MeV protons.


### 3.5 Test Samples

The device chosen for this study is the Virtex II XQR2V3000. It is fabricated in a $0.15\mu m / 0.12\mu m$ CMOS 8-layer metal process with an epitaxial layer. The mask set should be identical to that of the commercial variant XC2V3000. The epitaxial layer, which is used in the XQR line to eliminate single-event latch-up, is not expected to change the upset cross sections very much.

All heavy ion tests were performed on prototype devices delivered by Xilinx in 676 plastic flat packages. The samples were chemically etched on the top to expose the die. The samples were etched by Xilinx before delivery. Chip marking is shown below.

### **Chip Markings**

Samples: SN #E1, SN #E2, SN #E3, SN #E4, SN #E5

Marking (line by line): XILINX / C-Logo 200111 M-Logo / X7970 / =L12 / =L2A / A008

*Figure 3-4* Overview of SN#E1 chip. Chip size is approximately 16 x 16 mm.

The yield from the chemical etching was very poor. It was difficult to stop the etching process from affecting the die itself. As a consequence, the metallization on the bond pads was damaged (*Figure 3-5*). Bond lifts were observed on some samples and many bond pads were partly damaged. None of the five etched samples was 100% reliable, but all were at least considered functional.

Connectivity tests for all I/Os were performed in connection with the radiation tests. One drawback was that the vacuum environment of the test chamber seemed to accelerate the problems.



*Figure 3-5* View of bond pad on SN#E1 where the chemical etching has affected the metallization.

Proton tests were performed on one prototype device and one commercial device. The mask set should be the same for the two devices. The epitaxial layer on the prototype device is not expected to change the upset cross-sections very much. The commercial device was not etched. Device marking is shown below.

### Marking /Top side

SN #C1 Xilinx-Logo XILINX R-Logo XC2V3000 FG676AGT0317 F2135456A 4C Virtex-Logo PHILLIPINES

### 4 DUT DESIGN

The complete DUT design consists of two copies of an FFT module and two copies of a PSR module. Each of the four modules in the design works independently of the others. The only common resources are the clock net and the reset net.

### 4.1 FFT-module

To achieve a high proportion of both combinatorial and sequential logic, a fast Fourier transform (FFT) circuit has been chosen. Two independently working modules of the FFT design are implemented in the DUT.

A benefit of this test design is that most of the logic used by the design is operational almost all the time. This means that almost any upset in the design will affect the readout from the device. Only the I/O logic is used for a short part of the test cycle (less than 5% of the test cycle time). The I/Os are therefore not considered to be well tested.

This is the flow for the FFT-module (see *Figure 4-1*):

1. The In Data Shift Register is loaded from the *DataIn* input, controlled by the *DataInEn* input.
2. The Randomizer generates randomized data from the In Data Shift Register and stores it in the Data Matrix.
3. A Fourier transform is performed on the Data Matrix. The result is written back to the Data Matrix.
4. The sum of each result from the Fourier transform is added to the Sum of Fourier Register.
5. Steps 2 to 4 are repeated until the FFT has been performed on 1024 samples.
6. When all Fourier transforms have been performed, the Sum of Fourier Register is shifted out on the *DataOut* output with *DataOutEn* asserted.

The module takes about 1700 clock cycles from the time data are shifted in until the results are shifted out.



Figure 4-1 Data flow of FFT module.

### 4.2 PSR-module

The second module included in the DUT design is the PSR-module. It consists of 60 serially connected Pseudo Randomizers and increases the number of flip-flops used in the DUT design. The Pseudo Randomizer is taken from Ref. [1].

Here follows the VHDL-description for one Pseudo Randomizer (clock process excluded).

Each Pseudo Randomizer has 4 inputs, 4 outputs and 12 flip-flops.

One un-mitigated PSR module with 60 Pseudo Randomizers in series uses a total of 720 user flip-flops.

### 4.3 Design variants

The FTMR and XTMR methodologies are mitigation techniques that use triplication of user logic flip-flops and combinatorial logic, together with additional voting circuits.

The DUT design has been implemented in four different variants: one un-mitigated (Non-TMR), one mitigated with the FTMR methodology (FTMR) and two mitigated with Xilinx's TMR Tool (XTMR\_V1 and XTMR\_V2). All variants are functionally the same.

A special VHDL syntax is used to make it easy to select the level of mitigation for a design [1]. The four designs were generated with different generic options in the VHDL code and with different synthesis and place and route flows.

In the mitigated variants all inputs and outputs are tripled. For mitigation of outputs, Xilinx has introduced a solution [3] that removes the need for external logic to vote the triple outputs (*Figure 4-2*).

These output voters can be added within the Xilinx TMR Tool. They are not included in the FTMR methodology; for the FTMR variant the voters have instead been added to the VHDL code.

- Outputs can be triplicated, using three pins for each output signal.
- Minority voters monitor each of the triplicated design modules.
- If one module is different from the others, its output pin is driven to High-Z
- Voters are triplicated



*Figure 4-2* Functional description of Xilinx minority output voters.
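The behaviour described for the minority output voters can be modelled in a few lines. This is a behavioural sketch only, with hypothetical function names; it assumes the three output pins are tied together outside the device, and it represents a tri-stated pin by the string `'Z'`.

```python
def is_minority(own: int, other_a: int, other_b: int) -> bool:
    """True when this copy disagrees with both other copies of the signal."""
    return own != other_a and own != other_b


def drive_pins(tr0: int, tr1: int, tr2: int) -> tuple:
    """Return the state of the three output pins: 0, 1 or 'Z' (high impedance).

    Each pin has its own voter, which mirrors the triplicated voters
    described in the list above.
    """
    copies = (tr0, tr1, tr2)
    pins = []
    for i, own in enumerate(copies):
        a, b = copies[(i + 1) % 3], copies[(i + 2) % 3]
        pins.append('Z' if is_minority(own, a, b) else own)
    return tuple(pins)


def wired_value(pins: tuple):
    """Value seen on the common external net, or None if nothing consistent drives it."""
    driven = {p for p in pins if p != 'Z'}
    return driven.pop() if len(driven) == 1 else None


pins = drive_pins(1, 1, 0)      # third copy has been upset
print(pins)                     # (1, 1, 'Z')
print(wired_value(pins))        # 1
```

Because the faulty copy releases its pin instead of driving it, the two healthy copies never fight against a wrong value on the shared net.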

### 4.3.1 Non-TMR

This is the basic DUT design for heavy ion test without any mitigation.

### 4.3.2 FTMR

The FTMR methodology approach is to perform all mitigation in VHDL code [1]. The methodology can be used for any FPGA or ASIC vendor and is not written specifically for Xilinx FPGAs.

The FTMR variant is a mitigated variant using all mitigation options available in the FTMR methodology. All routing, logic and registers are tripled, and internal voters are used before and after each register in the design.

When the EDIF file from the place and route tool was analyzed, it was concluded that the FTMR methodology did not mitigate all logic in the design. The Xilinx place and route tool introduced new nets that are not tripled. These nets are used to control static inputs to the basic elements used in the Xilinx Virtex II architecture.

### 4.3.3 XTMR

The Xilinx TMR Tool is a graphical software application that automates the implementation of mitigation in Xilinx FPGA designs [2]. Designs are imported from a Xilinx netlist and exported as a single standard EDIF project source file (*Figure 4-3*).

With the Xilinx TMR Tool, all routing and logic resources shall be mitigated. Internal voters are added to prevent error accumulation in registers, the static inputs to the basic elements of the Xilinx Virtex II architecture are tripled, and all inputs and outputs are tripled.



Figure 4-3 Design flow with Xilinx TMR Tool.

### 4.3.3.1 XTMR\_V1

XTMR\_V1 has been generated by Xilinx by taking the Non-TMR variant into the Xilinx TMR Tool. All routing, user logic, inputs and outputs are triplicated using the XTMR methodology.

The PSR module of XTMR\_V1 has been analyzed by studying the EDIF file generated by the Xilinx TMR Tool. It has been concluded that only 8% of the registers in the module have been implemented with internal voters. Most logic of the module is only voted on the outputs. The reason is that the Xilinx TMR Tool only inserts voting circuits for registers with feedback logic, and the PSR module mainly consists of feed-through logic.

In the FFT module, 95% of all registers have internal voters. Most registers in the FFT module have feedback logic, and hence the Xilinx TMR Tool adds internal voting circuits to most of the registers.

### 4.3.3.2 XTMR\_V2

The PSR module in the XTMR\_V1 variant has a low ratio of internal voting circuits. Most logic is only voted on the outputs. XTMR\_V2 is functionally identical to XTMR\_V1, but each register has an internal voter. XTMR\_V2 was implemented by Xilinx with the same flow as XTMR\_V1; by adding a feedback loop to all registers and some preserve attributes, the Xilinx TMR Tool was forced to introduce voters for 100% of the registers in both the FFT and the PSR module. The FFT modules in XTMR\_V1 and XTMR\_V2 differ only slightly, while the PSR module in the XTMR\_V2 variant has many more internal voting circuits and overall uses more logic resources.

FTMR and XTMR mitigation techniques use majority voters and tripled logic to correct errors from SEUs. The triplets and the majority voting circuits together build one TMR element. A TMR element is defined as all tripled logic between two voting circuits (or a device input/output). The logic includes flip-flops, combinatorial logic and the configuration bits that the user logic is dependent on. Each triplet must be independent of the other triplets. Each TMR element ends with a voting circuit (internal or output) that mitigates errors in the triplets. The concept of TMR elements is illustrated in *Figure 4-4*.

When more than one of the triplets in a TMR element is erroneous, the voting mechanism is jeopardized and the failure can propagate. This phenomenon will hereafter be referred to as a mitigation error, and the rate at which it occurs during testing is referred to as the Mitigation Error Rate (MER).




Figure 4-4 Illustrations of TMR elements in mitigation of a 2-stage shift register. The circuit (a) can be tripled with one single TMR element, shown in (b). The TMR element then consists of triplets of inputs, triplets of 2-stage shift registers and an output voting circuit. With internal voting circuits, the circuit can be split into two TMR elements (c). TMR element A consists of triplets of inputs, triplets of 1-stage shift registers and an internal voting circuit. TMR element B consists of triplets of 1-stage shift registers and an output voting circuit. The TMR elements also include all configuration bits corresponding to the user logic.
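The difference between variants (b) and (c) of Figure 4-4 can be reproduced with a small simulation of the 2-stage shift register. The code is a simplified sketch: the function names are hypothetical, upsets are injected directly into flip-flop values rather than into configuration bits, and the three internal voters are represented by one vote result because they all see the same inputs.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority vote."""
    return (a & b) | (a & c) | (b & c)


def step(state, data_in: int, internal_voters: bool):
    """Clock three copies of a 2-stage shift register once.

    state is a list of three [stage1, stage2] pairs, one per triplet.
    """
    stage1 = [copy[0] for copy in state]
    if internal_voters:
        voted = majority(*stage1)           # end of TMR element A in variant (c)
        to_stage2 = [voted, voted, voted]
    else:
        to_stage2 = stage1                  # variant (b): no vote between stages
    return [[data_in, to_stage2[i]] for i in range(3)]


def output(state) -> int:
    """Output voter at the end of the last TMR element."""
    return majority(*(copy[1] for copy in state))


def run(internal_voters: bool) -> int:
    state = [[0, 0] for _ in range(3)]      # all-zero data, so the correct output is 0
    state[0][0] ^= 1                        # upset in stage 1 of triplet 0
    state = step(state, 0, internal_voters)
    state[1][1] ^= 1                        # later upset in stage 2 of triplet 1
    return output(state)


print(run(False), run(True))  # 1 0
```

Without internal voters both upsets fall inside the same TMR element and the output voter is outvoted (a mitigation error); with internal voters each element sees only one faulty triplet and the output stays correct.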


### 4.4 Resources usage


The DUT design variants use different amounts of physical resources in the FPGA. A summary is given in Table 4-1. The Non-TMR variant has no tripled logic and uses the fewest resources. With two internal voters for each register, the FTMR variant uses the most resources: about 6 times more logic resources than the Non-TMR variant. All internal voting circuits added by the Xilinx TMR Tool are built with LUT resources. XTMR\_V2 has more internal voters than XTMR\_V1 and thereby uses more LUT resources.

|                       | Non-TMR                 | FTMR                     | XTMR_V1                  | XTMR_V2                  | Total available |
|-----------------------|-------------------------|--------------------------|--------------------------|--------------------------|-----------------|
| Number of FF          | 2140<br>(7%)            | 6420<br>(22%)            | 6420<br>(22%)            | 6420<br>(22%)            | 28672           |
| Number of LUT         | 4238<br>(15%)           | 27353<br>(95%)           | 14460<br>(50%)           | 23287<br>(81%)           | 28672           |
| Number of IOB         | 26 <sup>1</sup><br>(5%) | 78 <sup>1</sup><br>(16%) | 81 <sup>1</sup><br>(16%) | 81 <sup>1</sup><br>(16%) | 484             |
| Number of MULT18X18   | 24<br>(25%)             | 72<br>(75%)              | 72<br>(75%)              | 72<br>(75%)              | 96              |
| Number of GCLK        | 1<br>(6%)               | 3<br>(18%)               | 3<br>(18%)               | 3<br>(18%)               | 16              |
| Number of "Care Bits" | 508749<br>(5%)          | 3979277<br>(42%)         | 2075095<br>(22%)         | 2854971<br>(30%)         | 9582848         |

### Table 4-1 Resource usage for all DUT design variants

<sup>1</sup>XTMR\_V1 and XTMR\_V2 have 3 inputs for all the static nets in the design. FTMR and Non-TMR don't mitigate these nets.

Number of "Care bits" for all variants of the DUT design has been estimated with the SEUPI tool [4]. "Care bits" are the estimated number of configuration bits that the design is dependent on.
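The overhead of each variant relative to the Non-TMR design can be read directly from Table 4-1. The short sketch below computes these ratios for LUTs and care bits; the dictionary and function names are illustrative only, and the numbers are copied from the table.

```python
# LUT and care-bit counts copied from Table 4-1.
luts = {"Non-TMR": 4238, "FTMR": 27353, "XTMR_V1": 14460, "XTMR_V2": 23287}
care_bits = {
    "Non-TMR": 508749,
    "FTMR": 3979277,
    "XTMR_V1": 2075095,
    "XTMR_V2": 2854971,
}


def overhead(table: dict, baseline: str = "Non-TMR") -> dict:
    """Return each variant's resource count divided by the baseline count."""
    return {name: value / table[baseline] for name, value in table.items()}


for name, ratio in overhead(luts).items():
    print(f"{name:8s} LUT ratio       {ratio:.2f}")
for name, ratio in overhead(care_bits).items():
    print(f"{name:8s} care-bit ratio  {ratio:.2f}")
```

The LUT ratio for FTMR comes out at roughly 6.5, which corresponds to the "about 6 times" figure quoted for logic resources; the care-bit ratio for the same variant is somewhat higher, close to 8.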

From Table 4-2 it may be concluded that the FFT module uses about twice as many resources as the PSR module in the XTMR\_V1 variant. The PSR module is more flip-flop intensive. It can be assumed that the PSR module uses relatively more resources in XTMR\_V2 and FTMR than in XTMR\_V1, because the PSR module in XTMR\_V1 has only a very small proportion of internal voters, which are very resource consuming.


|                                    | One FFT<br>module | One PSR<br>module | Common for all modules | Total available |
|------------------------------------|-------------------|-------------------|------------------------|-----------------|
| Number of FF                       | 1050<br>(4%)      | 2160<br>(8%)      | 0                      | 28672           |
| Number of LUT                      | 5511<br>(19%)     | 1812<br>(6%)      | -                      | 28672           |
| Number of IOB                      | 12<br>(2,5%)      | 24<br>(5%)        | 9<br>(2%)              | 484             |
| Number of MULT18X18                | 36<br>(37%)       | 0<br>(0%)         | 0                      | 96              |
| Number of GCLK                     | -                 | -                 | 3<br>(18%)             | 16              |
| Number of "Care Bits" <sup>1</sup> | 719404<br>(8%)    | 318144<br>(3%)    | -                      | 9582848         |

### Table 4-2 Distribution of resources between modules for the XTMR\_V1 variant.

<sup>1</sup>The XTMR\_V1 variant has been split into two separate designs to be able to calculate the care bits for each module.


### 5 RESULTS

SEE tests have been performed on two occasions. Many questions were raised by the first campaign, in which the MER phenomenon was not yet obvious. The non-mitigated nets introduced in the FTMR variant by the Xilinx place and route tool were identified in this first irradiation test campaign. Together with Xilinx we could not find any suitable way to adequately mitigate the FTMR variant, and this variant was therefore not considered for further tests.

After the first campaign the XTMR\_V2 variant was introduced and the test system was improved with decreased scrub time. The purpose of these actions was to decrease the amount of heavy ion flux dependent errors. All presented results are from the second campaign.

### 5.1 XTMR Methodology

Heavy ion tests with Si (LET<sub>eff</sub> = 11 MeV/(mg/cm2)) and 60 MeV proton tests have been performed for the Non-TMR, XTMR\_V2 and XTMR\_V1 variants. The results from all test runs are summarized in Appendix A. Only SEFI upsets and failures that were repaired within one scrub cycle (*Data Error*) were observed. In other words, no case was observed in which only a part of the device had a failure that could not be repaired by scrubbing the configuration bits (*Stuck Error*).

Tests on the XTMR\_V2 and XTMR\_V1 variants (run#108, #110, and #212) were performed with only two of the three clock nets running. In this way the mitigation is undermined and the module cross section for two of the three triplets without mitigation could be measured. The error cross section for one triplet of each module is presented in Table 5-1. The method has been verified to be representative against the Non-TMR variant (test run#114).

Using the estimated number of care bits in Table 4-2 and the module cross sections for XTMR\_V1 in Table 5-1 gives a measured bit upset cross section of 5,4E-9 cm2/bit for the PSR module and 4,6E-9 cm2/bit for the FFT module.
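The arithmetic behind these per-bit values is not spelled out in the report. One plausible reading, sketched below, is that the care-bit counts in Table 4-2 cover all three triplets of a module while the cross sections in Table 5-1 are per triplet, so the care bits are divided by three first. This is an assumption, and the function name is hypothetical. With it, the FFT value of about 4,6E-9 cm2/bit is reproduced, whereas the PSR module comes out at about 5,0E-9 cm2/bit, slightly below the 5,4E-9 quoted above; the report does not state how that difference arises.

```python
TRIPLETS = 3  # assumption: Table 4-2 care bits cover all three triplets of a module


def bit_cross_section(triplet_xs_cm2: float, care_bits_module: int) -> float:
    """Per-bit upset cross section from a per-triplet module cross section."""
    care_bits_per_triplet = care_bits_module / TRIPLETS
    return triplet_xs_cm2 / care_bits_per_triplet


# Si cross sections for XTMR_V1 (Table 5-1) and care bits (Table 4-2).
fft = bit_cross_section(1.1e-3, 719404)
psr = bit_cross_section(5.3e-4, 318144)
print(f"FFT: {fft:.2e} cm2/bit")
print(f"PSR: {psr:.2e} cm2/bit")
```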

| Variant | Module | Cross section, Si [cm2/module] | Cross section, 60 MeV proton [cm2/module] |
|---------|--------|--------------------------------|-------------------------------------------|
| XTMR_V2 | FFT    | 1,4E-03                        | 2,4E-09                                   |
| XTMR_V2 | PSR    | 8,2E-04                        | 1,2E-09                                   |
| XTMR_V1 | FFT    | 1,1E-03                        |                                           |
| XTMR_V1 | PSR    | 5,3E-04                        |                                           |

### Table 5-1 Error cross section for each triplet in the XTMR\_V2 and XTMR\_V1 designs for SN#E1.

No flux dependence for SEFI upsets was observed for the two design variants. The FFT module shows small improvements from the XTMR\_V1 to the XTMR\_V2 design (*Figure 5-1*). The PSR module shows large improvements, with reduced flux dependence, from the XTMR\_V1 to the XTMR\_V2 design (*Figure 5-2*).

The flux dependency and the fact that the PSR module could be improved by splitting the design with voters are strong evidence that mitigation errors are the main contribution of errors in the test runs.
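Why mitigation errors should scale with flux can be seen from a simple probability argument. The sketch below is not the analysis used in this report: it assumes independent Poisson-distributed upsets, a single TMR element whose three triplets each have the same upset cross section, and complete repair at every scrub. The function name and the numerical values are hypothetical.

```python
import math


def apparent_cross_section(sigma_triplet_cm2: float, fluence_per_scrub: float) -> float:
    """Apparent error cross section of one TMR element [cm2].

    sigma_triplet_cm2: upset cross section of one triplet of the element
    fluence_per_scrub: particles per cm2 arriving within one scrub cycle
    """
    # Probability that a given triplet is upset at least once in a scrub cycle.
    p = -math.expm1(-sigma_triplet_cm2 * fluence_per_scrub)
    # Probability that at least two of the three triplets are upset,
    # which is the condition for a mitigation error.
    p_mitigation_error = 3 * p**2 * (1 - p) + p**3
    # Errors per scrub divided by fluence per scrub gives a cross section.
    return p_mitigation_error / fluence_per_scrub


sigma = 1e-6                                  # hypothetical triplet cross section
low = apparent_cross_section(sigma, 10.0)
high = apparent_cross_section(sigma, 20.0)
print(high / low)                             # close to 2
```

For small upset probabilities the expression reduces to about 3·σ²·F, so in this model the apparent cross section grows linearly with the flux per scrub, and splitting a design into more, smaller TMR elements lowers σ per element and therefore the mitigation error contribution.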

Mitigation errors are a test-induced phenomenon that prevents the true single-upset cross section from being determined. Trend lines in *Figure 5-1* to *Figure 5-4* indicate that even near-zero flux would give an error cross section of nearly the same order of magnitude as SEFI upsets (about 1E-5 for Si and 1E-10 for protons).

The error cross section for protons differs only slightly between the non-etched commercial device and the etched sample with epitaxial layer (*Figure 5-3* and *Figure 5-4*). This indicates that the etched sample is representative, independent of the problems with bond lifts induced by the etch process (see chapter 3.5).

The effectiveness of the mitigation is best judged in the proton tests. There the upset rates are lowest, and hence the contribution of mitigation errors is smallest. Comparison of the lowest flux in *Figure 5-3* with Table 5-1 indicates that the XTMR mitigation has reduced the error cross section by more than a factor of 40.



Figure 5-1 Cross section data for Si ion taken for SEFI and two FFT modules in the XTMR\_V2 and XTMR\_V1 variants. X-axis shows mean flux per scrub for each test run (4,88 scrubs/s). Linear trend line is for data points of XTMR\_V2 design only. Small differences in flux dependence between the two design variants can be observed. SEFI errors are not dependent on flux.

Si Heavy Ion



Figure 5-2 Cross section data for Si ion taken for SEFI and two PSR modules with the XTMR\_V2 and XTMR\_V1 variants. X-axis shows mean flux per scrub for each test run (4,88 scrubs/s). Linear trend line for data points of XTMR\_V2 design. Large improvements with the XTMR\_V2 variant can be observed.

The results for XTMR\_V1 and XTMR\_V2 show a strong dependence on the particle flux (*Figure 5-1* to *Figure 5-4*), both for the FFT and PSR modules.



60 MeV Proton SN#E1 XQR2V3000



Figure 5-3 Proton cross section data taken for two FFT and two PSR modules with the XTMR\_V2 variant on SN#E1 (etched XQR2V3000). X-axis shows mean flux per scrub for each test run (4,88 scrubs/s). Linear trend line is for data points of FFT module only. No SEFI upsets were observed.



Figure 5-4 Proton cross section data taken for FFT and PSR module with the XTMR\_V2 variant on SN#C1 (non-etched XC2V3000). X-axis shows mean flux per scrub for each test run (4,88 scrubs/s). Linear trend line is for data points of FFT module only. Only one SEFI upset was observed in all test runs (not shown in the figure).

### 6 DISCUSSIONS

All heavy ion tests were performed on prototype devices delivered by Xilinx in 676 plastic flat packages. The samples were chemically etched on the top to expose the die. The samples were etched by Xilinx before delivery.

During the chemical etching it was difficult to stop the etching process from affecting the die itself. The metallization on the bond pads was damaged, bond lifts were observed on some samples, and many bonds were partly damaged. The five etched samples were considered functional but not 100% reliable.

The functionality of the etched test samples was verified against a non-etched sample in the proton test and found to give the same results; all test data presented in this report are therefore considered valid.

Due to problems with test samples only a limited amount of test data has been taken. Tests with a larger range of LET values would be needed to fully judge the effectiveness of the XTMR mitigation technique on Xilinx Virtex II FPGAs.

SEFI upsets have been observed, in accordance with earlier reports [6-7]. No stuck-bit errors have been observed, i.e. no error has been observed that could not be corrected by the next configuration bit stream.

The bit upset cross section based on the non-TMR dynamic test, for both the FFT and the PSR module, indicates a cross section of 5E-9 cm2/bit at a LET of 11 MeV·cm2/mg. The number of care bits for the tested modules has been estimated using the SEUPI [4] tool.

A spread in error cross section between the two copies of the modules has been observed, both for the FFT module and the PSR module. The routing is implemented by the Xilinx Place & Route tool, so the two copies are most likely not implemented identically. How much the error cross section depends on these possible differences is hard to predict.

### 7 CONCLUSION

The flux-dependent mitigation error phenomenon observed in these tests is likely to be representative of all kinds of applications. The overall hardness of the final TMR design against these mitigation errors depends strongly on the design of the application software, the particle flux and the scrub rate.

For applications in the space environment it is essential to consider the rate at which the configuration bits are corrected in order to prevent mitigation errors. Extrapolating the results presented in this report to the particle flux in the space environment indicates that the probability of a mitigation error would be in the same range as that of SEFI errors. No other error modes have been identified.

This report presents results from irradiation with one LET value and one proton energy. Tests with a wider range of LET values and higher proton energy would be needed to fully characterize the Xilinx Virtex II device for SEE.

The Xilinx Virtex II device is very complex, with many different types of elements in the architecture. Examples of architectural elements are flip flops (FF), look up tables (LUT), multiplexers, IOs configurable to a wide range of IO standards, digital clock managers (DCM) and embedded RAM. The design that has been tested does not cover all of these types of elements. FF, LUT and multiplexers can be considered to be well tested. Only a small number of IOs, configured as LVTTL, have been tested.

### 8 **REFERENCES**

[1] S. Habinc, Gaisler Research, "Functional Triple Modular Redundancy (FTMR) VHDL Design Methodology for Redundancy in Combinatorial and Sequential Logic", ESA contract No. 15102/01/NL/FM(SC) CCN-3, December 2002.

[2] Xilinx TMR Tool, http://www.xilinx.com/products/milaero/tmr/index.htm

[3] C. Carmichael, "Triple Module Redundancy Design Techniques for Virtex FPGAs" Xilinx Application Note XAPP197, Nov. 2001.

[4] P. Sundararajan, C. Patterson, C. Carmichael, S. McMillan and B. Blodget, "Estimation of Single Event Upset Probability Impact of FPGA Designs", *MAPLD'03*, Washington DC, USA, 9-11 Sept. 2003.

[5] S. Mattsson, F. Sturesson Saab Ericsson Space "Radiation Pre-Evaluation of Actel FPGA RT54SX16 and A14100A", ESA contract No. ESA\_QCA9911TS\_C; Jan 2000.

[6] C. Yui, G. Swift and C. Carmichael, "Single Event Upset Susceptibility Testing of the Xilinx Virtex II FPGA", *MAPLD'02*, Laurel MD, 10-12 Sept. 2002.

[7] C. Yui, G. Swift, C. Carmichael, R. Koga and J. George, "SEU Mitigation Testing of Xilinx Virtex II FPGAs", *NSREC'03*, Monterey, CA, 21-25 July 2003.

### A APPENDIX

- 1. Run#107 has been excluded due to problems with the configuration interface that caused an abnormally high number of detected SEFIs.
- 2. These test runs were ended when the current limit turned off the power. This was an effect of the poor condition of the device.

| Test run#        | Ion | LETeff | DUT    | Design variant | Mitigation | Module   | ErrorTypeClass | Mean Flux | Fluence | # of Errors | Error cross section |
|------------------|-----|--------|--------|----------------|------------|----------|----------------|-----------|---------|-------------|---------------------|
| <sup>2</sup> 104 | Si  | 11     | SN#E11 | XTMR_V2        | TMR        | FFT      | DataError      | 4,50E+02  | 1,3E+05 | 9           | 7,1E-05             |
|                  |     |        |        |                |            |          | DataError      | 4,50E+02  | 1,3E+05 | 10          | 7,8E-05             |
|                  |     |        |        |                |            | PSR      | DataError      | 4,50E+02  | 1,3E+05 | 1           | 7,8E-06             |
|                  |     |        |        |                |            |          | DataError      | 4,50E+02  | 1,3E+05 | 1           | 7,8E-06             |
| <sup>2</sup> 105 | Si  | 11     | SN#E1  | XTMR_V2        | TMR        | FFT      | DataError      | 8,90E+02  | 2,0E+05 | 23          | 1,1E-04             |
|                  |     |        |        |                |            |          | DataError      | 8,90E+02  | 2,0E+05 | 15          | 7,4E-05             |
|                  |     |        |        |                |            | PSR      | DataError      | 8,90E+02  | 2,0E+05 | 6           | 3,0E-05             |
|                  |     |        |        |                |            |          | DataError      | 8,90E+02  | 2,0E+05 | 4           | 2,0E-05             |
|                  |     |        |        |                |            | Control  | SEFI           | 8,90E+02  | 2,0E+05 | 2           | 9,9E-06             |
| <sup>1</sup> 107 | Si  | 11     | SN#E1  | XTMR_V2        | TMR        | FFT      | DataError      | 1,30E+02  | 3,0E+05 | 7           | 2,3E-05             |
|                  |     |        |        |                |            |          | DataError      | 1,30E+02  | 3,0E+05 | 5           | 1,7E-05             |
|                  |     |        |        |                |            | PSR      | DataError      | 1,30E+02  | 3,0E+05 | 1           | 3,3E-06             |
|                  |     |        |        |                |            |          | DataError      | 1,30E+02  | 3,0E+05 | 5           | 1,7E-05             |
|                  |     |        |        |                |            | Control  | SEFI           | 1,30E+02  | 3,0E+05 | 1           | 3,3E-06             |
|                  |     |        |        |                |            |          | SEFI           | 1,30E+02  | 3,0E+05 | 19          | 6,3E-05             |
| 108              | Si  | 11     | SN#E1  | XTMR_V2        | TMR_2CLK   | FFT      | DataError      | 1,00E+02  | 1,0E+05 | 292         | 2,9E-03             |
|                  |     |        |        |                |            |          | DataError      | 1,00E+02  | 1,0E+05 | 267         | 2,7E-03             |
|                  |     |        |        |                |            | PSR      | DataError      | 1,00E+02  | 1,0E+05 | 167         | 1,7E-03             |
|                  |     |        |        |                |            |          | DataError      | 1,00E+02  | 1,0E+05 | 162         | 1,6E-03             |

| Test run#        | Ion | LETeff | DUT   | Design variant | Mitigation | Module  | ErrorTypeClass | Mean Flux | Fluence | # of Errors | Error cross section |
|------------------|-----|--------|-------|----------------|------------|---------|----------------|-----------|---------|-------------|---------------------|
| <sup>2</sup> 109 | Si  | 11     | SN#E1 | XTMR_V1        | TMR        | FFT     | DataError      | 1,30E+02  | 1,9E+05 | 10          | 5,4E-05             |
|                  |     |        |       |                |            |         | DataError      | 1,30E+02  | 1,9E+05 | 5           | 2,7E-05             |
|                  |     |        |       |                |            | PSR     | DataError      | 1,30E+02  | 1,9E+05 | 7           | 3,7E-05             |
|                  |     |        |       |                |            |         | DataError      | 1,30E+02  | 1,9E+05 | 6           | 3,2E-05             |
|                  |     |        |       |                |            | Control | SEFI           | 1,30E+02  | 1,9E+05 | 4           | 2,1E-05             |
| <sup>2</sup> 110 | Si  | 11     | SN#E1 | XTMR_V1        | TMR_2CLK   | FFT     | DataError      | 1,00E+02  | 2,3E+04 | 57          | 2,4E-03             |
|                  |     |        |       |                |            |         | DataError      | 1,00E+02  | 2,3E+04 | 49          | 2,1E-03             |
|                  |     |        |       |                |            | PSR     | DataError      | 1,00E+02  | 2,3E+04 | 22          | 9,4E-04             |
|                  |     |        |       |                |            |         | DataError      | 1,00E+02  | 2,3E+04 | 28          | 1,2E-03             |
| <sup>2</sup> 111 | Si  | 11     | SN#E1 | XTMR_V1        | TMR        | FFT     | DataError      | 6,70E+02  | 2,6E+05 | 28          | 1,1E-04             |
|                  |     |        |       |                |            |         | DataError      | 6,70E+02  | 2,6E+05 | 24          | 9,2E-05             |
|                  |     |        |       |                |            | PSR     | DataError      | 6,70E+02  | 2,6E+05 | 22          | 8,4E-05             |
|                  |     |        |       |                |            |         | DataError      | 6,70E+02  | 2,6E+05 | 25          | 9,6E-05             |
| 113              | Si  | 11     | SN#E1 | XTMR_V1        | TMR        | FFT     | DataError      | 1,09E+03  | 3,0E+05 | 59          | 1,9E-04             |
|                  |     |        |       |                |            |         | DataError      | 1,09E+03  | 3,0E+05 | 42          | 1,4E-04             |
|                  |     |        |       |                |            | PSR     | DataError      | 1,09E+03  | 3,0E+05 | 39          | 1,3E-04             |
|                  |     |        |       |                |            |         | DataError      | 1,09E+03  | 3,0E+05 | 54          | 1,8E-04             |
|                  |     |        |       |                |            | Control | SEFI           | 1,09E+03  | 3,0E+05 | 3           | 9,9E-06             |
| 114              | Si  | 11     | SN#E1 | XTMR_V1        | no-TMR     | FFT     | DataError      | 1,00E+02  | 1,0E+05 | 158         | 1,6E-03             |
|                  |     |        |       |                |            |         | DataError      | 1,00E+02  | 1,0E+05 | 125         | 1,2E-03             |
|                  |     |        |       |                |            | PSR     | DataError      | 1,00E+02  | 1,0E+05 | 62          | 6,2E-04             |
|                  |     |        |       |                |            |         | DataError      | 1,00E+02  | 1,0E+05 | 67          | 6,7E-04             |
|                  |     |        |       |                |            | Control | SEFI           | 1,00E+02  | 1,0E+05 | 5           | 5,0E-05             |

| Test run# | Ion | Proton energy | DUT   | Design variant | Mitigation | Module | ErrorTypeClass | Mean Flux | Fluence | # of Errors | Error cross section |
|-----------|-----|---------------|-------|----------------|------------|--------|----------------|-----------|---------|-------------|---------------------|
| 201       | p+  | 60 MeV        | SN#C1 | XTMR_V2        | TMR_2CLK   | FFT    | DataError      | 1,00E+08  | 1,0E+10 | 65          | 6,5E-09             |
|           |     |               |       |                |            |        | DataError      | 1,00E+08  | 1,0E+10 | 53          | 5,3E-09             |
|           |     |               |       |                |            | PSR    | DataError      | 1,00E+08  | 1,0E+10 | 22          | 2,2E-09             |
|           |     |               |       |                |            |        | DataError      | 1,00E+08  | 1,0E+10 | 23          | 2,3E-09             |
| 202       | p+  | 60 MeV        | SN#C1 | XTMR_V2        | TMR        | FFT    | DataError      | 1,00E+08  | 4,0E+10 | 10          | 2,5E-10             |
|           |     |               |       |                |            |        | DataError      | 1,00E+08  | 4,0E+10 | 4           | 1,0E-10             |
|           |     |               |       |                |            | PSR    | DataError      | 1,00E+08  | 4,0E+10 | 1           | 2,5E-11             |
|           |     |               |       |                |            |        | DataError      | 1,00E+08  | 4,0E+10 | 1           | 2,5E-11             |
| 204       | p+  | 60 MeV        | SN#C1 | XTMR_V2        | TMR        | FFT    | DataError      | 3,17E+07  | 4,0E+10 | 1           | 2,5E-11             |
| 206       | p+  | 60 MeV        | SN#C1 | XTMR_V2        | TMR        | FFT    | DataError      | 9,52E+08  | 4,0E+10 | 17          | 4,3E-10             |
|           |     |               |       |                |            |        | DataError      | 9,52E+08  | 4,0E+10 | 28          | 7,0E-10             |
|           |     |               |       |                |            | PSR    | DataError      | 9,52E+08  | 4,0E+10 | 1           | 2,5E-11             |
| 209       | p+  | 60 MeV        | SN#E1 | XTMR_V2        | TMR        | FFT    | DataError      | 9,52E+08  | 5,0E+10 | 13          | 2,6E-10             |
|           |     |               |       |                |            |        | DataError      | 9,52E+08  | 5,0E+10 | 11          | 2,2E-10             |
| 211       | p+  | 60 MeV        | SN#E1 | XTMR_V2        | TMR        | FFT    | DataError      | 1,18E+08  | 4,0E+10 | 2           | 5,0E-11             |
|           |     |               |       |                |            |        | DataError      | 1,18E+08  | 4,0E+10 | 1           | 2,5E-11             |
|           |     |               |       |                |            | PSR    | DataError      | 1,18E+08  | 4,0E+10 | 1           | 2,5E-11             |
| 212       | p+  | 60 MeV        | SN#E1 | XTMR_V2        | TMR_2CLK   | FFT    | DataError      | 1,18E+08  | 2,0E+10 | 99          | 5,0E-09             |
|           |     |               |       |                |            |        | DataError      | 1,18E+08  | 2,0E+10 | 94          |                     |
|           |     |               |       |                |            | PSR    | DataError      | 1,18E+08  | 2,0E+10 | 45          | 2,3E-09             |
|           |     |               |       |                |            |        | DataError      | 1,18E+08  | 2,0E+10 | 52          | 2,6E-09             |
| 213       | p+  | 60 MeV        | SN#E1 | XTMR_V2        | TMR        | FFT    | DataError      | 2,00E+07  | 4,0E+10 | 0           | 0,0E+00             |
|           |     |               |       |                |            |        | DataError      | 2,00E+07  | 4,0E+10 | 0           | 0,0E+00             |
|           |     |               |       |                |            | PSR    | DataError      | 2,00E+07  | 4,0E+10 | 0           | 0,0E+00             |
|           |     |               |       |                |            |        | DataError      | 2,00E+07  | 4,0E+10 | 0           | 0,0E+00             |
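The error cross section column in the tables above appears to be the number of errors divided by the fluence. The following sketch checks that reading against a few listed rows. It assumes the fluence is in particles/cm2, so that the result is in cm2, and the function names are hypothetical. Because the listed fluence is rounded to two digits, only approximate agreement can be expected.

```python
def parse_decimal_comma(text):
    """Parse a number written with a decimal comma, e.g. '1,18E+08'."""
    return float(text.replace(",", "."))


def error_cross_section(n_errors, fluence):
    """Errors per unit fluence (cm2 if fluence is in particles/cm2)."""
    if fluence <= 0:
        raise ValueError("fluence must be positive")
    return n_errors / fluence


def relative_difference(value, reference):
    return abs(value - reference) / abs(reference)


# (run and module, number of errors, fluence, listed cross section),
# copied from the appendix tables.
ROWS = [
    ("108 FFT", 292, "1,0E+05", "2,9E-03"),
    ("114 FFT", 158, "1,0E+05", "1,6E-03"),
    ("202 FFT", 10, "4,0E+10", "2,5E-10"),
    ("209 FFT", 13, "5,0E+10", "2,6E-10"),
]

for label, n_errors, fluence_text, listed_text in ROWS:
    sigma = error_cross_section(n_errors, parse_decimal_comma(fluence_text))
    listed = parse_decimal_comma(listed_text)
    diff = relative_difference(sigma, listed)
    print(f"run {label}: computed {sigma:.2e}, listed {listed:.2e}, "
          f"difference {diff:.1%}")
```

Applied to the second FFT row of run 212 (94 errors at a fluence of 2,0E+10), for which no cross section is listed, the same formula would give about 4,7E-09.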