November 6, 2019

Cosmic Mishaps Part II: How Cosmic Rays Can Disrupt Your Design 

In our last blog post, we talked about cosmic rays, and how they might negatively impact your printed circuit board design.  But cosmic rays are not the only source of terrestrial radiation.  Any time an atom or molecule can emit mass or energy and enter a lower energy state, the laws of physics say it will do just that.  Nuclear power and research engineers design electronics for high-radiation environments, as do medical-device design engineers, and even energy company engineers who design downhole sensor platforms, where gamma-ray sensors log radiation levels and detect shale deposits.

So we will start with how radiation affects your printed circuit board design, and later discuss some mitigation techniques.

Event Classifications

From the study of radiation effects on electronic designs, scientists have grouped the resulting failures, broadly called Soft Errors, into the following classes.  Here is a partial list:

Single Event Effects

  • Single-Event Transient effects temporarily change the output of the device. This produces false analog readings and incorrect digital logic levels.
  • Physical Single-Bit Upset causes a failure in a single bit of a memory cell.
  • Physical Multibit Upset changes the value of more than one memory bit. Anywhere from two to hundreds or even thousands of bits might be affected.
    • Multi-Bit Upsets occur when more than one bit in a word is affected.
    • Multi-Cell Upsets occur when bits from multiple words are affected.
  • Single-Event Stuck Bits describe when a bit is permanently flipped.
  • Single-Event Latchup / Functional Interrupt occurs when control logic is affected to the point that the device no longer functions. If there is no permanent damage, power cycling the device and perhaps rewriting data to memory should fix the error.
  • Catastrophic part failures occur when a high-energy particle creates a conductive path through a transistor. This can cause Gate Latchup, Gate Burnout, and Gate Rupture, conditions that might require device replacement.
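The difference between single-bit, multi-bit, and multi-cell upsets is easier to see with a small sketch. The function below is only an illustration, not a standard classifier: the name classify_upset, the flat bit-index addressing, and the 32-bit word width are all assumptions made for the example.

```python
from collections import Counter

def classify_upset(flipped_bits, word_width=32):
    """Classify one event from the flat indices of the bits it flipped.

    word_width is a hypothetical word size; bit index // word_width
    gives the word that a flipped bit belongs to.
    """
    unique_bits = set(flipped_bits)
    if not unique_bits:
        return "no upset"
    if len(unique_bits) == 1:
        return "single-bit upset"

    # Count how many flipped bits landed in each word.
    bits_per_word = Counter(bit // word_width for bit in unique_bits)

    labels = []
    if any(count > 1 for count in bits_per_word.values()):
        labels.append("multi-bit upset")   # more than one bit in a word
    if len(bits_per_word) > 1:
        labels.append("multi-cell upset")  # bits from multiple words
    return " + ".join(labels)

print(classify_upset([5]))           # one bit
print(classify_upset([3, 4]))        # two bits, same word
print(classify_upset([31, 32]))      # two bits, adjacent words
print(classify_upset([30, 31, 32]))  # both conditions at once
```

Note that a single event can satisfy both definitions at once, which is why the sketch returns a combined label in that case.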

Charged particles create ionization paths with free electrons and holes in a semiconductor.  Often, the electrons and holes recombine and the device returns to its previous state.  If the path of the particle is through the depletion region of an NMOS or PMOS transistor, the impact can change the state of the storage element.[1]  If the path of the particle is through an insulator, it can free electrons and leave positively charged ions in its wake.  This has the effect of changing logic levels as well as logic-threshold voltages.

[1] Sydow, M. (n.d.). WP-01206-1.0 White Paper – intel.com. Retrieved October 3, 2019, from https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/wp/wp-01206-introduction-single-event-upsets.pdf.

The image above is from the Altera WP-01206-1.0 White Paper entitled “Introduction to Single-Event Upsets.”  The paper states “when the path crosses the depletion region underneath a drain-gate-source region the electrons it creates can be quickly attracted to a higher voltage NMOS drain diffusion, which sometimes results in the change of state of a storage element. Similarly, for PMOS transistors, the holes can be quickly attracted to a lower voltage PMOS diffusion. Even if the ionization path is nearby, the diffusion path of electrons or holes could also result in a storage element upset as well.”

But neutrons, although electrically neutral, can cause damage as well, either by colliding with molecules and creating secondary charged particles, or by creating electron-hole pairs in a gate-source region or p-n junction.  Then, as in the previous example, NMOS and PMOS transistor storage elements can change state.

The image above is from the Altera WP-01206-1.0 White Paper entitled “Introduction to Single-Event Upsets.” Once the particle creates an electron-hole pair in either a PMOS or NMOS transistor, the free charges can disturb the proper function of the transistor.

Cosmic Ray Impacts and Failures in Time

It is impossible to know exactly when and where a cosmic ray will strike an integrated circuit on your printed circuit board.  But enough cosmic ray data has been collected at the surface of the Earth to build models, based on latitude, longitude, and altitude, that give a decent estimate.  For example, the neutron flux calculation at http://seutest.com provides the flux at different latitudes / longitudes / altitudes relative to New York City (a common reference location) where the neutron flux is 13 neutrons per square centimeter per hour.  At an elevation of 5 miles, the flux increases to 1300 neutrons per square centimeter per hour.
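To get a feel for those flux numbers, here is a rough sketch that multiplies flux by area and exposure time. The 1 square centimeter die area and the one-year exposure are hypothetical values chosen for illustration, and the function name expected_neutrons is our own. The result counts neutrons crossing the area, not upsets, since most neutrons pass through without interacting.

```python
# Reference flux at New York City, taken from the text above.
NYC_FLUX = 13.0  # neutrons per cm^2 per hour

def expected_neutrons(area_cm2, hours, relative_flux=1.0):
    """Average number of neutrons crossing an area over a time span.

    relative_flux scales the New York City reference flux for another
    location or altitude (1.0 means the reference location itself).
    """
    return NYC_FLUX * relative_flux * area_cm2 * hours

die_area_cm2 = 1.0          # hypothetical die area
hours_per_year = 24 * 365   # 8760 hours

# 1300 / 13 = 100x the reference flux at an elevation of 5 miles.
ground = expected_neutrons(die_area_cm2, hours_per_year)
aloft = expected_neutrons(die_area_cm2, hours_per_year, relative_flux=1300 / 13)

print(f"Ground level: about {ground:,.0f} neutrons per year")
print(f"At 5 miles:   about {aloft:,.0f} neutrons per year")
```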

If you know the size of the transistor gate/drain/source on the silicon die, and the size of a neutron, a Monte-Carlo integration can provide a probability of impact.  Remember, smaller devices are less likely to be hit, but they run at lower operating voltages and are more likely to be affected by a near-miss impact.
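A heavily simplified version of that Monte-Carlo idea might look like the sketch below. It assumes each strike is a point landing uniformly at random on a rectangular die, and it ignores particle size, strike angle, and near-miss effects. The die dimensions, the sensitive region, and the function name estimate_hit_probability are all hypothetical.

```python
import random

def estimate_hit_probability(die_w, die_h, region, trials=100_000, seed=0):
    """Estimate the chance that a random strike lands in a sensitive region.

    region is (x0, y0, width, height) in the same units as the die.
    A fixed seed keeps the estimate repeatable from run to run.
    """
    rng = random.Random(seed)
    x0, y0, w, h = region
    hits = 0
    for _ in range(trials):
        # Pick a strike location uniformly over the die.
        x = rng.uniform(0, die_w)
        y = rng.uniform(0, die_h)
        if x0 <= x <= x0 + w and y0 <= y <= y0 + h:
            hits += 1
    return hits / trials

# Hypothetical 10 x 10 die with a 1 x 1 sensitive region.
p = estimate_hit_probability(10.0, 10.0, (2.0, 2.0, 1.0, 1.0))
print(f"Estimated hit probability: {p:.4f} (area ratio is 0.0100)")
```

For a plain rectangle the area ratio gives the answer directly. The sampling approach only starts to pay off when the sensitive regions are irregular or when you add angle and near-miss effects to the hit test.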

Soft-Error Rates & Failures in Time

Since it is impossible to know exactly when or where a cosmic ray impact will occur, electrical engineers are left to statistics to determine the probability of impact.  Manufacturers provide cosmic-ray data separate from terrestrial radiation sources in terms of Failure-in-Time, or FIT units.  Memory suppliers provide data normalized to FIT/Mb or FIT/bit.  One FIT unit is equal to one failure in 1 billion device hours (10^9 h).  That is not to say that a single device with a FIT of 1 will work for 1 billion hours.  Instead, the metric is applied to large groups of products.

For example, if your circuit were installed on 500,000 cars and had a single transistor with a 14 FIT rating, you could anticipate a failure somewhere in the fleet about every 6 days.
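The arithmetic behind that fleet estimate can be checked with a few lines of code. This is a sketch that assumes every device runs around the clock and that failures arrive at a constant average rate; the function name fleet_hours_between_failures is ours.

```python
DEVICE_HOURS_PER_FIT = 1e9  # 1 FIT = 1 failure per 10^9 device hours

def fleet_hours_between_failures(fit_per_device, device_count):
    """Average hours between failures across a whole fleet of devices."""
    failures_per_hour = fit_per_device * device_count / DEVICE_HOURS_PER_FIT
    return 1.0 / failures_per_hour

hours = fleet_hours_between_failures(fit_per_device=14, device_count=500_000)
days = hours / 24

print(f"About one failure every {hours:.0f} hours ({days:.1f} days)")
```

With 14 FIT across 500,000 devices, the fleet accumulates 0.007 failures per hour, or one failure roughly every 143 hours.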

This might be acceptable for a fan on an air-sensor unit that measures cabin air quality.  But it would be a wholly unacceptable risk for a device that controls, say, the car's throttle on a Toyota Prius.

As another example, 700 FIT/Mbit might be an appropriate cosmic-ray induced soft-error rate for a plane flying at an altitude of 1.4 miles.  How many SER events can be expected in 512 MByte (4.096×10^9 bits) of memory each month?
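One way to work that question out is sketched below. It assumes a 30-day month (720 hours) of continuous operation and treats 1 Mbit as 10^6 bits, which matches the 4.096×10^9 bit figure above. The function name expected_soft_errors is ours.

```python
DEVICE_HOURS_PER_FIT = 1e9  # 1 FIT = 1 failure per 10^9 device hours

def expected_soft_errors(fit_per_mbit, total_bits, hours):
    """Average number of soft errors in a memory over a span of hours."""
    megabits = total_bits / 1e6              # assumes 1 Mbit = 10^6 bits
    total_fit = fit_per_mbit * megabits      # FIT for the whole memory
    return total_fit / DEVICE_HOURS_PER_FIT * hours

hours_per_month = 30 * 24  # assumed 30-day month
errors = expected_soft_errors(fit_per_mbit=700,
                              total_bits=4.096e9,
                              hours=hours_per_month)

print(f"Expected soft errors per month: {errors:.2f}")
```

Under those assumptions the memory totals 2,867,200 FIT, which works out to roughly two soft errors per month.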

As everywhere else in electronic device manufacturing, “specmanship” has invaded rad-hard datasheets.  So before you dismiss one electronics manufacturer for another, you should research how their devices were tested.  Were they accelerated or actual tests?  How were the FIT values determined?

The JEDEC standard JESD89A, “Measurement and Reporting of Alpha Particle and Terrestrial Cosmic Ray Induced Soft Errors in Semiconductor Devices,” details how to test and report the soft-error rate of custom electronics designs.  It is also important to note that radioactive decay, thermal neutrons, and contaminants can induce errors in addition to cosmic ray activity.

Summary

The first article in this series detailed what cosmic rays are and how they impact electronic designs.  This article covered how cosmic rays and other radiation affect your integrated circuits, and how failure rates are estimated.  The next article will cover how you, the electrical engineer, can fight back against the damaging effects of radiation.
