Quarterly Bulletin
 
ECC solves inevitable bit errors in RAM
 
 

Bit errors occur about once a week in 4 GB of RAM because of background radiation. Between 2% and 15% of these errors lead to faulty calculations, system crashes or unpredictable behavior. At the lowest estimate, that means roughly one serious incident per year in the computer system.
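The yearly estimate follows directly from the two figures above. A minimal sketch of the arithmetic, where the 52-week year and the names used are assumptions made for illustration:

```python
# Figures from the paragraph above; the 52-week year is an added assumption.
ERRORS_PER_WEEK = 1               # bit errors per week in 4 GB of RAM
WEEKS_PER_YEAR = 52
SERIOUS_FRACTION = (0.02, 0.15)   # share of bit errors with serious consequences


def serious_incidents_per_year(errors_per_week, serious_fraction):
    """Expected number of serious incidents per year for one system."""
    return errors_per_week * WEEKS_PER_YEAR * serious_fraction


low, high = (
    serious_incidents_per_year(ERRORS_PER_WEEK, fraction)
    for fraction in SERIOUS_FRACTION
)
print(f"Serious incidents per year: {low:.1f} to {high:.1f}")
```

With the lower bound of 2%, the result is about one incident per year; the upper bound of 15% gives close to eight.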

 
The problem has been known for many years, and the solution is Error Correcting Code (ECC). Until now, ECC has been limited to high-end processors. The latest low-power processors from AMD and Intel not only introduce single-chip architectures but also bring ECC support to a much wider range of applications. Prices for ECC memory are expected to decrease as the technology is used more widely and production volumes increase.
 
A bit error in memory is a spontaneous change of a bit from its correct state to an incorrect one. The causes fall into two groups. So-called hard errors are caused by, for instance, large temperature variations or mechanical stress. Soft errors are caused by background radiation. Soft errors are several orders of magnitude more frequent than hard errors.
 

Detecting and correcting errors using an 8-bit checksum

With ECC, an 8-bit checksum is calculated whenever data is written to RAM. The checksum is stored along with the 64 bits of data from which it was calculated. When the data (and its checksum) is read back from RAM, the processor calculates the checksum once again. These additional calculations decrease performance by 2% to 4% compared to a non-ECC system.
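Why 8 check bits for 64 data bits? The article does not name the code used, so the sketch below assumes the extended Hamming (single-error-correcting, double-error-detecting) construction that is widely used for ECC memory. The function names are chosen for illustration only.

```python
def hamming_check_bits(data_bits):
    """Smallest r with 2**r >= data_bits + r + 1 (enough to locate one bad bit)."""
    r = 0
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r


def secded_check_bits(data_bits):
    """Hamming check bits plus one overall parity bit for double-error detection."""
    return hamming_check_bits(data_bits) + 1


DATA_BITS = 64
check_bits = secded_check_bits(DATA_BITS)
word_width = DATA_BITS + check_bits
overhead = check_bits / DATA_BITS

print(f"{check_bits} check bits, {word_width}-bit stored word, "
      f"{overhead:.1%} extra memory")
```

Under this assumption, 7 bits are needed to point out a single faulty bit and 1 more to detect double errors, which gives the 8 check bits per 64 data bits described above and a storage overhead of 12.5%.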
A field study published in 2009 (DRAM Errors in the Wild: A Large-Scale Field Study) examined thousands of Google servers. Its results included mean correctable error rates of 2000-6000 per GB per year.
The checksum calculated by the processor is compared with the checksum read from memory. The result of the comparison is used to detect and correct single-bit errors. The algorithm can also detect double-bit errors in the same 64-bit word, but it cannot correct them.
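The write, read and compare steps can be sketched in software. This is a simplified illustration that assumes an extended Hamming code over a 72-bit word; real memory controllers do this in hardware and may use a different bit layout. All names below are hypothetical.

```python
DATA_BITS = 64
CODE_BITS = 72                           # position 0: overall parity bit
PARITY_POS = [1, 2, 4, 8, 16, 32, 64]    # Hamming check bits at powers of two
DATA_POS = [p for p in range(1, CODE_BITS) if p not in PARITY_POS]


def syndrome(bits):
    """XOR of the positions (1..71) of all set bits; zero for a valid word."""
    s = 0
    for pos in range(1, CODE_BITS):
        if bits[pos]:
            s ^= pos
    return s


def encode(data):
    """Write path: place 64 data bits and compute the 8 check bits."""
    bits = [0] * CODE_BITS
    for i, pos in enumerate(DATA_POS):
        bits[pos] = (data >> i) & 1
    s = syndrome(bits)
    for p in PARITY_POS:
        bits[p] = 1 if s & p else 0      # forces the syndrome to zero
    bits[0] = sum(bits) % 2              # makes the total number of ones even
    return bits


def decode(bits):
    """Read path: recompute the check bits, then correct or flag the word."""
    bits = list(bits)
    s = syndrome(bits)
    parity_ok = sum(bits) % 2 == 0
    if s == 0 and parity_ok:
        status = "ok"
    elif not parity_ok and s < CODE_BITS:
        bits[s] ^= 1                     # single-bit error at position s
        status = "corrected"
    else:
        return None, "uncorrectable"     # e.g. two flipped bits
    data = 0
    for i, pos in enumerate(DATA_POS):
        data |= bits[pos] << i
    return data, status


word = 0x0123456789ABCDEF
stored = encode(word)
stored[37] ^= 1                          # simulate one flipped bit
print(decode(stored))
stored[5] ^= 1                           # simulate a second flipped bit
print(decode(stored))
```

In this sketch a single flipped bit is located by the syndrome and flipped back, while two flipped bits leave the overall parity even with a non-zero syndrome, so the word is reported as uncorrectable rather than silently returned.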
 
ECC support has mostly been found in high-end processors. It has been used to detect and correct memory errors mainly in server, financial and banking applications, where the tolerance for errors is extremely low.
 

ECC support in processors from Intel and AMD

The recently released low-power Intel Atom processor E3800 series and AMD Embedded G-Series SoC both support ECC. ECC is thus available in processors in the 20-dollar segment, which introduces it to a large and completely new market segment.
 
Medical technology and aerospace are two relevant application areas for ECC. Both areas have strict requirements for safety and reliability. Aerospace faces an additional challenge since radiation increases at high altitudes. Both application areas are subject to safety standards and regulatory approval. Other areas where ECC may be used for safety reasons are oil and gas, marine, offshore, rolling stock and transportation, where embedded computers are vital for safety.

Products with ECC support

H1503 - Fanless Box PC

- Intel Atom E3845/E3815
- 2GB DDR3 onboard, ECC
- mSATA, eMMC (option)
- 2.5 inch SATA drive bay
- Wi-Fi, 3G, GPRS

H6828 - COM Express module

- 4th gen. Intel® Core™ i3/i5/i7 processors
- Two 204-pin DDR3L SODIMM sockets
- DDR3L 1333/1600MHz SODIMM
- Up to 16GB system memory

The price difference between RAM with and without ECC is small for large densities. The reason is that high-density RAM is aimed at server applications, where ECC is used more or less always today. Production volumes for ECC memory modules are large and prices are therefore competitive.
 
Expect to find bigger price differences for the densities and form factors (<16GB, SO-DIMM or soldered) commonly used in embedded applications. The additional cost of ECC support in RAM has decreased somewhat recently and can be expected to decrease further, since AMD and Intel have both introduced ECC support in their low-power SoCs for the embedded segment.
 

Targeting new application areas

Decreasing prices and increasing availability of ECC memory aimed at the embedded segment will promote its use in applications outside the strictly safety-critical areas. For example, ECC could be used to avoid miscalculations when forest harvesters collect data for billing purposes while logging trees. Avoiding at least a couple of system crashes during the lifetime of the product, along with the time, cost and loss of goodwill associated with them, may be motivation enough to introduce ECC in the application.

We would like to increase awareness of memory bit errors: their causes, effects and solutions. We believe that ECC memory will be of interest to new types of applications and additional customers, and that production volumes will increase and prices drop. The driving forces behind this scenario are the new low-power processors with ECC support from the two big x86 processor manufacturers and the resulting availability of processor platforms in the 20-dollar segment that support the functionality. One memory bit error a week, and possibly a couple of serious incidents every year, may be avoided. That is why it is well worth evaluating the investment in ECC for your application.
 

Hectronic AB | Phone: +46 18 660 700 | E-mail: info@hectronic.se | Sitemap | Cookies
© 2017 Hectronic