Memory scrubbing

Memory scrubbing is the process of detecting and correcting bit errors in computer memory by using an error-correcting code (ECC).

Motivation for scrubbing

Due to the high integration density of contemporary computer memory chips, the individual memory cell structures have become small enough to be vulnerable to cosmic rays and alpha particle emission. The errors caused by these phenomena are called soft errors. They can be a problem for both DRAM-based and SRAM-based memories.

The probability of a soft error at any individual memory bit is very small. However,

  • together with the large amount of memory with which computers, especially servers, are equipped nowadays,
  • and together with several months of uptime,

the probability of a soft error somewhere in the total installed memory is significant.
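
The scaling argument above can be sketched numerically. The following is a minimal illustration only: it assumes independent bit flips at a constant rate (a Poisson model), and the per-bit rate, memory size, and uptime are hypothetical placeholders, not measured values.

```python
import math

def prob_at_least_one_error(bits, rate_per_bit_hour, hours):
    """P(at least one soft error), assuming independent, constant-rate bit flips."""
    expected_errors = bits * rate_per_bit_hour * hours
    # 1 - exp(-x), computed with expm1 so that very small x does not round to zero
    return -math.expm1(-expected_errors)

# All three figures below are assumptions chosen only for illustration.
ASSUMED_RATE = 1e-15                 # soft errors per bit per hour (hypothetical)
SERVER_BITS = 64 * 2**30 * 8         # a hypothetical server with 64 GiB of memory
UPTIME_HOURS = 24 * 90               # roughly three months of uptime

single_bit = prob_at_least_one_error(1, ASSUMED_RATE, UPTIME_HOURS)
whole_memory = prob_at_least_one_error(SERVER_BITS, ASSUMED_RATE, UPTIME_HOURS)

print(f"one bit:      {single_bit:.3e}")
print(f"whole memory: {whole_memory:.3f}")
```

Under these assumed numbers, the per-bit probability is negligible while the probability for the whole memory is well above one half; different assumed rates shift the figures but not the shape of the argument.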

ECC support for scrubbing

The information in an ECC memory is stored with enough redundancy to correct a single-bit error in each memory word. Hence, an ECC memory can support scrubbing of the memory content: if the memory controller scans systematically through the memory, single-bit errors can be detected, the erroneous bit can be located using the ECC checksum, and the corrected data can be written back to the memory.
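
How a checksum can locate a single flipped bit can be illustrated with a toy Hamming(7,4) code. This is a sketch for intuition only: real ECC modules use wider codes over whole memory words, and all function names here are hypothetical.

```python
def hamming74_encode(nibble):
    """Encode 4 data bits into a 7-bit codeword (positions 1..7, parity at 1, 2, 4)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    p1 = d[0] ^ d[1] ^ d[3]   # makes parity even over positions 1, 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]   # makes parity even over positions 2, 3, 6, 7
    p4 = d[1] ^ d[2] ^ d[3]   # makes parity even over positions 4, 5, 6, 7
    bits = [p1, p2, d[0], p4, d[1], d[2], d[3]]   # list index i holds position i + 1
    return sum(bit << i for i, bit in enumerate(bits))

def hamming74_syndrome(codeword):
    """XOR of the positions of all set bits: 0 if valid, else the flipped position."""
    syndrome = 0
    for position in range(1, 8):
        if (codeword >> (position - 1)) & 1:
            syndrome ^= position
    return syndrome

def hamming74_correct(codeword):
    """Return (corrected codeword, syndrome); corrects at most one flipped bit."""
    syndrome = hamming74_syndrome(codeword)
    if syndrome:
        codeword ^= 1 << (syndrome - 1)
    return codeword, syndrome

def hamming74_decode(codeword):
    """Extract the 4 data bits from positions 3, 5, 6 and 7."""
    return (((codeword >> 2) & 1)
            | (((codeword >> 4) & 1) << 1)
            | (((codeword >> 5) & 1) << 2)
            | (((codeword >> 6) & 1) << 3))

stored = hamming74_encode(0b1011)
corrupted = stored ^ (1 << 4)            # simulate a soft error at position 5
repaired, syndrome = hamming74_correct(corrupted)
print(syndrome, repaired == stored, bin(hamming74_decode(repaired)))
```

A scrubber would perform the correction step on every word it reads and write the repaired word back. This toy code miscorrects when two bits flip in the same word, which is why the errors must be caught while they are still single.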

Scrubbing in more detail

Each memory location must be checked periodically, and often enough that multiple bit errors within the same word remain unlikely between two checks: usual ECC memory modules (as of 2008) can correct single-bit errors, but cannot correct multiple-bit errors.
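
The effect of the scrub interval can be estimated with a small calculation. The sketch below assumes independent bit flips at a constant rate (a Poisson model) and that each scrub pass clears all single-bit errors; the word width, error rate, and intervals are hypothetical values.

```python
import math

def prob_multi_bit_error(word_bits, rate_per_bit_hour, interval_hours):
    """P(two or more flips in one word between two scrubs), Poisson model."""
    mean = word_bits * rate_per_bit_hour * interval_hours
    if mean < 1e-4:
        # Leading terms of the series of 1 - exp(-m) * (1 + m); the direct
        # formula loses all precision through cancellation when m is tiny.
        return mean * mean / 2.0 * (1.0 - 2.0 * mean / 3.0)
    return 1.0 - math.exp(-mean) * (1.0 + mean)

WORD_BITS = 72          # hypothetical: 64 data bits plus 8 check bits
ASSUMED_RATE = 1e-15    # soft errors per bit per hour (hypothetical)

daily = prob_multi_bit_error(WORD_BITS, ASSUMED_RATE, 24.0)
every_two_days = prob_multi_bit_error(WORD_BITS, ASSUMED_RATE, 48.0)

print(f"ratio: {every_two_days / daily:.3f}")
```

In this model the per-interval probability grows roughly with the square of the interval, so doubling the interval makes an uncorrectable error in a given word about four times as likely per interval, and about twice as likely per unit of time.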

In order not to disturb regular memory requests from the CPU, and thus to avoid degrading performance, scrubbing is usually done only during idle periods. Because scrubbing consists of normal read and write operations, it may increase the memory's power consumption compared with operation without scrubbing. Therefore, scrubbing is performed periodically rather than continuously. On many servers, the scrub period can be configured in the BIOS setup program.

Normal memory reads issued by the CPU or DMA devices are also checked for ECC errors, but because of data locality they may be confined to a small range of addresses, leaving other memory locations untouched for a very long time. Those locations can accumulate more than one soft error, whereas scrubbing ensures that the whole memory is checked within a guaranteed time.
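
The difference in coverage can be sketched with a small model. It is an illustration under stated assumptions, not a description of any real memory controller: the memory size, the per-tick scrub budget, the "hot" address range, and the name scrub_schedule are all hypothetical.

```python
import math
from itertools import islice

def scrub_schedule(num_words, words_per_tick):
    """Yield the address range to scrub on each idle tick, wrapping around forever."""
    start = 0
    while True:
        end = min(start + words_per_tick, num_words)
        yield range(start, end)
        start = end % num_words

NUM_WORDS = 1000          # hypothetical memory size, in words
WORDS_PER_TICK = 64       # hypothetical scrub budget per idle tick
HOT_RANGE = range(0, 50)  # hypothetical addresses the CPU keeps reading

# Ordinary reads only ever check the hot range, however long the system runs.
checked_by_cpu = set(HOT_RANGE)

# The scrubber walks every address, so one full pass takes a bounded number of ticks.
ticks_for_full_pass = math.ceil(NUM_WORDS / WORDS_PER_TICK)
checked_by_scrubber = set()
for chunk in islice(scrub_schedule(NUM_WORDS, WORDS_PER_TICK), ticks_for_full_pass):
    checked_by_scrubber.update(chunk)

print(len(checked_by_cpu), "words checked by CPU reads alone")
print(len(checked_by_scrubber), "words checked after one scrub pass")
```

The bound on the time between two checks of the same word is the number of ticks per pass multiplied by the longest gap between idle ticks, which is what makes the guarantee possible.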

On some systems, scrubbing can be applied not only to the main memory (DRAM-based) but also to the CPU caches (SRAM-based). On most such systems, the scrubbing rates for the two can be set independently. Because a cache is much smaller than the main memory, cache scrubbing does not need to happen as frequently.

Because memory scrubbing increases reliability, it can be classified as a RAS (reliability, availability, serviceability) feature.

Wikimedia Foundation. 2010.
