Memory ordering

Memory ordering is a group of properties of modern microprocessors that characterise how they may reorder memory operations. It is a type of out-of-order execution. Memory reordering can be used to make fuller use of different caches and memory banks.

On most modern uniprocessors, memory operations are not executed in the order specified by the program code. From the programmer's point of view, however, all operations appear to have been executed in the order specified, with all inconsistencies hidden by hardware.


In SMP microprocessor systems

There are several memory-consistency models for SMP systems:

  • sequential consistency (all reads and all writes are in order)
  • relaxed consistency (some types of reordering are allowed)
    • Loads can be reordered after Loads (this helps cache coherency work efficiently and improves scaling)
    • Loads can be reordered after Stores
    • Stores can be reordered after Stores
    • Stores can be reordered after Loads
  • weak consistency (reads and writes can be arbitrarily reordered, limited only by explicit memory barriers)
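
The "Stores can be reordered after Loads" case is commonly illustrated with a two-thread store-buffering test. The following is a minimal sketch using C11 atomics; the names x, y, r0, r1 and the sb_* functions are illustrative and do not come from the cited sources. With memory_order_seq_cst, as written, at least one thread must observe the other's store, so the outcome r0 == 0 and r1 == 0 cannot occur. If memory_order_relaxed is substituted, that outcome is permitted by the language and can be observed on hardware that lets a store be reordered after a later load.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Shared variables for the test (illustrative names). */
static atomic_int x, y;
static int r0, r1;          /* each result is written by exactly one thread */

/* Put the test back into its initial state: x == y == 0. */
void sb_reset(void)
{
    atomic_store(&x, 0);
    atomic_store(&y, 0);
    r0 = r1 = -1;
}

/* Thread 0: store to x, then load from y. */
void *sb_thread0(void *arg)
{
    (void)arg;
    atomic_store_explicit(&x, 1, memory_order_seq_cst);
    r0 = atomic_load_explicit(&y, memory_order_seq_cst);
    return NULL;
}

/* Thread 1: store to y, then load from x. */
void *sb_thread1(void *arg)
{
    (void)arg;
    atomic_store_explicit(&y, 1, memory_order_seq_cst);
    r1 = atomic_load_explicit(&x, memory_order_seq_cst);
    return NULL;
}

/* Non-zero if both loads missed the other thread's store. */
int sb_both_zero(void)
{
    return r0 == 0 && r1 == 0;
}
```

Both thread functions have the signature expected by pthread_create, so a harness would typically call sb_reset(), start the two functions on separate threads, join them, and then check sb_both_zero(), repeating many times.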

On some CPUs atomic operations can be reordered with Loads and Stores.

Two further properties appear on some architectures:

  • Dependent loads reordered: this is unique to Alpha. This processor can fetch data before it fetches the pointer to that data. This makes the cache hardware simpler and faster, but it means that both readers and writers require memory barriers.
  • Incoherent instruction cache pipeline: this prevents self-modifying code from being executed without special instruction-cache flush/reload instructions.
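
The dependent-loads case can be sketched with a pointer that one thread publishes and another dereferences. The example below is an illustration under assumptions, not code from the cited sources: struct item, slot, shared, publish_item and read_item are hypothetical names. The writer uses a release store and the reader an acquire load on the pointer, which in C11 is sufficient on every architecture, including one that could otherwise fetch the pointed-to data before the pointer itself.

```c
#include <stdatomic.h>
#include <stddef.h>

struct item { int value; };                /* hypothetical payload type */

static struct item slot;                   /* storage being published */
static _Atomic(struct item *) shared;      /* NULL until published */

/* Writer: fill in the item, then publish its address.
 * The release store orders the write to slot.value before the pointer store. */
void publish_item(int value)
{
    slot.value = value;
    atomic_store_explicit(&shared, &slot, memory_order_release);
}

/* Reader: load the pointer, then dereference it.
 * The acquire load keeps the dependent read of p->value from being
 * satisfied with data older than the pointer. Returns fallback if
 * nothing has been published yet. */
int read_item(int fallback)
{
    struct item *p = atomic_load_explicit(&shared, memory_order_acquire);
    return p ? p->value : fallback;
}
```
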
Memory ordering in some architectures [1][2]
(Y = the architecture exhibits the property; - = it does not)
Type                                   Alpha  ARMv7  PA-RISC  POWER  SPARC RMO  SPARC PSO  SPARC TSO  x86  x86 oostore  AMD64  IA64  zSeries
Loads reordered after Loads            Y      Y      Y        Y      Y          -          -          -    Y            -      Y     -
Loads reordered after Stores           Y      Y      Y        Y      Y          -          -          -    Y            -      Y     -
Stores reordered after Stores          Y      Y      Y        Y      Y          Y          -          -    Y            -      Y     -
Stores reordered after Loads           Y      Y      Y        Y      Y          Y          Y          Y    Y            Y      Y     Y
Atomic reordered with Loads            Y      Y      -        Y      Y          -          -          -    -            -      Y     -
Atomic reordered with Stores           Y      Y      -        Y      Y          Y          -          -    -            -      Y     -
Dependent Loads reordered              Y      -      -        -      -          -          -          -    -            -      -     -
Incoherent Instruction cache pipeline  Y      Y      -        Y      Y          Y          Y          Y    Y            -      Y     Y

Some older x86 and AMD systems have weaker memory ordering.[3]

SPARC memory ordering modes:

  • SPARC TSO = total-store order (default)
  • SPARC RMO = relaxed-memory order (not supported on recent CPUs)
  • SPARC PSO = partial store order (not supported on recent CPUs)

Memory barrier types

Compiler memory barrier

These barriers prevent a compiler from reordering instructions; they do not prevent reordering by the CPU.

  • The GNU inline assembler statement
asm volatile("" ::: "memory");

or, equivalently,

__asm__ __volatile__ ("" ::: "memory");

forbids the GCC compiler from reordering read and write commands around it.[4]

  • Intel C++ Compiler (ECC):[5][6]
__memory_barrier()

  • Microsoft Visual C++ Compiler:[7]
_ReadWriteBarrier()
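
As a minimal sketch of how a compiler barrier is typically used, the example below wraps the GNU statement in a macro and places it between a data write and a flag write. The names compiler_barrier, payload, ready, cb_produce and cb_consume are illustrative only, and the code assumes GCC or a compatible compiler. The barrier constrains only the compiler: on a CPU that reorders the relevant stores or loads, a hardware barrier would still be needed for correct communication between processors.

```c
/* GCC-style compiler barrier (assumes GCC or a compatible compiler). */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

static int payload;   /* data handed from producer to consumer */
static int ready;     /* set to 1 once payload is valid */

/* Write the data, then the flag. The barrier stops the compiler from
 * moving the store to payload after the store to ready. */
void cb_produce(int value)
{
    payload = value;
    compiler_barrier();
    ready = 1;
}

/* Read the flag, then the data. The barrier forces the compiler to
 * read payload after the check of ready rather than reuse an earlier
 * value. Returns 1 and fills *out on success, 0 otherwise. */
int cb_consume(int *out)
{
    if (!ready)
        return 0;
    compiler_barrier();
    *out = payload;
    return 1;
}
```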

Hardware memory barrier

Many architectures with SMP support have special hardware instructions for flushing pending reads and writes.

  • x86, x86-64
lfence (asm), void _mm_lfence(void)
sfence (asm), void _mm_sfence(void) [8]
mfence (asm), void _mm_mfence(void) [9]
  • PowerPC
sync (asm)
  • POWER
dcs (asm)
  • ARMv7
dmb (asm)

GCC (since version 4.1.0) and the Intel C++ Compiler provide a builtin that issues a full hardware memory barrier:

__sync_synchronize()

In GCC, this builtin also issues the asm compiler memory barrier (see "Compiler memory barrier" above).
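
The following sketch shows one way the builtin might be used to hand a value from one processor to another. The names data, flag, hw_publish and hw_try_read are hypothetical, and the code assumes GCC or a compiler that provides __sync_synchronize(). It is written in the older barrier-based style discussed here; under the C11 memory model, unsynchronised plain accesses of this kind are formally a data race, and atomic operations would be used instead.

```c
static int data;            /* value being handed over */
static volatile int flag;   /* becomes 1 once data is valid */

/* Writer: store the data, issue a full barrier, then set the flag.
 * The barrier orders the store to data before the store to flag,
 * for both the compiler and the CPU. */
void hw_publish(int value)
{
    data = value;
    __sync_synchronize();
    flag = 1;
}

/* Reader: check the flag, issue a full barrier, then load the data.
 * The barrier orders the load of flag before the load of data.
 * Returns 1 and fills *out if the data was available, 0 otherwise. */
int hw_try_read(int *out)
{
    if (!flag)
        return 0;
    __sync_synchronize();
    *out = data;
    return 1;
}
```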

References

  1. ^ Memory Ordering in Modern Microprocessors by Paul McKenney
  2. ^ Memory Barriers: a Hardware View for Software Hackers, Figure 5 on Page 16
  3. ^ Table 1. Summary of Memory Ordering, from "Memory Ordering in Modern Microprocessors, Part I"
  4. ^ GCC compiler-gcc.h
  5. ^ ECC compiler-intel.h
  6. ^ Intel(R) C++ Compiler Intrinsics Reference: "Creates a barrier across which the compiler will not schedule any data access instruction. The compiler may allocate local data in registers across a memory barrier, but not global data."
  7. ^ Visual C++ Language Reference _ReadWriteBarrier
  8. ^ SFENCE — Store Fence
  9. ^ MFENCE — Memory Fence

Wikimedia Foundation. 2010.
