Prefetch input queue

Most modern processors load their instructions some clock cycles before they execute them. This is achieved by pre-loading machine code from memory into a prefetch input queue (PIQ).

This behavior only applies to von Neumann computers (that is, not Harvard architecture computers) that can run self-modifying code and have some sort of instruction pipelining. Nearly all computers fulfill these three requirements.

Usually, the prefetching behavior of the PIQ is invisible to the programming model of the CPU. However, there are some circumstances in which the behavior of the PIQ is visible and has to be taken into account by the programmer.

When an x86 processor switches from real mode to protected mode or vice versa, the PIQ has to be flushed; otherwise the CPU continues to decode the queued machine code as if it were written for the previous mode. If the PIQ is not flushed, the processor may decode the instructions incorrectly and generate an invalid instruction exception.

When executing self-modifying code, a change to the machine code immediately ahead of the current point of execution might not change how the processor interprets that code, because it has already been loaded into the PIQ. The processor simply executes the old copy held in the PIQ instead of the new, altered version of the code in RAM and/or cache.

This behavior of the PIQ can be used to determine whether code is being executed inside an emulator or directly on the hardware of a real CPU, since most emulators probably do not simulate it. If the measured PIQ size is zero (changes in the code always affect the state of the processor immediately), it can be deduced that either the code is being executed in an emulator or the processor invalidates the PIQ upon writes to addresses already loaded in the PIQ.

x86 example code

    code_starts_here:
        mov eax, ahead
        mov word [eax], 0x9090    ; overwrite the jump with two NOPs
    ahead:
        jmp short to_the_end      ; short jump: two bytes of machine code
        ; Some other code
    to_the_end:

This self-modifying program overwrites the "jmp short to_the_end" with two NOPs (each NOP is the byte 0x90, so the pair is written as the word 0x9090). A short jump is assembled into exactly two bytes of machine code (a near jump would take three or five bytes, depending on the operand size), so the two NOPs overwrite this jump and nothing else. (That is, the jump is replaced with do-nothing code.)

Because the machine code of the jump has already been read into the PIQ, and has probably also already been executed by the processor (superscalar processors execute several instructions at once, but "pretend" that they do not for the sake of backward compatibility), the change to the code does not alter the execution flow.
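The effect can be reproduced on a toy model. The following Python sketch is not an x86 emulator: the instruction tuples, the function name run and the parameter piq_size are all hypothetical, chosen only to illustrate how a prefetched copy of a jump survives a write to memory. It assumes that the machine fetches the instruction about to execute plus piq_size further instructions, and that a taken jump discards the queue.

```python
from collections import deque

def run(program, piq_size):
    """Execute a toy program and return the list of printed strings.

    Each memory cell holds one instruction tuple: ("nop",), ("halt",),
    ("print", text), ("jmp", target) or ("store", address, instruction).
    """
    memory = list(program)   # writable, so the program can patch itself
    queue = deque()          # copies of instructions fetched ahead of time
    fetch_pc = 0             # next address the prefetcher will read
    output = []
    while True:
        # Prefetch: the instruction about to run plus piq_size more.
        while len(queue) < piq_size + 1 and fetch_pc < len(memory):
            queue.append(memory[fetch_pc])
            fetch_pc += 1
        op = queue.popleft()
        if op[0] == "store":
            memory[op[1]] = op[2]      # changes memory, not the queue
        elif op[0] == "jmp":
            queue.clear()              # a taken jump discards the queue
            fetch_pc = op[1]
        elif op[0] == "print":
            output.append(op[1])
        elif op[0] == "halt":
            return output

program = [
    ("store", 1, ("nop",)),      # overwrite the jump at address 1
    ("jmp", 3),                  # already prefetched when the store runs
    ("print", "fell through"),   # stands in for "some other code"
    ("halt",),                   # stands in for to_the_end
]

print(run(program, piq_size=4))  # [] : the stale jump is still taken
print(run(program, piq_size=0))  # ['fell through'] : the patched NOP runs
```

With any non-zero queue size the overwritten jump is executed from the queue, so the code after it is skipped. With a queue size of zero the toy machine reads every instruction straight from memory and sees the patch immediately.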

Example program to detect the size of the PIQ

This is an example NASM-syntax self-modifying x86-assembly algorithm that determines the size of the PIQ:

    code_starts_here:
        xor bx, bx                      ; zero register bx
        xor ax, ax                      ; zero register ax

        mov dx, cs
        mov [code_segment], dx          ; "calculate" codeseg in the far jump below (edx here too)

    around:
        cmp ax, 1                       ; check if ax has been altered
        je found_size

        mov byte [nop_field+bx], 0x90   ; 0x90 = opcode "nop" (NO oPeration)
        inc bx

        db 0xEA                         ; 0xEA = opcode "far jump"
        dw flush_queue                  ; should be followed by offset (rm = "dw", pm = "dd")
    code_segment:
        dw 0                            ; and then the code segment (calculated above)
    flush_queue:
        mov byte [nop_field+bx], 0x40   ; 0x40 = opcode "inc ax" (INCrease ax)
    nop_field:
        times 256 nop
        jmp around
    found_size:
        ;
        ; register bx now contains the size of the PIQ
        ; this code is for real mode and 16-bit protected mode, but it could
        ; easily be changed to run in 32-bit protected mode as well: just
        ; change the "dw" for the offset to "dd". you also need to change dx
        ; to edx at the top. (dw and dx = 16-bit addressing, dd and edx =
        ; 32-bit addressing)
        ;

This code changes its own execution flow and determines by brute force how large the PIQ is, in effect asking: "How far ahead of me do I have to change the code for the change to affect me?" If the modified location is too near (it is already in the PIQ), the update has no effect. If it is far enough away, the change affects the program, and the program has then found the size of the processor's PIQ. If this code is executed in protected mode, the operating system must not perform a context switch while it runs, or else the program may return the wrong value.
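The brute-force search described above can be mimicked on a toy model. The Python sketch below is an illustration under stated assumptions, not a model of any real processor: the instruction tuples and the names change_takes_effect and measure_piq are hypothetical, and the machine is assumed to fetch the instruction about to execute plus piq_size further instructions. For each distance it patches a marker instruction that far ahead and checks whether the marker is ever executed.

```python
from collections import deque

def change_takes_effect(distance, piq_size, length=64):
    """Return True if a store `distance` cells ahead is seen by the toy CPU."""
    memory = [("nop",)] * length + [("halt",)]
    memory[0] = ("store", distance, ("mark",))  # patch the code ahead
    queue = deque()      # copies of instructions fetched ahead of time
    fetch_pc = 0         # next address the prefetcher will read
    marked = False
    while True:
        # Prefetch: the instruction about to run plus piq_size more.
        while len(queue) < piq_size + 1 and fetch_pc < len(memory):
            queue.append(memory[fetch_pc])
            fetch_pc += 1
        op = queue.popleft()
        if op[0] == "store":
            memory[op[1]] = op[2]   # changes memory, not the queue
        elif op[0] == "mark":
            marked = True           # the patched instruction was executed
        elif op[0] == "halt":
            return marked

def measure_piq(piq_size, max_distance=32):
    """Find the queue size by probing ever larger distances."""
    for distance in range(1, max_distance + 1):
        if change_takes_effect(distance, piq_size):
            return distance - 1     # all nearer cells were already queued
    return None                     # queue is larger than the probed range

print(measure_piq(6))   # 6
print(measure_piq(0))   # 0
```

A result of zero corresponds to the case in which every change to the code takes effect immediately, which is what an emulator without a simulated queue, or a processor that invalidates its queue on writes, would report.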

See also

* Instruction pipeline
* Assembly language
* CPU design


Wikimedia Foundation. 2010.
