FPS AP-120B


The FPS AP-120B was a pipeline-oriented array processor manufactured by Floating Point Systems. It was designed to be attached to a host computer, such as a DEC PDP-11, as a fast number-cruncher. Data was transferred between the host and the array processor over a direct memory access (DMA) connection.

Processor cycle time was 167 nanoseconds, corresponding to a clock rate of about 6 MHz. Since it could present two floating-point results per cycle, one from the adder and the other from the multiplier, a peak capacity of 12 megaflops was claimed for the processor.
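The arithmetic behind the claimed figure can be checked with a short Python sketch. The cycle time and the two results per cycle come from the text; the function names are illustrative only.

```python
CYCLE_TIME_NS = 167      # processor cycle time, from the text
RESULTS_PER_CYCLE = 2    # one from the adder, one from the multiplier


def clock_mhz(cycle_time_ns: float) -> float:
    """Clock rate in MHz for a cycle time given in nanoseconds."""
    return 1_000.0 / cycle_time_ns


def peak_megaflops(cycle_time_ns: float, results_per_cycle: int) -> float:
    """Peak floating-point results per second, in millions."""
    return clock_mhz(cycle_time_ns) * results_per_cycle


print(round(clock_mhz(CYCLE_TIME_NS), 2))                          # 5.99
print(round(peak_megaflops(CYCLE_TIME_NS, RESULTS_PER_CYCLE), 2))  # 11.98
```

The result is a peak rate: it is reached only while both pipelines deliver a result on every cycle.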

Architecture

The processor was designed around the concept of multiple parallel processing units operating in synchronization. A single 64-bit instruction word was divided into fields, each of which instructed a particular module under the control of the CPU. The modules were as follows:
* 16-bit arithmetic logic unit (ALU)
* 38-bit floating-point adder (FADD) (two stages)
* 38-bit floating-point multiplier (FMUL) (three stages)
* Two data pad registers for receiving data from memory

The processor had access to dual-interleaved core memory in which odd-numbered addresses were stored in one physical bank and even-numbered addresses in the other. This arrangement took advantage of the typical pattern of fetching memory words sequentially. Each fetch took two instruction cycles before the data was loaded into the destination data pad, so consecutive fetches from a single physical bank would have had to wait for one another. Interleaving allowed a sequential access to begin immediately after the previous one. Each access still took two cycles to complete, but the overlap and the dual destination pads maximized the use of the data channel.
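The effect of interleaving can be illustrated with a small timing model in Python. This is a sketch under stated assumptions rather than a description of the actual memory controller: it assumes that at most one fetch is issued per cycle and that a bank cannot start a new access until its previous one has finished. The function names are hypothetical.

```python
ACCESS_CYCLES = 2  # cycles per memory access, as stated in the text


def bank_of(address: int) -> int:
    """Even addresses live in bank 0, odd addresses in bank 1."""
    return address & 1


def fetch_completion_cycles(addresses, access_cycles=ACCESS_CYCLES):
    """Return the cycle at which each fetch in the sequence completes.

    Model assumptions (not from the source): one fetch may be issued per
    cycle, and a busy bank delays any new access directed at it.
    """
    bank_free_at = [0, 0]   # first cycle at which each bank is idle again
    next_issue = 0          # earliest cycle at which the next fetch may issue
    completed = []
    for address in addresses:
        bank = bank_of(address)
        start = max(next_issue, bank_free_at[bank])
        finish = start + access_cycles
        bank_free_at[bank] = finish
        next_issue = start + 1
        completed.append(finish)
    return completed


# Sequential addresses alternate between banks, so the accesses overlap.
print(fetch_completion_cycles([0, 1, 2, 3]))   # [2, 3, 4, 5]
# Addresses that all fall in one bank must wait for each other.
print(fetch_completion_cycles([0, 2, 4, 6]))   # [2, 4, 6, 8]
```

In this model a sequential stream delivers one word per cycle once the first access has completed, while a stream confined to one bank delivers one word every two cycles.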

The floating-point arithmetic modules were both multi-stage units driven by explicit instructions. In the two-stage adder, an assembler instruction such as FADD DX,DY would load values from data pads DX and DY into stage one of the adder. A subsequent FADD instruction was required to present the result at the adder's output. This second FADD could be a dummy with no arguments, or it could be the next calculation in a sequence. In this fashion a stream of FADD operations could be performed in a pipeline, delivering a new result in every instruction cycle even though each addition took two cycles.

Similarly, the three-stage multiplier required one FMUL DX,DY to begin a multiplication, followed by two more FMUL instructions to produce the result. Careful programming of the pipeline allowed the production of one result per cycle, even though each multiplication took three cycles in itself.
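The way explicit instructions push values through the adder and multiplier can be modeled with a short Python sketch. The class and method names are hypothetical, and Python floats stand in for the 38-bit hardware format; only the stage counts and the "each instruction advances the pipeline" behavior are taken from the text.

```python
from collections import deque
import operator


class ArithmeticPipeline:
    """Toy model of a multi-stage unit advanced by explicit instructions."""

    def __init__(self, stages, op):
        self.op = op
        # slots[0] is stage one, slots[-1] is the output of the unit.
        self.slots = deque([None] * stages, maxlen=stages)

    def issue(self, x=None, y=None):
        """Advance every stage by one step and return the current output.

        Calling issue() with no arguments models a dummy instruction that
        only pushes earlier operations along.
        """
        value = None if x is None else self.op(x, y)
        self.slots.appendleft(value)  # maxlen drops the old output value
        return self.slots[-1]


fadd = ArithmeticPipeline(2, operator.add)   # two-stage adder
print([fadd.issue(1.0, 2.0), fadd.issue(3.0, 4.0),
       fadd.issue(5.0, 6.0), fadd.issue()])
# [None, 3.0, 7.0, 11.0]

fmul = ArithmeticPipeline(3, operator.mul)   # three-stage multiplier
print([fmul.issue(2.0, 3.0), fmul.issue(), fmul.issue()])
# [None, None, 6.0]
```

In the adder example, three additions need only one trailing dummy instruction, so the pipeline produces a result on every cycle once it is full.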

For maximum efficiency, all calculations were programmed in the assembly language supplied with the hardware. A high-level language resembling Fortran was provided for coordinating tasks and controlling data transfers to and from the host computer.

Lookup tables

To support typical applications in signal processing, the hardware was delivered with a pre-calculated lookup table of sine and cosine values. Sines and cosines for angles from 0 to π/2 radians were stored at alternating addresses to take advantage of the interleaving described above. Values for all other angles could be obtained from one or the other of the table values, negated if necessary, using the standard quadrant symmetry rules.
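The quadrant rules can be sketched in Python as follows. The table size, the interleaved layout with sines at even indices, and all names are assumptions made for illustration; the source does not give the actual table dimensions or address layout.

```python
import math

N = 256                        # hypothetical steps per quarter turn
STEP = (math.pi / 2) / N       # angle covered by one table step

# Interleaved first-quadrant table: sine at even index, cosine at odd index.
TABLE = []
for k in range(N + 1):
    TABLE.append(math.sin(k * STEP))
    TABLE.append(math.cos(k * STEP))


def sin_cos(step_index: int):
    """Sine and cosine of step_index * STEP using only first-quadrant data."""
    quadrant, k = divmod(step_index % (4 * N), N)
    s, c = TABLE[2 * k], TABLE[2 * k + 1]
    if quadrant == 0:
        return s, c
    if quadrant == 1:          # sin(x + pi/2) = cos x, cos(x + pi/2) = -sin x
        return c, -s
    if quadrant == 2:          # sin(x + pi) = -sin x, cos(x + pi) = -cos x
        return -s, -c
    return -c, s               # sin(x + 3pi/2) = -cos x, cos(x + 3pi/2) = sin x


angle_index = 3 * N + 10       # an angle in the fourth quadrant
print(sin_cos(angle_index))
print(math.sin(angle_index * STEP), math.cos(angle_index * STEP))
```

Each lookup touches one sine and one cosine entry at adjacent addresses, which in an odd/even interleaved memory would fall in different banks.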

Typical programming style

The typical programming style was unusual, being dictated by the synchronous parallel processing architecture. The basic philosophy can be summarized as follows:
* Lay out the shortest sequence of instructions for performing one instance of the desired calculation, allowing for two-cycle memory latency, and the driving of the floating-point modules with explicit FADD and FMUL instructions.
* Inspect the sequence to determine the minimum number of instructions forming a loop which will perform the calculation repetitively. This requires attention to resource conflicts. For instance, the data bus for moving results around can move only one data word per cycle. Likewise the ALU, used mostly for counting loops and memory addressing, can be used for only one purpose per cycle. This step is typically a matter of trial and error.
* Conceptually "wrap" the full sequence of instructions around the loop, using FADD and FMUL instructions to drive calculations through the pipelines.
* Before the loop begins, add parallel process initiations as required.

The final item was accomplished as follows: assume that the entire calculation requires 15 cycles, and the minimum loop size is 5 cycles. The first 5 instruction words begin iteration 1 of the calculation. The second 5 words contain both the middle of iteration 1 and, in parallel, the beginning of iteration 2, which would usually be a copy of the operations that began iteration 1. The next 5 words contain the final steps of iteration 1, the middle of iteration 2, and the beginning of iteration 3. These last five words form the body of the loop, which repeats until the desired number of data points has been processed.
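The overlap described above can be laid out programmatically. The following Python sketch lists which segment of which iteration occupies each 5-word block; the function name and the tuple representation are hypothetical, and the draining blocks at the end are an assumption of this model, since the text describes only the start-up and the loop body.

```python
def overlap_schedule(total_cycles, loop_cycles, iterations):
    """List, per block of loop_cycles words, the (iteration, segment) pairs
    that execute in parallel. Segments are numbered from 0.
    """
    if total_cycles % loop_cycles:
        raise ValueError("total_cycles must be a multiple of loop_cycles")
    segments = total_cycles // loop_cycles
    blocks = []
    for block in range(iterations + segments - 1):
        active = [(it + 1, block - it)
                  for it in range(iterations)
                  if 0 <= block - it < segments]
        blocks.append(active)
    return blocks


# The example from the text: a 15-cycle calculation with a 5-cycle loop.
for number, active in enumerate(overlap_schedule(15, 5, iterations=4)):
    print(number, active)
# 0 [(1, 0)]
# 1 [(1, 1), (2, 0)]
# 2 [(1, 2), (2, 1), (3, 0)]
# 3 [(2, 2), (3, 1), (4, 0)]
# 4 [(3, 2), (4, 1)]
# 5 [(4, 2)]
```

Blocks 2 and 3 have the same shape, with the end of one iteration, the middle of the next, and the start of a third, which is why a single 5-word loop body can be repeated. In this model four iterations finish in 6 blocks of 5 cycles (30 cycles) rather than the 60 cycles needed without overlap.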

References

* Page 206 ff, "Parallel Computers Two: Architecture, Programming and Algorithms", by Roger W. Hockney, C. R. Jesshope. CRC Press 1988 ISBN 0852748116


Wikimedia Foundation. 2010.
