CDC STAR-100
The STAR-100 was a supercomputer from Control Data Corporation (CDC), one of the first machines to use a vector processor for improved math performance. The name STAR was a construct of the words "STrings" and "ARrays". The 100 came from "100 million floating point operations per second" (MFLOPS), the speed at which the machine was designed to operate. The computer was announced very early in the 1970s and was supposed to be several times faster than the then-reigning world's fastest supercomputer, the CDC 7600, which performed at about 36 MFLOPS. On August 17, 1971, Control Data announced that General Motors had placed the first commercial order for a STAR-100. Unfortunately, a number of basic design features of the machine meant that its "real world" performance was much lower than expected when it was first used commercially in 1974; this shortfall was one of the primary reasons CDC was pushed from its former dominance in the supercomputer market when the Cray-1 was announced a few years later.
In general organization, the STAR was similar to CDC's earlier supercomputers, in which a simple RISC-like CPU was supported by a number of peripheral processors that offloaded housekeeping tasks and allowed the CPU to crunch numbers as quickly as possible. In the STAR, however, both the CPU and the peripheral processors were deliberately simplified to lower the cost and complexity of implementation. The STAR also differed from the earlier designs in being based on a 64-bit architecture instead of a 60-bit one, a side effect of the increasing use of 8-bit ASCII processing. Also unlike previous machines, the STAR made heavy use of microcode and supported a virtual memory capability.
The main change in the STAR, however, was the inclusion of special instructions for vector processing. These new and more complex instructions approximated what was available to users of the APL programming language and operated on huge vectors of operands stored in consecutive locations in memory. The key to streaming these huge vectors of operands was the design of the memory: the physical memory used words that were 512 data bits wide (called SWORDs, for superwords) and was divided into 32 independent banks. The CPU used these instructions to set up additional hardware that fed data in from the main memory as quickly as possible. For instance, a program could use a single instruction with a few parameters to add all the numbers in one 400-value array to another (each array could contain up to 65,535 operands). The CPU only had to decode a single instruction, set up the memory hardware, and start feeding the data into the math units. As with instruction pipelines in general, the performance of any one operation was no better than before, but because the CPU was effectively working on many data points at once, the overall performance improved dramatically due to the assembly-line nature of the task.

The STAR vector operations, being memory-to-memory operations, had a relatively long startup time to fill the pipeline; in contrast to the register-based pipelined functional units in the 7600, the STAR pipelines were much deeper. The problem was compounded by the fact that the STAR had a slower cycle time than the 7600 (50 ns versus 27.5 ns). As a result, the vector length at which the STAR began to run faster than the 7600 was about 50 data points; if a loop worked on data sets smaller than that, the cost of setting up the vector pipeline exceeded the savings gained in return.
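To make the trade-off concrete, the following sketch (plain C, not actual STAR-100 code) mimics what a single memory-to-memory vector add accomplished and models the startup-cost trade-off. The startup and per-element costs are assumptions chosen only to reproduce the roughly 50-element break-even point described above, not measured hardware figures.

```c
/*
 * Illustrative sketch only -- not actual STAR-100 code or instruction usage.
 * Part one mimics a single memory-to-memory vector instruction (adding two
 * arrays stored consecutively in memory); part two models the pipeline
 * startup trade-off with assumed costs, not measured hardware figures.
 */
#include <stdio.h>

#define N 400  /* length of the example arrays from the text */

/* Conceptual equivalent of one STAR vector add: c[i] = a[i] + b[i] for all i,
 * with all operands held in consecutive memory locations. */
static void vector_add(const double *a, const double *b, double *c, int n)
{
    for (int i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}

int main(void)
{
    double a[N], b[N], c[N];
    for (int i = 0; i < N; i++) {
        a[i] = i;
        b[i] = 2.0 * i;
    }
    vector_add(a, b, c, N);               /* one "instruction", 400 results */
    printf("c[%d] = %g\n", N - 1, c[N - 1]);

    /* Crude timing model: the STAR pays a fixed pipeline-fill cost but then
     * streams one result per 50 ns cycle; the 7600 pays no startup but needs
     * several of its 27.5 ns cycles per element. The startup and per-element
     * cycle counts below are assumed for illustration. */
    const double star_cycle = 50.0, cdc7600_cycle = 27.5;    /* ns, from text */
    const double startup_cycles = 115.0;                     /* assumed */
    const double scalar_cycles_per_elem = 6.0;               /* assumed */
    for (int n = 10; n <= 100; n += 10) {
        double t_star = (startup_cycles + n) * star_cycle;
        double t_7600 = n * scalar_cycles_per_elem * cdc7600_cycle;
        printf("n = %3d   STAR %6.0f ns   7600 %6.0f ns\n", n, t_star, t_7600);
    }
    return 0;
}
```

With these assumed constants the two times are equal at n = 50; for shorter vectors the fixed pipeline-fill cost dominates and the 7600 wins, while for longer vectors the STAR's one-result-per-cycle streaming pulls ahead.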
When the machine was released in 1974, it quickly became apparent that its general performance was nowhere near what people expected. Very few programs could be effectively vectorized into a series of single instructions; nearly all calculations rely on the results of some earlier instruction, yet those results had to clear the pipelines before they could be fed back in. This forced most programs to pay the high setup cost of the vector units, and the programs that did "work" on the STAR were generally extreme examples. Making matters worse, basic scalar performance had been sacrificed in order to improve vector performance, so whenever a program had to run ordinary scalar instructions, the overall performance of the machine dropped dramatically. (See Amdahl's law.)
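Amdahl's law puts numbers on this effect. The sketch below uses purely illustrative figures (not measured STAR-100 data): a vector unit assumed to be 10 times faster, scalar code assumed to run at half the reference speed, and a varying vectorizable fraction f.

```c
/* Amdahl's-law sketch with assumed, illustrative figures (not STAR-100
 * measurements): f is the fraction of work that vectorizes, s_vec the speedup
 * on that fraction, and s_scalar the relative speed of the remaining scalar
 * work (values below 1.0 mean scalar code runs slower than on the reference
 * machine, as was the case on the STAR). */
#include <stdio.h>

static double overall_speedup(double f, double s_vec, double s_scalar)
{
    return 1.0 / ((1.0 - f) / s_scalar + f / s_vec);
}

int main(void)
{
    const double fractions[] = { 0.3, 0.6, 0.9, 0.99 };
    for (int i = 0; i < 4; i++)
        printf("f = %.2f  ->  overall speedup %.2fx\n",
               fractions[i], overall_speedup(fractions[i], 10.0, 0.5));
    return 0;
}
```

With these assumptions, a program that vectorizes only 30% of its work runs at about 0.7 times the reference speed, 60% vectorization gives only a marginal gain, and large speedups appear only when nearly everything vectorizes.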
Two STAR-100 systems were eventually delivered to the Lawrence Livermore National Laboratory. In preparation for the STAR deliveries, LLNL programmers developed a library of subroutines, called "STACKLIB", on the 7600 to emulate the vector operations of the STAR. In the process of developing STACKLIB, it was noticed that applications using the library ran even faster on the 7600 than they had before, which drew further attention to the STAR's performance problems.

The STAR-100 was a disappointment to everyone involved, and Jim Thornton, the chief designer, left CDC to form Network Systems Corporation. An updated version was later released as the CDC Cyber 203, followed by the greatly improved Cyber 205, but by this point the Cray-1 was on the market with considerably higher performance. The failure of the STAR pushed CDC from its former dominance in the supercomputer market, something the company tried to address with the formation of ETA Systems in 1982.