- Byte
A byte (pronounced "bite", IPA: /baɪt/) is the basic unit of measurement of information storage in computer science. In many computer architectures it is a unit of memory addressing, most often consisting of eight bits. A byte is one of the basic integral data types in some programming languages, especially system programming languages.

A byte is an ordered collection of bits, with each bit denoting a single binary value of 1 or 0. The size of a byte can vary and is generally determined by the underlying computer operating system or hardware, although the 8-bit byte is the standard in modern systems. Historically, byte size was determined by the number of bits required to represent a single character from a Western character set. Its size was generally determined by the number of possible characters in the supported character set and was chosen to be a divisor of the computer's word size. Historically, bytes have ranged from five to twelve bits.

The popularity of IBM's System/360 architecture starting in the 1960s and the explosion of microcomputers based on 8-bit microprocessors in the 1980s have made eight bits by far the most common size for a byte. The term octet is widely used as a more precise synonym where ambiguity is undesirable (for example, in protocol definitions).

There has been considerable confusion about the meanings of metric (or SI) prefixes used with the word "byte", especially concerning prefixes such as kilo- (k or K) and mega- (M) as shown in the chart "Prefixes for bit and byte". Since computer memory comes in powers of two rather than ten, a large portion of the software and computer industry uses binary estimates of the SI-prefixed quantities, while producers of computer storage devices prefer the SI values. This is why a computer hard drive advertised with a "100 GB" decimal storage capacity actually contains no more than 93 GB of 8-bit addressable storage when the prefix is read in its binary (power-of-two) sense. Because of the confusion, a contract specifying a quantity of bytes must define what the prefixes mean in terms of the contract (i.e., the alternative binary equivalents, the actual decimal values, or a binary estimate based on the actual values).

To make the meaning of the table absolutely clear: a kibibyte (KiB) is made up of 1,024 bytes. A mebibyte (MiB) is made up of 1,024 × 1,024, i.e. 1,048,576 bytes. The figures in the column using 1,024 raised to powers of 1, 2, 3, 4 and so on are in units of bytes.

Meanings
The word "byte" has two closely related meanings:
# A contiguous sequence of a fixed number of bits (binary digits). The use of a byte to mean 8 bits has become nearly ubiquitous.
# A contiguous sequence of bits within a binary computer that comprises the "smallest addressable sub-field" of the computer's natural word size. That is, the smallest unit of binary data on which meaningful computation, or natural data boundaries, could be applied. For example, the CDC 6000 series scientific mainframes divided their 60-bit floating-point words into 10 six-bit bytes. These bytes conveniently held Hollerith data from punched cards, typically the upper-case alphabet and decimal digits. CDC also often referred to 12-bit quantities as bytes, each holding two 6-bit display code characters, due to the 12-bit I/O architecture of the machine. The PDP-10 used the assembly instructions LDB and DPB to extract and deposit bytes; these operations survive today in Common Lisp. Bytes of six, seven, or nine bits were used on some computers, for example within the 36-bit word of the PDP-10. The UNIVAC 1100/2200 series computers (now Unisys) addressed in both 6-bit (Fieldata) and 9-bit (ASCII) modes within their 36-bit word.

History
The term byte was coined by Dr. Werner Buchholz in July 1956, during the early design phase for the IBM Stretch computer. [[http://www.trailing-edge.com/~bobbemer/BYTE.HTM Origins of the Term "BYTE"] Bob Bemer, accessed 2007-08-12] [[http://archive.computerhistory.org/resources/text/IBM/Stretch/102636400.txt TIMELINE OF THE IBM STRETCH/HARVEST ERA (1956-1961)] computerhistory.org, '1956 July ... Werner Buchholz ... Werner's term "Byte" first popularized'] [[http://catb.org/~esr/jargon/html/B/byte.html byte] catb.org, 'coined by Werner Buchholz in 1956'] Originally it was defined in instructions by a 4-bit byte-size field, allowing from one to sixteen bits (the production design reduced this to a 3-bit byte-size field, allowing from one to eight bits to be represented by a byte); typical I/O equipment of the period used six-bit bytes. A fixed eight-bit byte size was later adopted and promulgated as a standard by the System/360.

The term "byte" comes from "bite," as in the smallest amount of data a computer could "bite" at once. The spelling change not only reduced the chance of a "bite" being mistaken for a "bit," but also was consistent with the penchant of early computer scientists to make up words and change spellings. A byte was also often referred to as "an 8-bit byte", reinforcing the notion that it was a tuple of "n" bits, and that other sizes were possible.

# A contiguous sequence of bits in a serial data stream, such as in modem or satellite communications, or from a disk-drive head, which is the smallest meaningful unit of data. These bytes might include start bits, stop bits, or parity bits, and thus could vary from 7 to 12 bits to contain a single 7-bit ASCII code.
# A datatype, or synonym for a datatype, in certain programming languages. C and C++, for example, define "byte" as an "addressable unit of data storage large enough to hold any member of the basic character set of the execution environment" (clause 3.6 of the C standard). Since the C char integral data type must contain at least 8 bits (clause 5.2.4.2.1), a byte in C is at least capable of holding 256 different values (whether char is signed or unsigned does not matter). Various implementations of C and C++ define a "byte" as 8, 9, 16, 32, or 36 bits [[http://www.parashift.com/c++-faq-lite/intrinsic-types.html#faq-26.4 "[26] Built-in / intrinsic / primitive data types", C++ FAQ Lite]] [[http://home.att.net/~jackklein/c/inttypes.html#char Integer Types In C and C++]]. The actual number of bits in a particular implementation is documented as CHAR_BIT, as defined in the limits.h file. Java's primitive byte data type is always defined as consisting of 8 bits and being a signed data type, holding values from −128 to 127.

Early microprocessors, such as the Intel 8008 (the direct predecessor of the 8080, and then the 8086), could perform a small number of operations on four bits, such as the DAA (decimal adjust) instruction and the "half carry" flag, which were used to implement decimal arithmetic routines. These four-bit quantities were called "nybbles," in homage to the then-common 8-bit "bytes."

Alternative words
Following "bit," "byte," and "nybble," there have been some analogical attempts to construct unambiguous terms for bit blocks of other sizes. [[http://dictionary.reference.com/browse/nybble nybble] reference.com, sourced from Jargon File 4.2.0, accessed 2007-08-12] All of these are strictly jargon, and not very common.
* 2 bits: crumb, quad, quarter, tayste, tydbit
* 4 bits: nibble, nybble
* 5 bits: nickel, nyckle
* 10 bits: deckle
* 16 bits: plate, playte, chomp, chawmp (on a 32-bit machine)
* 18 bits: chomp, chawmp (on a 36-bit machine)
* 32 bits: dinner, dynner, gawble (on a 32-bit machine)
* 48 bits: gobble, gawble (under circumstances that remain obscure)

Abbreviation/Symbol
IEEE 1541 and [http://swiss.csail.mit.edu/~jaffer/MIXF Metric-Interchange-Format] specify "B" as the symbol for byte (e.g. MB means megabyte), while IEC 60027 seems silent on the subject. Furthermore, B means bel (see decibel), another (logarithmic) unit used in the same field. The use of B to stand for bel is consistent with the metric system convention that capitalized symbols are for units named after a person (in this case Alexander Graham Bell); usage of a capital B to stand for byte is not consistent with this convention. There is little danger of confusing a byte with a bel because the bel's sub-multiple, the decibel (dB), is usually preferred, while use of the decibyte (dB) is extremely rare.

The unit symbol "kb" with a lowercase "b" is a commonly used abbreviation for "kilobyte". Use of this abbreviation leads to confusion with the alternative use of "kb" to mean "kilobit". IEEE 1541 specifies "b" as the symbol for bit; however, IEC 60027 and Metric-Interchange-Format specify "bit" (e.g. Mbit for megabit) for the symbol, achieving maximum disambiguation from byte.

French-speaking countries sometimes use an uppercase "o" for "octet". This is not consistent with SI because of the risk of confusion with the zero, and because of the convention that capitals are reserved for unit names derived from proper names, such as the ampere (whose symbol is A) and joule (symbol J), versus the second (symbol s) and metre (symbol m).

Lowercase "o" for "octet" is a commonly used symbol in several non-English-speaking countries, and is also used with metric prefixes (for example, "ko" and "Mo").
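
The practical effect of reading a prefix in its decimal (SI) or binary sense, using the "B" symbol for byte and the KiB/MiB style of binary unit mentioned earlier in the article, can be illustrated with a short Python sketch. The helper names to_bytes and in_binary_units and the two prefix tables are assumptions made for this illustration only; they are not part of any standard or library.

```python
# Illustrative sketch: decimal (SI) versus binary prefix values for bytes.
# The names below are hypothetical helpers, not a standard API.

SI_PREFIXES = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}
BINARY_PREFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def to_bytes(value, prefix):
    """Convert a prefixed quantity, e.g. (100, "G"), to a plain byte count."""
    factor = {**SI_PREFIXES, **BINARY_PREFIXES}[prefix]
    return value * factor


def in_binary_units(byte_count, prefix):
    """Express a byte count in binary units such as KiB, MiB or GiB."""
    return byte_count / BINARY_PREFIXES[prefix]


# A drive advertised as "100 GB", with the prefix read in its decimal sense:
advertised = to_bytes(100, "G")              # 100 * 10**9 bytes
as_gib = in_binary_units(advertised, "Gi")   # the same capacity in units of 2**30 bytes

print(f"{advertised} B = {as_gib:.2f} GiB")  # prints "100000000000 B = 93.13 GiB"
```

Keeping the two prefix tables separate makes the chosen interpretation explicit at every call, which is the kind of disambiguation a contract or specification quoting byte quantities would need.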
Today the harmonized ISO/IEC standard cancels and replaces subclauses 3.8 and 3.9 of IEC 60027-2:2005 (those related to Information theory and Prefixes for binary multiples).

See the "Byte" section of Units of information for a detailed discussion of names for derived units.

See also
* Bit
* Word (computing)
Wikimedia Foundation. 2010.