Disk formatting

Formatting a hard drive using MS-DOS

Disk formatting is the process of preparing a hard disk drive or flexible disk medium for data storage. In some cases, the formatting operation may also create one or more new file systems. The formatting process that performs basic medium preparation is often referred to as "low-level formatting." The term "high-level formatting" most often refers to the process of generating a new file system. In certain operating systems (e.g., Microsoft Windows), the two processes are combined and the term "format" is understood to mean an operation in which a new disk medium is fully prepared to store files. The accompanying illustration shows the prompts and diagnostics printed by MS-DOS's FORMAT.COM utility as a hard drive is being formatted.

As a general rule, formatting a disk is "destructive," in that existing data (if any) is lost during the process.

History

A "block", a contiguous number of bytes, is the unit of storage that is read from and written to a disk by a disk driver. The earliest disk drives had fixed block sizes (e.g., the block size of the IBM 350 disk storage unit of the late 1950s was 100 6-bit characters), but starting with the 1301,[1] IBM marketed subsystems that featured variable block sizes: a particular track could have blocks of different sizes. The disk subsystems on the IBM System/360 expanded this concept in the form of Count Key Data (CKD) and later Extended Count Key Data (ECKD). However, variable block sizes in HDDs fell out of use in the 1990s; one of the last HDDs to support variable block size was the IBM 3390 Model 9, announced in May 1993.[2]

Modern hard disk drives, such as Serial Attached SCSI (SAS)[3] and Serial ATA (SATA)[4] drives, appear at their interfaces as a contiguous set of fixed-size blocks. These blocks are typically 512 bytes long, but the industry is in the process of changing to 4,096-byte logical blocks.[5]
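The fixed-block view described above can be illustrated with a short Python sketch. It assumes a simple linear mapping from logical block number to byte offset; the function names are illustrative and are not part of any drive interface.

```python
def lba_to_byte_offset(lba, block_size=512):
    """Return the byte offset at which logical block `lba` starts."""
    if lba < 0 or block_size <= 0:
        raise ValueError("lba must be non-negative and block_size positive")
    return lba * block_size


def blocks_needed(byte_count, block_size=512):
    """Return how many whole blocks are needed to hold `byte_count` bytes."""
    return -(-byte_count // block_size)  # ceiling division


# 1 MiB of data occupies 2,048 blocks of 512 bytes or 256 blocks of 4,096 bytes.
print(blocks_needed(1024 * 1024, 512), blocks_needed(1024 * 1024, 4096))
```

Under this mapping, block 8 of a 512-byte-block device starts at the same byte offset (4,096) as block 1 of a 4,096-byte-block device.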

Floppy disks generally used only fixed block sizes, but these sizes were a function of the host's OS and its interaction with its controller, so that a particular type of media (e.g., 5¼-inch DSDD) would have different block sizes depending upon the host OS and controller.

Optical disks generally only use fixed block sizes.

Disk formatting process

Formatting a disk for use by an operating system and its applications involves three different steps.

  1. Low-level formatting (i.e., closest to the hardware) marks the surfaces of the disks with markers indicating the start of a recording block (typically today called sector markers) and other information to be used later, in normal operations, by the disk controller to read or write data. This is intended to be the permanent foundation of the disk, and is often completed at the factory.
  2. Partitioning creates data structures needed by the operating system. This level of formatting often includes checking for defective tracks or defective sectors.
  3. High-level formatting creates the file system format within the structure created by the intermediate level of formatting (partitioning). This formatting includes the data structures used by the OS to identify the contents of the logical drive or partition. This may occur during operating system installation, or when adding a new disk. Disk and distributed file systems may specify an optional boot block, and/or various volume and directory information for the operating system.

Low-level formatting of floppy disks

The low-level format of floppy disks (and early hard disks) is performed by the disk drive's controller.

Consider a standard 1.44 MB floppy disk. Low-level formatting of the floppy disk normally writes 18 sectors of 512 bytes to each of 160 tracks (80 on each side) of the floppy disk, providing 1,474,560 bytes of storage on the disk.
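The arithmetic behind this figure can be checked with a minimal Python sketch. The function name and parameter names are illustrative; the geometry values are those given in the text.

```python
def formatted_capacity(sectors_per_track, bytes_per_sector, tracks_per_side, sides):
    """Return the user-data capacity in bytes for a given disk geometry."""
    return sectors_per_track * bytes_per_sector * tracks_per_side * sides


capacity = formatted_capacity(
    sectors_per_track=18, bytes_per_sector=512, tracks_per_side=80, sides=2
)
print(capacity)          # 1474560
print(capacity / 1024)   # 1440.0
```

The second figure shows where the "1.44 MB" label comes from: the capacity is 1,440 units of 1,024 bytes, quoted as if 1,000 such units made a megabyte.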

Physical sectors are actually larger than 512 bytes, as in addition to the 512 byte data field they include a sector identifier field, CRC bytes (in some cases error correction bytes) and gaps between the fields. These additional bytes are not normally included in the quoted figure for overall storage capacity of the disk.

Different low-level formats can be used on the same media; for example, large records can be used to cut down on inter-record gap size.

Several freeware, shareware and free software programs (e.g. GParted, FDFORMAT, NFORMAT and 2M) allowed considerably more control over formatting, making it possible to format high-density 3.5" disks with a capacity of up to 2 MB.

Techniques used include:

  • head/track sector skew (moving the sector numbering forward at side change and track stepping to reduce mechanical delay),
  • interleaving sectors (to minimize the sector gap and thereby allow the number of sectors per track to be increased),
  • increasing the number of sectors per track (while a normal 1.44 MB format uses 18 sectors per track, it is possible to increase this to a maximum of 21), and
  • increasing the number of tracks (most drives could tolerate extension to 82 tracks – though some could handle more, others could jam).
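Two of the techniques listed above, interleaving and skew, can be sketched in Python as follows. This is a simplified model, not the algorithm of any particular formatting program: it assumes logical sectors numbered from 1 and a track treated as a ring of equally sized slots, and the function names are illustrative.

```python
def interleaved_layout(sectors_per_track, interleave):
    """Physical order of logical sector numbers for a given interleave factor."""
    layout = [None] * sectors_per_track
    pos = 0
    for logical in range(1, sectors_per_track + 1):
        while layout[pos] is not None:      # slot taken: slide to the next free slot
            pos = (pos + 1) % sectors_per_track
        layout[pos] = logical
        pos = (pos + interleave) % sectors_per_track
    return layout


def skewed_layout(sectors_per_track, skew, track_index):
    """Sequential layout rotated by `skew` slots for every track stepped."""
    offset = (skew * track_index) % sectors_per_track
    return [((slot - offset) % sectors_per_track) + 1
            for slot in range(sectors_per_track)]


print(interleaved_layout(9, 2))   # [1, 6, 2, 7, 3, 8, 4, 9, 5]
print(skewed_layout(9, 3, 1))     # [7, 8, 9, 1, 2, 3, 4, 5, 6]

# Capacity with 21 sectors per track and 82 tracks per side, both sides used.
print(21 * 512 * 82 * 2)          # 1763328
```

With a skew of 3, sector 1 of the second track sits three slots later than on the first track, which gives the head time to step before the next sector in sequence arrives.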

Linux supports a variety of sector sizes, and DOS and Windows support the large-record-size DMF floppy format.

Low-level formatting (LLF) of hard disks

Low-level format of a 10-megabyte IBM PC XT hard drive.

Hard disk drives prior to the 1990s typically had a separate disk controller that defined how data was encoded on the media. Because the media, the drive and the controller might each be procured from a different vendor, low-level formatting was often left to the user. Separate procurement also carried the risk of incompatibility between the components, such that the subsystem would not reliably store data.[6]

User-initiated low-level formatting (LLF) of hard disk drives was common for minicomputer and personal computer systems until the 1990s. IBM and other mainframe system vendors typically supplied their hard disk drives (or media in the case of removable media HDDs) with a low-level format. Typically this involved subdividing each track on the disk into one or more blocks which would contain the user data and associated control information. Different computers used different block sizes, and IBM notably used variable block sizes, but the popularity of the IBM PC caused the industry to adopt a standard of 512 user data bytes per block by the mid-1980s.

Depending upon the system, low-level formatting was generally done by an operating system utility. IBM-compatible PCs used the BIOS, which typically involved using the MS-DOS debug program to transfer control to a routine hidden at different addresses in different BIOSes.[7] The low-level format function may also be called "erase" or "wipe" in different tools. For best results, tools created by the hard disk's manufacturer are recommended.

Transition away from LLF

Starting in the late 1980s, driven by the volume of IBM-compatible PCs, HDDs became routinely available pre-formatted with a compatible low-level format. At the same time, the industry moved from historical (dumb) bit serial interfaces to modern (intelligent) bit serial interfaces and word serial interfaces, wherein the low-level format was performed at the factory.

Today an end user should, in most cases, never perform a low-level format of an IDE or ATA hard drive, and in fact it is often not possible to do so on modern hard drives outside of the factory.[8][9]

Disk reinitialization

While it is generally impossible to perform a complete LLF on most modern hard drives (since the mid-1990s) outside the factory,[10] the term "low-level format" is still used for what could be called the reinitialization of a hard drive to its factory configuration (and even these terms may be misunderstood). Reinitialization should include identifying (and sparing out if possible) any sectors which cannot be correctly written to and read back from the drive. The term has, however, been used by some to refer to only a portion of that process, in which every sector of the drive is written to, usually by writing a zero byte to every addressable location on the disk; this is sometimes called zero-filling.

The present ambiguity in the term low-level format seems to be due both to inconsistent documentation on web sites and to the belief by many users that any process below a high-level (file system) format must be called a low-level format. Since much of the low-level formatting process can today only be performed at the factory, various drive manufacturers describe reinitialization software as LLF utilities on their web sites. Since users generally have no way to determine the difference between a complete LLF and reinitialization (they simply observe that running the software results in a hard disk that must be high-level formatted), both misinformed users and mixed signals from various drive manufacturers have perpetuated this error. Whatever misuse of such terms may exist (search hard drive manufacturers' web sites for all these terms), many sites do make such reinitialization utilities available (possibly as bootable floppy diskette or CD image files), both to overwrite every byte and to check for damaged sectors on the hard disk.

One popular method for performing only the zero-fill operation on a hard disk is by writing zero-value bytes to the drive using the Unix dd utility with the /dev/zero stream as the input file and the drive itself or a specific partition as the output file.
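The zero-fill operation itself is simple enough to sketch in Python. The following is an illustrative equivalent of the approach described above, not a replacement for dd: the function name zero_fill and its chunk_size parameter are assumptions made for this example, and it is demonstrated on a temporary file standing in for a disk image. Pointing it at a real device path would destroy the data on that device.

```python
import os
import tempfile


def zero_fill(path, chunk_size=1024 * 1024):
    """Overwrite every byte of an existing file or block device with zeros.

    Returns the number of bytes written.
    """
    zeros = bytes(chunk_size)
    with open(path, "r+b") as target:
        total = target.seek(0, os.SEEK_END)   # size of the file or device
        target.seek(0)
        remaining = total
        while remaining > 0:
            count = min(chunk_size, remaining)
            target.write(zeros[:count])
            remaining -= count
        target.flush()
        os.fsync(target.fileno())             # push the zeros out to the medium
    return total


if __name__ == "__main__":
    # Demonstration on a small temporary file rather than a real drive.
    fd, image_path = tempfile.mkstemp()
    os.write(fd, b"old contents" * 100)
    os.close(fd)
    print(zero_fill(image_path), "bytes overwritten")
    os.remove(image_path)
```

As with dd, this only writes zeros; it does not identify or spare out defective sectors, so it covers only the zero-fill portion of reinitialization.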

Another method for SCSI disks may use the sg_format[11] command to issue a low level SCSI FORMAT UNIT command.

Partitioning

Partitioning is the process of writing information into blocks of a storage device or medium that allows access by an operating system. Some operating systems allow the device (or its medium) to appear as multiple devices; i.e. partitioned into multiple devices.

On PC and UNIX-based operating systems (such as BSD, GNU/Linux and Mac OS X) this is normally done with a partition editor, e.g., fdisk, LVM, parted. These operating systems support multiple partitions.

In current IBM mainframe OSs derived from S/360 OSs, this is done by the INIT command of the ICKDSF utility;[12] these legacy OSs support only a single partition per device, called a volume. The ICKDSF functions include creating a volume label and writing a Record 0 on every track.

Floppy disks are not partitioned; however depending upon the OS they may require volume information in order to be accessed by the OS.

Partition editors and ICKDSF today do not handle low level functions for HDDs and optical disk drives such as writing timing marks, and they cannot reinitialize a modern disk that has been degaussed or otherwise lost the factory formatting.

High-level formatting

High-level formatting is the process of setting up an empty file system on the disk and, for PCs, installing a boot sector. This is a fast operation, and is sometimes referred to as quick formatting.

The entire logical drive or partition may optionally be scanned for defects, which may take considerable time.

In the case of floppy disks, both high- and low-level formatting are customarily performed in one pass by the disk formatting software. In recent years, most floppies have shipped pre-formatted from the factory as DOS FAT12 floppies.

In current IBM mainframe OSs derived from S/360 OSs, this may be done as part of allocating a file, by a utility specific to the file system or, in some older access methods, on the fly as new data are written.

Host protected area

The host protected area, sometimes referred to as the hidden protected area,[13] is an area of a hard drive that is high-level formatted so that the area is not normally visible to its operating system (OS).

Reformatting

Reformatting is a high-level formatting performed on a functioning disk drive to free the contents of its medium. Reformatting is specific to each operating system because what is actually done to existing data varies by OS. The most important aspect of the process is that it frees disk space for use by other data. To actually "erase" everything requires overwriting each block of data on the medium, something that is not done by many PC high-level formatting utilities.

Reformatting often carries the implication that the operating system and all other software will be reinstalled after the format is complete. Rather than fixing an installation suffering from malfunction or security compromise, it is sometimes judged easier to simply reformat everything and start from scratch. Various colloquialisms exist for this process, such as "wipe and reload", "nuke and pave", "reimage", etc.

Formatting

DOS, OS/2 and Windows

MS-DOS 6.22a FORMAT /U switch failing to overwrite content of partition.

Under MS-DOS, PC-DOS, OS/2 and Microsoft Windows, disk formatting can be performed by the format command. The format program usually asks for confirmation beforehand to prevent accidental removal of data, but some versions of DOS have an undocumented /AUTOTEST option; if used, the usual confirmation is skipped and the format begins right away. The WM/FormatC macro virus uses this command to format the C: drive as soon as a document is opened.

There is also the undocumented /U parameter that performs an unconditional format which under most circumstances overwrites the entire partition,[14] preventing the recovery of data through software. Note however that the /U switch only works reliably with floppy diskettes (technically because unless /Q is used, floppies are always low-level formatted in addition to high-level formatted). Under certain circumstances with hard drive partitions, however, the /U switch merely prevents the creation of unformat information in the partition to be formatted while otherwise leaving the partition's contents entirely intact (still on disk but marked deleted). In such cases, the user's data remain ripe for recovery with specialist tools such as EnCase or disk editors. Reliance upon /U for secure overwriting of hard drive partitions is therefore inadvisable, and purpose-built tools such as DBAN should be considered instead.

Under OS/2, if the /L parameter (which specifies a long format) is used, format will overwrite the entire partition or logical drive. Doing so enhances the ability of CHKDSK to recover files.

Unix-like operating systems

High-level formatting of disks on these systems is traditionally done using the mkfs command. On Linux (and potentially other systems as well) mkfs is typically a wrapper around filesystem-specific commands which have the name mkfs.fsname, where fsname is the name of the filesystem with which to format the disk.[15] Some filesystems which are not supported by certain implementations of mkfs have their own manipulation tools; for example Ntfsprogs provides a format utility for the NTFS filesystem.
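The mkfs.fsname naming convention can be illustrated with a small Python sketch. It only builds the helper's name and looks it up on the current PATH using the standard library; the function names are illustrative, and a real mkfs implementation may search a different set of directories. Nothing is formatted by this code.

```python
import shutil


def mkfs_helper_name(fsname):
    """Name of the filesystem-specific helper under the mkfs.fsname convention."""
    if not fsname or "/" in fsname:
        raise ValueError("fsname must be a plain filesystem name")
    return "mkfs." + fsname


def find_mkfs_helper(fsname):
    """Return the full path of the helper if it is on PATH, else None."""
    return shutil.which(mkfs_helper_name(fsname))


print(mkfs_helper_name("ext4"))   # mkfs.ext4
```

A return value of None from find_mkfs_helper means only that no helper of that name was found on PATH; the file system may still be supported through a separate tool.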

Some Unix and Unix-like operating systems have higher-level formatting tools, usually for the purpose of making disk formatting easier and/or allowing the user to partition the disk with the same tool. Examples include GNU Parted (and its various GUI frontends such as GParted and the KDE Partition Manager) and the Disk Utility application on Mac OS X.

Recovery of data from a formatted disk

As with file deletion by the operating system, not every[16] high-level format fully erases the data on a disk. Instead, the area on the disk containing the data is merely marked as available, and retains the old data until it is overwritten. If the disk is formatted with a different file system than the one which previously existed on the partition, some data may be overwritten that would not be if the same file system had been used. However, under some file systems (e.g., NTFS, but not FAT), the file indexes (such as $MFTs under NTFS, inodes under ext2/3, etc.) may not be written to the same exact locations. And if the partition size is increased, even FAT file systems will overwrite more data at the beginning of that new partition.

From the perspective of preventing the recovery of sensitive data through recovery tools, the data must either be completely overwritten (every sector) with random data before the format, or the format program itself must perform this overwriting, as the DOS FORMAT command did with floppy diskettes, filling every data sector with the byte value F6 in hex.
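The difference between a format that only resets file system structures and one that overwrites the data area can be shown with a toy model in Python. Everything here is hypothetical: the "disk" is a 64-byte bytearray, the first 16 bytes stand in for file system metadata, and the layout does not correspond to FAT or any real file system. Only the fill value F6 hex is taken from the text.

```python
FILL_BYTE = 0xF6       # fill value the text attributes to DOS FORMAT on floppies
METADATA_SIZE = 16     # hypothetical size of the toy "file system" area


def make_disk():
    """Build a toy disk holding one 'file' whose contents are b'secret'."""
    disk = bytearray(64)
    disk[:METADATA_SIZE] = b"FILE@16 len=6   "   # toy directory entry
    disk[16:22] = b"secret"                      # toy file contents
    return disk


def quick_format(disk):
    """Reset only the metadata area; the data area is left untouched."""
    disk[:METADATA_SIZE] = bytes(METADATA_SIZE)


def overwrite_format(disk, fill=FILL_BYTE):
    """Reset the metadata area and overwrite every data byte with `fill`."""
    quick_format(disk)
    disk[METADATA_SIZE:] = bytes([fill]) * (len(disk) - METADATA_SIZE)


disk = make_disk()
quick_format(disk)
print(b"secret" in disk)   # True: the old contents are still on the disk

disk = make_disk()
overwrite_format(disk)
print(b"secret" in disk)   # False: every data byte is now 0xF6
```

After the quick format the directory entry is gone but a scan of the raw bytes still finds the file contents, which is what recovery tools exploit.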

References

  1. ^ "IBM 1301 disk storage unit". IBM. http://www-03.ibm.com/ibm/history/exhibits/storage/storage_1301.html. Retrieved 2010-06-24. 
  2. ^ IBM 3390 direct access storage device
  3. ^ "The LBAs on a logical unit shall begin with zero and shall be contiguous up to the last logical block on the logical unit.", Information technology - Serial Attached SCSI - 2 (SAS-2), INCITS 457 Draft 2, May 8, 2009, chapter 4.1 Direct-access block device type model overview.
  4. ^ ISO/IEC 791D:1994, AT Attachment Interface for Disk Drives (ATA-1), section 7.1.2
  5. ^ Western Digital’s Advanced Format: The 4K Sector Transition Begins
  6. ^ This problem became common in PCs where users used RLL controllers with MFM drives; "MFM drives should not be used on RLL controllers. ..."
  7. ^ Using DEBUG to Start a Low-Level Format, Microsoft
  8. ^ The NOSPIN Group, Inc. (n.d.). Low level formatting an IDE hard drive. Retrieved December 24, 2003.
  9. ^ The PC Guide. Site Version: 2.2.0 - Version Date: April 17, 2001 Low-Level Format, Zero-Fill and Diagnostic Utilities. Retrieved May 24, 2007.
  10. ^ Many enterprise class HDDs can be low-level formatted to block sizes other than 512 bytes, e.g. Seagate SAS drives support sector sizes of 512, 520, 524 or 528 bytes and can be reformatted from one size to another.
  11. ^ SG.danny.cz
  12. ^ Publibz.boulder.ibm.com
  13. ^ Hidden Protected Area - ThinkWiki
  14. ^ "AXCEL216 / MDGx MS-DOS Undocumented + Hidden Secrets". http://www.mdgx.com/secrets.htm#FORMAT-U. Retrieved 2008-06-07. 
  15. ^ "mkfs(8) - Linux man page". http://linux.die.net/man/8/mkfs. Retrieved 2010-04-25. 
  16. ^ Data are destroyed in PC operating systems when the /L (long) option is used on format, for a Partitioned Data Set (PDS) in MVS and for newer file systems on IBM mainframes.


Wikimedia Foundation. 2010.
