Internal sort

Internal sort

An internal sort is any data sorting process that takes place entirely within the main memory of a computer. This is possible whenever the data to be sorted is small enough to all be held in the main memory. For sorting larger datasets, it may be necessary to hold only a chunk of data in memory at a time, since it wont all fit. The rest of the data is normally held on some larger, but slower medium, like a hard-disk. Any reading or writing of data to and from this slower media can slow the sortation process considerably. This issue has implications for different sort algorithms.

Consider a Bubblesort, where adjacent records are swapped in order to get them into the right order, so that records appear to 'bubble' up and down through the dataspace. If this has to be done in chunks, then when we have sorted all the records in chunk 1, we move on to chunk 2, but we find that some of the records in chunk 1 need to 'bubble through' chunk 2, and vice versa (i.e., there are records in chunk 2 that belong in chunk 1, and records in chunk 1 that belong in chunk 2 or later chunks). This will cause the chunks to be read and written back to disk many times as records cross over the boundaries between them, resulting in a considerable degradation of performance. If the data can all be held in memory as one large chunk, then this performance hit is avoided.

On the other hand, some algorithms handle external sorting rather better. A Merge sort breaks the data up into chunks, sorts the chunks by some other algorithm (maybe bubblesort or Quick sort) and then recombines the chunks two by two so that each recombined chunk is in order. This approach minimises the number or reads and writes of data-chunks from disk, and is a popular external sort method.


Wikimedia Foundation. 2010.

Игры ⚽ Поможем написать курсовую

Look at other dictionaries:

  • Internal reconstruction — is a method of recovering information about a language s past from the characteristics of the language at a later date. Whereas the comparative method compares variations between languages such as in sets of cognates under the assumption that… …   Wikipedia

  • Internal ballistics — Internal ballistics, a subfield of ballistics, is the study of a projectile s behavior from the time its propellant s igniter is initiated until it exits the gun barrel. The study of internal ballistics is important to designers and users of… …   Wikipedia

  • Sort code — A sort code is a number which is assigned to a branch of a bank for internal purposes. Banks use sort codes as it is easier than writing the full address of the branch out and it tells customers which branch they are at. Also, the sort code(s)… …   Wikipedia

  • Radix sort — In computer science, radix sort is a sorting algorithm that sorts integers by processing individual digits. Because integers can represent strings of characters (e.g., names or dates) and specially formatted floating point numbers, radix sort is… …   Wikipedia

  • Counting sort — In computer science, counting sort is an algorithm for sorting a collection of objects according to keys that are small integers; that is, it is an integer sorting algorithm. It operates by counting the number of objects that have each distinct… …   Wikipedia

  • UnShuffle sort — is a sort algorithm. IntroductionThe UnShuffle Sort is a distribution or merge sort which was developed by Art S. Kagel. UnShuffle is most efficient when sorting data which exhibits low entropy, in effect where the data is well ordered or… …   Wikipedia

  • Foreign internal defense — (FID) is used by a number of Western militaries, explicitly by the United States but sharing ideas with countries including France and the United Kingdom, to describe an approach to combating actual or threatened insurgency in a foreign state… …   Wikipedia

  • Business internal dialing — refers to the dial structure within a single business entity, and is sometimes referred to as Virtual Private Numbering (VPN) or Private Numbering Plan (PNP).This structure allows users a shorter number sequence for numbers within the same… …   Wikipedia

  • List of terms relating to algorithms and data structures — The [http://www.nist.gov/dads/ NIST Dictionary of Algorithms and Data Structures] is a reference work maintained by the U.S. National Institute of Standards and Technology. It defines a large number of terms relating to algorithms and data… …   Wikipedia

  • Список терминов, относящихся к алгоритмам и структурам данных —   Это служебный список статей, созданный для координации работ по развитию темы.   Данное предупреждение не устанавливается на информационные списки и глоссарии …   Википедия

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”