Plain text

In computing, plain text is a term used for an ordinary "unformatted" sequential file readable as textual material without much processing.

The encoding has traditionally been either ASCII, one of its many derivatives such as ISO/IEC 646, or sometimes EBCDIC. Whatever the encoding, a plain text file contains neither (character-based) structural tags such as heading marks, nor typographic markers such as boldface or italics.

Unicode is today gradually replacing the older ASCII derivatives, which are limited to 7- or 8-bit codes. It will probably serve much the same purposes, but this time permitting almost any human language as well as important punctuation and symbols such as mathematical relations (≠ ≤ ≥ ≈) and multiplication signs (× •), which are not included in the very rudimentary and incomplete ASCII set.
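The limitation of ASCII described above can be checked directly. The following is a minimal Python sketch, not a definitive test; the helper name fits_in_ascii is hypothetical, and UTF-8 is assumed as the Unicode encoding only for the sake of the example.

```python
def fits_in_ascii(text: str) -> bool:
    """Return True if every character of text has a 7-bit ASCII code."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        # At least one character lies outside the 128 ASCII values.
        return False


ascii_only = "a <= b"
with_symbol = "a ≤ b"   # U+2264 is not part of ASCII

print(fits_in_ascii(ascii_only))    # True
print(fits_in_ascii(with_symbol))   # False

# The same text is still plain text once a Unicode encoding is used;
# here the symbol occupies three bytes in UTF-8.
print(with_symbol.encode("utf-8"))  # b'a \xe2\x89\xa4 b'
```

Both strings are plain text in the sense used here: the only difference is which encodings can represent them.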

Usage

The purpose of using "plain text" today is primarily a "lowest common denominator" independence from programs that require their very own special encoding or formatting (with due sacrifices and limitations). Plain text files can be opened, read, and edited with most text editors. Examples include Notepad (Windows), edit (DOS), ed, vi or vim (Unix, Linux), SimpleText (Mac OS), and TextEdit (Mac OS X). Other computer programs are also capable of reading and importing plain text. It can also be used by simple tools such as the commands that print a file line by line, like type (DOS and Windows) and cat (Unix).

Plain text files are almost universal in programming; a source code file containing instructions in a programming language is almost always a plain text file. Plain text is also commonly used for configuration files, which are read for saved settings at the startup of a program.

Related terms

The related term plaintext is most commonly used in a cryptographic context, while cleartext usually refers to a lack of protection from eavesdropping. In practice the terms are often confused with one another, especially by those new to computers, cryptography, or data communications.

Philosophy

Plain text is in fact the technical user's "way to regard" a file or a sequence of bytes. In this sense, there is no plain text, since bits are stored as states of latches, charges on transistor gates, microscopic magnetic or mechanical dots on a disk, and so on, and humans do not have the senses needed to read these. The information must thus "appear as text" (on screen or on paper) in order to "be text" in this absolute sense of the word.

Plain text is a way to represent generic text without attributes such as fonts, subscripts, and boldface; due to this simplicity, it is readable and processable by almost "any" computer program. In a way, an HTML, SGML, or XML file "is regarded as" plain text, since no control codes (see below) are used, even though real structural tags are included in these formats. To the SGML or XML author, these tags are "human readable", since that author understands the structure by reading the format. This may illuminate the complications in the usage of terms within computer science: it is all a relative point of view.

Encoding

Character encodings

Text was once commonly encoded in ASCII, using 8 bits for one letter or other character: 7 bits carried the code, allowing 128 values, and the 8th was used as a parity (check) bit when transferring a file. This allowed only the ordinary Latin alphabet, transfer control codes, parentheses, and punctuation, which annoyed especially Portuguese and Swedish computer users. Therefore, when data transfer became more reliable, the remaining 128 values were put to use, differently everywhere, and in a way that made multilingual texts impossible to encode. At last Unicode was defined, which currently allows for 1,114,112 code values, used for any modern writing system and many extinct ones. For example, Unicode encodes Chinese, Hebrew, and Cyrillic as well as Latin. Some of these text formats may be quite complicated to process correctly, but they still contain no structural data, such as bold start and end markers, and are therefore plain text.
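The use of the eighth bit as a check during transfer can be illustrated with a short Python sketch. It assumes even parity (the check bit is chosen so that the total number of 1 bits in the byte is even); odd parity was also used, and the function names here are hypothetical.

```python
def add_even_parity(code: int) -> int:
    """Place an even-parity bit in bit 7 of a 7-bit ASCII code."""
    if not 0 <= code < 128:
        raise ValueError("not a 7-bit ASCII code")
    ones = bin(code).count("1")
    parity = ones % 2            # 1 only if the count of 1 bits is odd
    return code | (parity << 7)


def check_even_parity(byte: int) -> bool:
    """Return True if the 8-bit value has an even number of 1 bits."""
    return bin(byte).count("1") % 2 == 0


sent = add_even_parity(ord("C"))      # 'C' is 0b1000011, three 1 bits
print(hex(sent))                      # 0xc3
print(check_even_parity(sent))        # True
print(check_even_parity(sent ^ 0x01)) # False: one flipped bit is detected
```

A single parity bit detects any odd number of flipped bits but cannot say which bit changed, and it leaves no room for a 129th character, which is why the eighth bit could be reused for extra characters only once transfer had become reliable.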

Control codes

The ASCII codes before SPACE (= 32 = 20H) are not intended as displayable characters, but instead as control characters. They are used with a variety of interpreted meanings; for example, the code NUL (= 0, sometimes denoted Ctrl-@) is used as the string end marker in the programming language C and its successors. The most troublesome of these are the codes LF (= LINE FEED = 10 = 0AH) and CR (= CARRIAGE RETURN = 13 = 0DH). Windows and OS/2 require the sequence CR,LF to represent a newline, while Unix and its relatives use just LF, and Classic Mac OS (but not Mac OS X) uses just CR. This was once a slight problem when transferring files between Windows and Unix systems, but today most computer programs handle it seamlessly.
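The three newline conventions described above can be reconciled with a small conversion step. The following Python sketch is one possible approach, with a hypothetical function name; it works on raw bytes and converts everything to the Unix convention.

```python
def normalize_newlines(data: bytes) -> bytes:
    """Convert CR,LF (Windows, OS/2) and lone CR (Classic Mac OS) to LF (Unix)."""
    # Replace the two-byte sequence first, so that its CR is not
    # converted separately and turned into a second newline.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


mixed = b"windows line\r\nclassic mac line\runix line\n"
print(normalize_newlines(mixed))
# b'windows line\nclassic mac line\nunix line\n'
```

The order of the two replacements matters: handling the lone CR first would leave every CR,LF pair as two LF bytes, doubling the line breaks.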

See also

* E-text
* MIME Content-type
* Formatted text
* Filename extension
* File format
* Binary file
* Text file
* Editor wars
* File system
* Configuration file
* Source code


Wikimedia Foundation. 2010.
