Forth (programming language)

infobox programming language
name = Forth
paradigm = Procedural, stack-oriented
year = 1970s
designer = Charles H. Moore
typing = typeless
dialects = colorForth, Open Firmware
implementations = Forth, Inc., GNU Forth, MPE
influenced_by = Burroughs large systems, Lisp, APL
influenced = Factor, Joy, Cat

Forth is a structured, imperative, stack-based, computer programming language and programming environment. Forth is sometimes spelled in all capital letters following the customary usage during its earlier years, although the name is not an acronym.

A procedural, stack-oriented and reflective programming language without type checking, Forth features both interactive execution of commands (making it suitable as a shell for systems that lack a more formal operating system) and the ability to compile sequences of commands for later execution. Some Forth implementations (usually early versions or those written to be extremely portable) compile threaded code, but many implementations today generate optimized machine code like other language compilers.

Overview

A Forth environment combines the compiler with an interactive shell. The user interactively defines and runs subroutines, or "words," in a virtual machine similar to the runtime environment. Words can be tested, redefined, and debugged as the source is entered, without recompiling or restarting the whole program. All syntactic elements, including variables and basic operators, appear as such procedures. Even if a particular word is optimized so as not to require a subroutine call, it is still available as a subroutine. On the other hand, the shell may compile interactively typed commands into machine code before running them. (This behavior is common, but not required.) Forth environments vary in how the resulting program is stored, but ideally running the program has the same effect as manually re-entering the source. This contrasts with the combination of C with Unix shells, wherein compiled functions are a special class of program objects and interactive commands are strictly interpreted. Most of Forth's unique properties result from this principle.

As a "jack of all trades" covering interaction, scripting, and compilation, Forth was popular on computers with limited resources, such as the BBC Micro and Apple II series, and remains so in applications such as firmware and small microcontrollers. In this respect Forth is broadly comparable to BASIC, but it emphasizes optimization over ease of use. Where C compilers may now generate more compact and faster code, Forth retains the advantage of interactivity.

Certain words are predefined, including the basic arithmetic operators. Predefined words which may be used at runtime are collectively called the "kernel." When the user starts an interactive Forth environment, it typically consists of the kernel and the interpreter. The compiler comprises a set of commands within the interpreter. Most of the ANS Forth standard is devoted to defining the contents of the kernel and the interface to the compiler. Aside from the predefined words, the main requirement is that words are executed in sequence. Here is an example of a programmer "developing" the hello world program. By convention, source code examples are written as a log, showing the programmer's input and the interpreter's response. Interpreters traditionally say "OK" after the successful completion of a command line.

.( Hello, World!) Hello, World! ok
: hello-world ." Hello, World!" ; ok
hello-world Hello, World! ok

In the first line, the text output operator .( is used interactively to print Hello, World! immediately. The space after .( lets the interpreter parse the operator as a separate word, and the ) signals the end of the string. Next, the colon (:) is used to activate the compiler and store the same code, including the string itself, into the word hello-world. Finally, the programmer activates the newly defined word to verify that it produces the expected output. Many Forth systems would allow ." to be used in place of .( for consistency.

The stacks

Every programming environment with subroutines implements a call stack for control flow. This structure typically also stores local variables, including subroutine parameters (in a call by value system such as C). Forth often does not have local variables, however, nor is it call-by-value. Instead, intermediate values are kept in a second stack. Words operate directly on the topmost values in this stack. It may therefore be called the "parameter" or "data" stack, but most often simply "the" stack. The function-call stack is then called the "linkage" or "return" stack, abbreviated "rstack". Special rstack manipulation functions provided by the kernel allow it to be used for temporary storage within a word, but otherwise it cannot be used to pass parameters or manipulate data.

Most words are specified in terms of their effect on the stack. Typically, parameters are placed on the top of the stack before the word executes. After execution, the parameters have been erased and replaced with any return values. For arithmetic operators, this follows the rule of reverse Polish notation. See below for examples illustrating stack usage.
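The division of labour between the two stacks can be modelled in a few lines of Python. This is only an illustrative sketch: the two lists and the function names (dup, swap, to_r, r_from and so on) are hypothetical stand-ins for the Forth words DUP, SWAP, >R and R>, and no real Forth system is implemented this way. Each function carries a stack-effect comment of the form ( before -- after ).

```python
# Illustrative model only: two Python lists stand in for Forth's stacks.
data_stack = []    # "the" stack: parameters and results
return_stack = []  # the "rstack": normally holds return addresses


def push(n):
    data_stack.append(n)


def dup():       # DUP  ( n -- n n )
    data_stack.append(data_stack[-1])


def drop():      # DROP ( n -- )
    data_stack.pop()


def swap():      # SWAP ( a b -- b a )
    data_stack[-1], data_stack[-2] = data_stack[-2], data_stack[-1]


def add():       # +    ( a b -- a+b )
    b = data_stack.pop()
    a = data_stack.pop()
    data_stack.append(a + b)


def to_r():      # >R   ( n -- )  moves the top item to the rstack
    return_stack.append(data_stack.pop())


def r_from():    # R>   ( -- n )  moves it back
    data_stack.append(return_stack.pop())


def demo():
    """Add 10 to the second item while the top item waits on the rstack."""
    data_stack.clear()
    return_stack.clear()
    push(1)
    push(2)      # stack: 1 2
    to_r()       # stack: 1      rstack: 2
    push(10)
    add()        # stack: 11     rstack: 2
    r_from()     # stack: 11 2   rstack: (empty)
    return list(data_stack), list(return_stack)
```

Calling demo() leaves 11 and 2 on the data stack and nothing on the return stack, which mirrors the rule that anything parked on the rstack inside a word must be removed again before the word returns.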

Maintenance

Forth is a simple yet extensible language; its modularity and extensibility permit the writing of high-level programs such as CAD systems. However, extensibility also helps poor programmers to write incomprehensible code, which has given Forth a reputation as a "write-only language". Forth has been used successfully in large, complex projects, while applications developed by competent, disciplined professionals have proven to be easily maintained on evolving hardware platforms over decades of use. [cite web | url = http://www.forth.org/successes.html | title = Forth Success Stories | accessdate = 2006-06-09] Forth has a niche both in astronomical and space applications. [cite web | url = http://forth.gsfc.nasa.gov/ | title = Space Related Applications of Forth | accessdate = 2007-09-04] Forth is still used today in many embedded systems (small computerized devices) because of its portability, efficient memory use, short development time, and fast execution speed. It has been implemented efficiently on modern RISC processors, and processors that use Forth as machine language have been produced. [cite web | url = http://www.ultratechnology.com/ | title = Forth Chips Page | accessdate = 2006-06-09] Other uses of Forth include the Open Firmware boot ROMs used by Apple, IBM, Sun, and OLPC XO-1; and the first stage boot controller of the FreeBSD operating system, which is based on FICL (http://ficl.sourceforge.net/).

History

Forth evolved from Charles H. Moore's personal programming system, which had been in continuous development since 1958. [cite web | url=http://www.forth.com/resources/evolution/index.html | title=The Evolution of Forth | coauthors=C. H. Moore, E. D. Rather, and D. R. Colburn | publisher=ACM SIGPLAN History of Programming Languages Conference | month=April | year=1993 | work=ACM SIGPLAN Notices, Volume 28, No. 3. March, 1993] Forth was first exposed to other programmers in the early 1970s, starting with Elizabeth Rather at the US National Radio Astronomy Observatory. After their work at NRAO, Charles Moore and Elizabeth Rather formed FORTH, Inc. in 1973, refining and porting Forth systems to dozens of other platforms in the next decade.

Forth is so named because in 1968 "[t]he file holding the interpreter was labeled FOURTH, for 4th (next) generation software — but the IBM 1130 operating system restricted file names to 5 characters." [cite web | last = Moore | first = Charles H | year = 1991 | url = http://www.colorforth.com/HOPL.html | title = Forth - The Early Years | format = HTML | accessdate = 2006-06-03] Moore saw Forth as a successor to compile-link-go third-generation programming languages, or software for "fourth generation" hardware, not a fourth-generation programming language as the term has come to be used.

Because Charles Moore had frequently moved from job to job over his career, an early pressure on the developing language was ease of porting to different computer architectures. A Forth system has often been used to bring up new hardware. For example, Forth was the first resident software on the new Intel 8086 chip in 1978 and MacFORTH was the first resident development system for the first Apple Macintosh in 1984.

FORTH, Inc.'s microFORTH was developed for the Intel 8080, Motorola 6800, and Zilog Z80 microprocessors starting in 1976. MicroFORTH was later used by hobbyists to generate Forth systems for other architectures, such as the 6502 in 1978. Wide dissemination finally led to standardization of the language. Common practice was codified in the de facto standards FORTH-79 [cite web | url=https://mywebspace.wisc.edu/lnmaurer/web/forth/Forth-79.pdf | title=The Forth-79 Standard] and FORTH-83 [cite web | url=http://forth.sourceforge.net/standard/fst83/ | title=The Forth-83 Standard] in the years 1979 and 1983, respectively. These standards were unified by ANSI in 1994 in a standard commonly referred to as ANS Forth. [cite web | publisher = ANSI technical committee X3J14 | date = 24 March 1994 | url = http://www.taygeta.com/forth/dpans.html | title = Programming Languages: Forth | format = HTML | accessdate = 2006-06-03]

Forth became very popular in the 1980s [Harvard reference | Title=The Forth Language | Journal=BYTE Magazine | Volume=5 | Issue=8 | Year=1980] because it was well suited to the small microcomputers of that time, as it is compact and portable. At least one home computer, the British Jupiter ACE, had Forth in its ROM-resident operating system. Rockwell also produced single-chip microcomputers with resident Forth kernels, the R65F11 and R65F12.

Programmer's perspective

Forth relies heavily on explicit use of a data stack and reverse Polish notation (RPN or postfix notation), commonly used in calculators from Hewlett-Packard. In RPN, the operator is placed after its operands, as opposed to the more common infix notation where the operator is placed between its operands. Postfix notation makes the language easier to parse and extend; Forth does not use a BNF grammar, and does not have a monolithic compiler. Extending the compiler only requires writing a new word, instead of modifying a grammar and changing the underlying implementation.

Using RPN, one could get the result of the mathematical expression (25 * 10 + 50) this way:

25 10 * 50 + . 300 ok

This command line first puts the numbers 25 and 10 on the implied stack. The word * multiplies the two numbers on the top of the stack and replaces them with their product. Then the number 50 is placed on the stack. The word + adds it to the previous product. Finally, the . command prints the result to the user's terminal. [cite book | last = Brodie | first = Leo | title = Starting Forth | format = paperback | edition = Second | year = 1987 | publisher = Prentice-Hall | id = ISBN 0-13-843079-9 | pages = 20]
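The walk-through above can be replayed mechanically. The following Python sketch is a toy model, not a Forth implementation: trace_rpn is a hypothetical helper that records the stack after every token, and it handles only integers and the operators +, - and * (the printing word . is left out).

```python
def trace_rpn(source):
    """Return the stack contents after each token of an RPN expression."""
    operators = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
    }
    stack, snapshots = [], []
    for token in source.split():
        if token in operators:
            right = stack.pop()          # top of stack is the right operand
            left = stack.pop()
            stack.append(operators[token](left, right))
        else:
            stack.append(int(token))     # a number is simply pushed
        snapshots.append(list(stack))
    return snapshots


for step in trace_rpn("25 10 * 50 +"):
    print(step)
```

The snapshots are [25], [25, 10], [250], [250, 50] and finally [300], matching the description step for step.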

Even Forth's structural features are stack-based. For example:

: FLOOR5 ( n -- n' ) DUP 6 < IF DROP 5 ELSE 1 - THEN ;

This code defines a new word (again, 'word' is the term used for a subroutine) called FLOOR5 using the following commands: DUP duplicates the number on the stack; 6 places the number 6 on top of the stack; < compares the top two numbers on the stack (the duplicated input and 6) and replaces them with a true-or-false value; IF takes a true-or-false value and chooses to execute the commands immediately after it or to skip to the ELSE; DROP discards the value on the stack; and THEN ends the conditional. The text in parentheses is a comment, advising that this word expects a number on the stack and will return a possibly changed number. The net result performs similarly to this function written in the C programming language:

int floor5(int v) { return v < 6 ? 5 : v - 1; }

This function is written more succinctly as:

: FLOOR5 ( n -- n' ) 1- 5 MAX ;

You would run this word as follows:

1 FLOOR5 . 5 ok
8 FLOOR5 . 7 ok

First the interpreter pushes a number (1 or 8) onto the stack, then it calls FLOOR5, which pops off this number again and pushes the result. Finally, a call to "." pops the result and prints it to the user's terminal.
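That the two definitions of FLOOR5 agree can be checked by brute force. The Python functions below are hypothetical transliterations of the two Forth definitions (they are not part of any Forth system): for inputs below 6, subtracting one gives less than 5, so MAX returns 5; for 6 and above, the decremented value is at least 5 and is returned unchanged.

```python
def floor5_conditional(n):
    # Mirrors: DUP 6 < IF DROP 5 ELSE 1 - THEN
    return 5 if n < 6 else n - 1


def floor5_max(n):
    # Mirrors: 1- 5 MAX
    return max(n - 1, 5)


# Compare both versions over a range of sample inputs.
agree = all(floor5_conditional(n) == floor5_max(n) for n in range(-100, 101))
print(agree)
```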

Facilities

Forth parsing is simple, as it has no explicit grammar. The interpreter reads a line of input from the user input device, which is then parsed for a word using spaces as a delimiter; some systems recognise additional whitespace characters. When the interpreter finds a word, it tries to look the word up in the "dictionary". If the word is found, the interpreter executes the code associated with the word, and then returns to parse the rest of the input stream. If the word isn't found, the word is assumed to be a number, and an attempt is made to convert it into a number and push it on the stack; if successful, the interpreter continues parsing the input stream. Otherwise, if both the lookup and the number conversion fail, the interpreter prints the word followed by an error message indicating the word is not recognised, flushes the input stream, and waits for new user input. [cite book | last = Brodie | first = Leo | title = Starting Forth | format = paperback | edition = Second | year = 1987 | publisher = Prentice-Hall | id = ISBN 0-13-843079-9 | pages = 14]
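The loop described above (parse a token, look it up, fall back to number conversion, otherwise report an error and discard the rest of the line) can be sketched in Python. Everything here is an assumption made for illustration: the function names, the two-word dictionary, and the "FOO ?" error format are invented, and a real system would print the message rather than return it.

```python
def interpret(line, dictionary, stack):
    """Process one input line; return None on success or an error message."""
    for token in line.split():           # whitespace-delimited parsing
        if token in dictionary:           # 1. dictionary lookup succeeded
            dictionary[token](stack)      #    execute the word's code
            continue
        try:                              # 2. otherwise try a number
            stack.append(int(token))
        except ValueError:                # 3. both failed: give up on the line
            return token + " ?"
    return None


def make_dictionary(output):
    """Build a tiny dictionary; '.' appends to `output` instead of printing."""
    def plus(stack):
        b = stack.pop()
        a = stack.pop()
        stack.append(a + b)

    def dot(stack):
        output.append(str(stack.pop()))

    return {"+": plus, ".": dot}
```

With an empty stack, interpreting "2 3 + ." records the output 5, while "2 FOO 3" stops at FOO: the 2 has already been pushed, and the trailing 3 is never processed because the rest of the line is discarded.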

The definition of a new word is started with the word : (colon) and ends with the word ; (semi-colon). For example

: X DUP 1+ . . ;

will compile the word X and make the name findable in the dictionary. When executed by typing 10 X at the console, this will print 11 10. [cite book | last = Brodie | first = Leo | title = Starting Forth | format = paperback | edition = Second | year = 1987 | publisher = Prentice-Hall | id = ISBN 0-13-843079-9 | pages = 16]
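How a colon definition extends the dictionary can be modelled in Python as well. The sketch below is a deliberately small model under stated assumptions: run is a hypothetical function, only the words DUP, 1+ and . are predefined, a definition is "compiled" into a list of already looked-up Python callables and integer literals, and error handling is omitted.

```python
def run(source):
    """Interpret `source`; return what '.' printed, as a list of strings."""
    stack, output = [], []
    dictionary = {
        "DUP": lambda: stack.append(stack[-1]),
        "1+": lambda: stack.append(stack.pop() + 1),
        ".": lambda: output.append(str(stack.pop())),
    }
    tokens = iter(source.split())
    for token in tokens:
        if token == ":":
            name = next(tokens)            # the word being defined
            body = []
            for t in tokens:               # collect tokens up to ';'
                if t == ";":
                    break
                body.append(t)
            # Look each body token up now, at definition time.
            compiled = [dictionary[t] if t in dictionary else int(t)
                        for t in body]

            def new_word(compiled=compiled):
                for item in compiled:
                    if callable(item):
                        item()             # execute a compiled word
                    else:
                        stack.append(item) # push a compiled literal

            dictionary[name] = new_word    # make the name findable
        elif token in dictionary:
            dictionary[token]()
        else:
            stack.append(int(token))
    return output


print(" ".join(run(": X DUP 1+ . . ; 10 X")))
```

Running the definition followed by 10 X yields 11 and then 10: DUP leaves two copies of 10, 1+ turns the top copy into 11, and each . prints and removes the current top of the stack.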

Most Forth systems include a specialized assembler that produces executable words. The assembler is a special dialect of the compiler. Forth assemblers often use a reverse Polish syntax in which the parameters of an instruction precede the instruction. The usual design of a Forth assembler is to construct the instruction on the stack, then copy it into memory as the last step. Registers may be referenced by the name used by the manufacturer, numbered (0..n, as used in the actual operation code) or named for their purpose in the Forth system: e.g. "S" for the register used as a stack pointer. [cite web | last = Rodriguez | first = Brad | url = http://www.zetetics.com/bj/papers/6809asm.txt | title = B.Y.O.ASSEMBLER | format = HTML | accessdate = 2006-06-19]

Operating system, files and multitasking

Classic Forth systems traditionally use neither operating system nor file system. Instead of storing code in files, they store source code in disk blocks written to physical disk addresses. The word BLOCK translates the number of a 1K-sized block of disk space into the address of a buffer containing the data, which is managed automatically by the Forth system. Some systems implement contiguous disk files using the system's disk access, where the files are located at fixed disk block ranges. Usually these are implemented as fixed-length binary records, with an integer number of records per disk block. Quick searching is achieved by hashed access on key data.
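The idea behind BLOCK, turning a block number into a managed in-memory buffer, can be illustrated with a toy Python model. The class name BlockStore, its methods, and the unbounded buffer cache are assumptions made for this sketch; real systems typically keep a small fixed pool of buffers and write back only those marked as modified.

```python
import io

BLOCK_SIZE = 1024  # the "1K-sized" block mentioned in the text


class BlockStore:
    """Toy model of BLOCK: map a block number to a cached buffer."""

    def __init__(self, disk):
        self.disk = disk       # any seekable binary file object
        self.buffers = {}      # block number -> bytearray

    def block(self, n):
        if n not in self.buffers:                 # read on first access only
            self.disk.seek(n * BLOCK_SIZE)
            data = self.disk.read(BLOCK_SIZE)
            # Pad short reads with blanks so every buffer is a full block.
            self.buffers[n] = bytearray(data.ljust(BLOCK_SIZE, b" "))
        return self.buffers[n]

    def flush(self):
        for n, buf in self.buffers.items():       # write all buffers back
            self.disk.seek(n * BLOCK_SIZE)
            self.disk.write(buf)
        self.buffers.clear()


# Usage: an in-memory "disk" holding two blocks.
disk = io.BytesIO(b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE)
store = BlockStore(disk)
store.block(0)[0:5] = b"HELLO"   # edit block 0 through its buffer
store.flush()                    # the change reaches the disk only now
```

The physical position of a block is just its number times the block size, which is why no file system is needed to find it.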

Multitasking, most commonly cooperative round-robin scheduling, is normally available (although multitasking words and support are not covered by the ANSI Forth Standard). The word PAUSE is used to save the current task's execution context, to locate the next task, and restore its execution context. Each task has its own stacks, private copies of some control variables and a scratch area. Swapping tasks is simple and efficient; as a result, Forth multitaskers are available even on very simple microcontrollers such as the Intel 8051, Atmel AVR, and TI MSP430. [ cite web | last = Rodriguez | first = Brad | url = http://www.zetetics.com/bj/papers/8051task.pdf | title = MULTITASKING 8051 CAMELFORTH | format = PDF | accessdate = 2006-06-19 ]
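Cooperative round-robin scheduling of this kind can be sketched with Python generators, where each yield plays the role of PAUSE. The names task and run_round_robin are hypothetical; the generator object stands in for the per-task stacks and private variables that a Forth multitasker would save and restore.

```python
from collections import deque


def task(name, steps, log):
    """A task body; each `yield` plays the role of PAUSE."""
    for i in range(steps):
        log.append((name, i))
        yield                       # give up the CPU voluntarily


def run_round_robin(tasks):
    """Resume each task in turn until all of them have finished."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)           # run until the task's next PAUSE
        except StopIteration:
            continue                # task finished: drop it from the ring
        ready.append(current)       # otherwise rejoin at the back


log = []
run_round_robin([task("A", 2, log), task("B", 3, log)])
print(log)
```

The log interleaves the two tasks as A0, B0, A1, B1, B2. Because a task is only ever switched out at a point it chose itself, very little state has to be saved, which is one reason such schedulers fit on small microcontrollers.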

By contrast, some Forth systems run under a host operating system such as Microsoft Windows, Linux or a version of Unix and use the host operating system's file system for source and data files; the ANSI Forth Standard describes the words used for I/O. Other non-standard facilities include a mechanism for issuing calls to the host OS or windowing systems, and many provide extensions that employ the scheduling provided by the operating system. Typically they have a larger and different set of words from the stand-alone Forth's PAUSE word for task creation, suspension, destruction and modification of priority.

Self compilation and cross compilation

A full-featured Forth system with all source code will compile itself, a technique commonly called meta-compilation by Forth programmers (although the term doesn't exactly match meta-compilation as it is normally defined). The usual method is to redefine the handful of words that place compiled bits into memory. The compiler's words use specially-named versions of fetch and store that can be redirected to a buffer area in memory. The buffer area simulates or accesses a memory area beginning at a different address than the code buffer. Such compilers define words to access both the target computer's memory, and the host (compiling) computer's memory. [ cite web | last = Rodriguez | first = Brad | year = 1995 | month = July | url = http://www.zetetics.com/bj/papers/moving8.htm | title = MOVING FORTH | format = HTML | accessdate = 2006-06-19 ]

After the fetch and store operations are redefined for the code space, the compiler, assembler, etc. are recompiled using the new definitions of fetch and store. This effectively reuses all the code of the compiler and interpreter. Then, the Forth system's code is compiled, but this version is stored in the buffer. The buffer in memory is written to disk, and ways are provided to load it temporarily into memory for testing. When the new version appears to work, it is written over the previous version.
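The redirection of fetch and store can be illustrated with a short Python sketch. The class TargetImage, the function compile_bytes, and the example origin address 0x8000 are all hypothetical; the point is only that the compiler works with target addresses while the bytes actually land in a host-side buffer.

```python
class TargetImage:
    """Host buffer standing in for target memory that starts at `origin`."""

    def __init__(self, origin, size):
        self.origin = origin
        self.buffer = bytearray(size)

    def store(self, target_addr, byte):       # redirected "store"
        self.buffer[target_addr - self.origin] = byte & 0xFF

    def fetch(self, target_addr):             # redirected "fetch"
        return self.buffer[target_addr - self.origin]


def compile_bytes(image, here, data):
    """Lay down `data` at target address `here`; return the new `here`."""
    for b in data:
        image.store(here, b)
        here += 1
    return here


# Usage: compile two bytes for a target whose code space begins at 0x8000.
image = TargetImage(origin=0x8000, size=16)
here = compile_bytes(image, 0x8000, [0x12, 0x34])
```

After this, image.buffer holds the bytes at offsets 0 and 1 even though the compiler believed it was writing to 0x8000 and 0x8001; the finished buffer can then be written to disk or sent to the target.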

There are numerous variations of such compilers for different environments. For embedded systems, the code may instead be written to another computer, a technique known as cross compilation, over a serial port or even a single TTL bit, while keeping the word names and other non-executing parts of the dictionary in the original compiling computer. The minimum definitions for such a Forth compiler are the words that fetch and store a byte, and the word that commands a Forth word to be executed. Often the most time-consuming part of writing a remote port is constructing the initial program to implement fetch, store and execute, but many modern microprocessors have integrated debugging features (such as the Motorola CPU32) that eliminate this task. [cite web | last = Shoebridge | first = Peter | date = 1998-12-21 | url = http://www.zeecube.com/archive/bdm/index.htm | title = Motorola Background Debugging Mode Driver for Windows NT | format = HTML | accessdate = 2006-06-19]

Structure of the language

The basic data structure of Forth is the "dictionary", which maps "words" to executable code or named data structures. The dictionary is laid out in memory as a tree of linked lists, with the links proceeding from the latest (most recently) defined word to the oldest, until a sentinel, usually a NULL pointer, is found. A context switch causes a list search to start at a different leaf; the search continues as the branch merges into the main trunk, leading eventually back to the sentinel at the root. There can be several dictionaries, and in rare cases such as meta-compilation a dictionary might be isolated. The effect is a sophisticated use of namespaces; critically, it can overload keywords, so that the meaning of a word depends on context.
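A search through such a tree of linked lists can be sketched in Python. The Entry class, the find function, and the example branch names (editor, assembler) are hypothetical and chosen only to show how the starting leaf decides which definition of a name is found first.

```python
class Entry:
    def __init__(self, name, code, link):
        self.name = name    # name field
        self.code = code    # what the word does (any Python value here)
        self.link = link    # link field: previously defined entry, or None


def find(latest, name):
    """Walk the links from `latest` back towards the sentinel (None)."""
    entry = latest
    while entry is not None:
        if entry.name == name:
            return entry            # the newest match on this path wins
        entry = entry.link
    return None


# A trunk of two core words with two branches hanging off it.
root = Entry("DUP", "core dup", None)
trunk = Entry("EMIT", "core emit", root)
editor = Entry("EMIT", "editor emit", trunk)    # overloads EMIT
assembler = Entry("NOP", "asm nop", trunk)      # adds a new word

print(find(editor, "EMIT").code)      # search starts at the editor leaf
print(find(assembler, "EMIT").code)   # search starts at the assembler leaf
```

Starting from the editor leaf, EMIT resolves to the editor's version; starting from the assembler leaf, the same name resolves to the core version on the shared trunk.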

A defined word generally consists of "head" and "body" with the head consisting of the "name field" (NF) and the "link field" (LF) and body consisting of the "code field" (CF) and the "parameter field" (PF). Head and body of a dictionary entry are treated separately because they may not be contiguous. For example, when a Forth program is recompiled for a new platform, the head may remain on the compiling computer, while the body goes to the new platform. In some environments (such as embedded systems) the heads occupy memory unnecessarily. However, some cross-compilers may put heads in the target if the target itself is expected to support an interactive Forth. [cite web | last = Martin | first = Harold M. | year = 1991 | month = March | url = http://portal.acm.org/citation.cfm?id=122089.122091&coll=portal&dl=ACM&idx=J696&part=periodical&WantType=periodical&title=ACM%20SIGFORTH%20Newsletter | title = Developing a tethered Forth model | publisher = ACM Press | accessdate = 2006-06-19]

Dictionary entry

The exact format of a dictionary entry is not prescribed, and implementations vary. However, certain components are almost always present, though the exact size and order may vary. Described as a structure, a dictionary entry might look this way: [cite book | last = Brodie | first = Leo | title = Starting Forth | format = paperback | edition = Second | year = 1987 | publisher = Prentice-Hall | id = ISBN 0-13-843079-9 | pages = 200-202]

structure
  byte:       flag            3bit flags + length of word's name
  char-array: name            name's runtime length isn't known at compile time
  address:    previous        link field, backward ptr to previous word
  address:    codeword        ptr to the code to execute this word
  any-array:  parameterfield  unknown length of data, words, or opcodes
end-structure forthword
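To make the layout concrete, the following Python sketch packs one such entry into bytes. Every size here is an assumption for illustration, since the format is not prescribed: 3 flag bits in the top of the first byte, 5 length bits below them (so names are limited to 31 characters in this sketch), and 16-bit little-endian addresses. The functions pack_entry and unpack_name are hypothetical.

```python
import struct

LENGTH_BITS = 5                          # assumed: low 5 bits = name length
LENGTH_MASK = (1 << LENGTH_BITS) - 1
FLAG_BITS = 3                            # assumed: top 3 bits = flags


def pack_entry(flags, name, link, codeword, parameters=b""):
    """Serialise: flag/length byte, name, link, code pointer, parameters."""
    encoded = name.encode("ascii")
    if len(encoded) > LENGTH_MASK or flags >= (1 << FLAG_BITS):
        raise ValueError("name or flags do not fit in the header byte")
    header = (flags << LENGTH_BITS) | len(encoded)
    # "<HH" = two little-endian 16-bit addresses (cell size is assumed).
    return (bytes([header]) + encoded
            + struct.pack("<HH", link, codeword) + parameters)


def unpack_name(entry):
    """Recover (flags, name) from the start of a packed entry."""
    header = entry[0]
    length = header & LENGTH_MASK
    return header >> LENGTH_BITS, entry[1:1 + length].decode("ascii")


entry = pack_entry(flags=0b100, name="DUP", link=0x1234, codeword=0x5678)
print(entry.hex())
```

Because the name length is stored in the first byte, the position of the link and code fields can be computed at run time even though the name itself has variable length.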

The name field starts with a prefix giving the length of the word's name (typically up to 32 bytes), and several bits for flags. The character representation of the word's name then follows the prefix. Depending on the particular implementation of Forth, there may be one or more NUL ('


Wikimedia Foundation. 2010.
