CuneiForm (software)

CuneiForm
Original author(s): Cognitive Technologies
Developer(s): Cognitive Technologies
Stable release: 1.1 (April 19, 2011)
Preview release: sources (April 2, 2008)
Written in: C and C++
Operating system: Cross-platform
Type: Optical character recognition
License: Freeware/BSD licenses
Website: openocr.org

CuneiForm is an optical character recognition (OCR) tool. It was originally developed at Cognitive Technologies and, after a few years with no development, was released as freeware on December 12, 2007. The kernel of the OCR engine was released under the open source BSD license at the beginning of April 2008.[1]

Features

CuneiForm uses the OmniFont system. Its algorithms are derived from the rules by which letters are written and from their topology, and they do not require a pattern-recognition training step. CuneiForm recognizes ordinary print fonts (scanned from books, newspapers, magazines, laser printer output, dot-matrix printer output, typewritten text, etc.). It does not recognize handwritten or pseudo-handwritten text, nor does it recognize decorative fonts (e.g. Gothic). Special settings are available for recognizing text from dot-matrix printers and from faxes at 200x100 DPI resolution.

CuneiForm can preserve text formatting and also recognizes complex tables of any structure.

It recognizes Bulgarian, Croatian, Czech, Danish, Dutch, English, Estonian, French, German, Hungarian, Italian, Latvian, Lithuanian, Polish, Portuguese, Romanian, Russian, Russian-English bilingual, Serbian, Slovene, Spanish, Swedish, Turkish, and Ukrainian text.

CuneiForm can save recognized text in RTF, HTML, or plain text format. It can also pass text to Microsoft Word or Microsoft Excel.

User interface

CuneiForm can be used as a stand-alone command-line application or as a back-end to other programs. It also comes with its own graphical interface. CuneiForm can also be used as an OCR engine in OCRFeeder.[2]
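As an illustration of back-end use, the following Python sketch wraps a single command-line invocation. The flag names (`-l`, `-f`, `-o`, `--dotmatrix`, `--fax`), the language code `eng`, and the format name `text` are assumptions based on the usage text of the open-source Linux port and should be checked against `cuneiform --help` on the installed version. The helper names `build_cuneiform_command` and `run_ocr` are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_cuneiform_command(image, output, language="eng", fmt="text",
                            dotmatrix=False, fax=False):
    """Return the argument list for one CuneiForm invocation.

    Flag names are assumed from the Linux port's usage text; they are not
    guaranteed to match every build.
    """
    cmd = ["cuneiform", "-l", language, "-f", fmt, "-o", str(output)]
    if dotmatrix:
        cmd.append("--dotmatrix")  # assumed switch for dot-matrix printouts
    if fax:
        cmd.append("--fax")        # assumed switch for low-resolution faxes
    cmd.append(str(image))
    return cmd


def run_ocr(image, output, **options):
    """Run CuneiForm on an image file and return the recognized text."""
    if shutil.which("cuneiform") is None:
        raise FileNotFoundError("cuneiform executable not found on PATH")
    # check=True raises CalledProcessError if the tool exits with an error.
    subprocess.run(build_cuneiform_command(image, output, **options), check=True)
    return Path(output).read_text(encoding="utf-8", errors="replace")
```

Separating command construction from execution keeps the wrapper testable on machines where CuneiForm is not installed.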

History

Once a leading OCR package in Russia, CuneiForm competed with ABBYY FineReader.

In 1993, Cognitive Technologies signed an OEM contract with Corel Corporation, which allowed the Cognitive recognition library to be built into the popular publishing package Corel Draw 3.0 (and subsequent versions).

In 1996, OCR CuneiForm'96 was released. It was the first OCR package to include the adaptive recognition method, which combines two types of printed-character recognition algorithms: multifont and omnifont. This self-learning system can recognize poorly printed symbols by building an internal font from those symbols that were printed well enough to be recognized. In this way it adjusts dynamically (adapts) to the specific input characters.
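The adaptive idea can be illustrated with a toy two-pass sketch in Python. This is not CuneiForm's actual algorithm: the glyph format (flat tuples of 0/1 pixels), the `first_pass` classifier, the confidence threshold, and the averaging and distance measures are all assumptions chosen for brevity.

```python
from collections import defaultdict


def adaptive_recognize(glyphs, first_pass, threshold=0.9):
    """Two-pass recognition in which confident results form an internal font.

    glyphs: list of equal-length tuples of 0/1 pixels (hypothetical format).
    first_pass: callable glyph -> (label, confidence); it stands in for a
        font-independent classifier and is an assumption of this sketch.
    """
    results = [first_pass(g) for g in glyphs]

    # Pass 1: collect the glyphs that were recognized with high confidence.
    samples = defaultdict(list)
    for glyph, (label, conf) in zip(glyphs, results):
        if conf >= threshold:
            samples[label].append(glyph)

    # Internal font: one averaged pixel template per confidently seen label.
    font = {
        label: [sum(column) / len(group) for column in zip(*group)]
        for label, group in samples.items()
    }

    def distance(glyph, template):
        # Sum of absolute pixel differences between glyph and template.
        return sum(abs(p - t) for p, t in zip(glyph, template))

    # Pass 2: re-classify uncertain glyphs against the internal font.
    output = []
    for glyph, (label, conf) in zip(glyphs, results):
        if conf < threshold and font:
            label = min(font, key=lambda name: distance(glyph, font[name]))
        output.append(label)
    return output
```

If no glyph passes the threshold, the internal font is empty and the first-pass labels are returned unchanged.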

In June 2008, Cognitive Technologies launched a free online recognition service on OpenOCR.org.

Opening sources

Cognitive Technologies has started a program to make OCR available for all users. Its first step was releasing CuneiForm as freeware.

Cognitive Technologies plans to act as investor and coordinator in developing a new version of the software. The developers chose the BSD license for the release after weighing the legal and technical nuances, but the whole program or its separate modules may later be released under the GPL.[3]

In September 2008, part of CuneiForm was released as open source software. One of the missing parts is table analysis; however, Cognitive has promised to release this component in the future.

CuneiForm is being ported to Linux, BSD and Mac OS X.[4] This branch of the code will eventually be merged with the Cognitive codebase.

Wikimedia Foundation. 2010.