Document conversion
Document conversion is the process of converting a document from one file format to another so that it can be opened in more applications. Documents can be converted into:
- other source document formats
- consumer formats
- structured data
How it works
A file is usually converted by the application that created it, although various third-party tools can also perform the conversion. The raw contents of most file formats can be inspected with a hex editor. Alternatively, conversions can be provided automatically by Web services that connect to a document storage or delivery system, such as a file directory or a document or content management application. These content transformation services can run on a local server, on the Web, or in the cloud. Conversion tools can also be combined with a delivery component that publishes the converted data into a database, a file system, or another system.
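As a minimal sketch of a conversion tool combined with a delivery component, the following Python example converts plain text to a simple HTML page and publishes the result into a database. The function names, the `documents` table, and the choice of plain text and HTML as the two formats are assumptions made for illustration. They do not describe any particular product.

```python
import html
import sqlite3


def text_to_html(text: str, title: str) -> str:
    """Convert plain text to a minimal HTML page.

    Each block of text separated by a blank line becomes one <p> element.
    """
    blocks = [b.strip() for b in text.split("\n\n") if b.strip()]
    body = "\n".join(f"<p>{html.escape(b)}</p>" for b in blocks)
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{html.escape(title)}</title></head>\n"
        f"<body>\n{body}\n</body></html>\n"
    )


def deliver(conn: sqlite3.Connection, name: str, content: str) -> None:
    """Delivery step: publish the converted document into a database table.

    The table name and layout are hypothetical.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents (name TEXT PRIMARY KEY, content TEXT)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO documents (name, content) VALUES (?, ?)",
        (name, content),
    )
    conn.commit()


def convert_and_deliver(conn: sqlite3.Connection, name: str, text: str) -> str:
    """Run the conversion, then hand the result to the delivery step."""
    page = text_to_html(text, title=name)
    deliver(conn, name + ".html", page)
    return page
```

Keeping conversion and delivery in separate functions means the same converter could publish to a file system or another system by swapping only the `deliver` step.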
Examples
- Converting .doc (Microsoft Word format) to .odt (OpenOffice.org format)
- Converting .ppt (Microsoft PowerPoint format) to .odp (OpenOffice.org format)
- Converting .shw (Corel Presentation format) to .ppt (Microsoft PowerPoint format)
- Converting .doc (Microsoft Word format) to .pdf (PDF format)
- Converting .doc (Microsoft Word format) to a Web site based on structured HTML (Hypertext)
- Converting .doc (Microsoft Word format) to .swf (Flash format)
- Converting .doc (Microsoft Word format) to .mp3 (audio format)
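Several of the office-format conversions above can be scripted with a third-party tool. The sketch below assumes that LibreOffice is installed and that its `soffice` executable is on the PATH. It uses LibreOffice's `--headless`, `--convert-to` and `--outdir` command-line options. The helper names are hypothetical, and the output path assumes the converted file keeps the source file's base name.

```python
import subprocess
from pathlib import Path


def build_convert_command(source: str, target_format: str, out_dir: str = ".") -> list[str]:
    """Build the LibreOffice command line for one conversion (nothing is run here)."""
    return [
        "soffice",
        "--headless",                    # run without opening a window
        "--convert-to", target_format,   # target extension, e.g. "odt" or "pdf"
        "--outdir", out_dir,             # directory for the converted file
        source,
    ]


def expected_output_path(source: str, target_format: str, out_dir: str = ".") -> Path:
    """Path where the converted file is expected: same base name, new extension."""
    return Path(out_dir) / (Path(source).stem + "." + target_format)


def convert(source: str, target_format: str, out_dir: str = ".") -> Path:
    """Run the conversion; raises CalledProcessError if soffice exits with an error."""
    subprocess.run(build_convert_command(source, target_format, out_dir), check=True)
    return expected_output_path(source, target_format, out_dir)
```

For example, `convert("report.doc", "odt", "out")` would ask LibreOffice to write `out/report.odt`. Which source and target formats are actually supported depends on the installed LibreOffice version and its filters.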
Paper document conversion
Converting scanned paper documents into useful electronic formats is one of the most important applications of document conversion. Documents scanned to image formats have several limitations, such as large file sizes and no support for text search or content reuse. They are therefore usually converted to more useful formats:
- Searchable: PDF
- Archive: PDF/A, for long-term storage
- Compressed: MRC-PDF
- Editable: TXT, RTF, DOC, XLS, PPT
- Structured: XML, HTML
Extracting content from a document image is the task of optical character recognition (OCR) or intelligent character recognition (ICR) technologies. Modern OCR applications convert image files to a range of document formats, preserving not just the content but also the structure of the document, as in Adaptive Document Recognition Technology (ADRT).
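To illustrate what preserving structure means in practice, the sketch below takes the kind of block-level result a recognition step might produce and writes it out as structured XML. The block layout (`kind`, `text`, `rows`), the element names, and the sample text are all hypothetical. Real OCR engines expose their own result formats.

```python
import xml.etree.ElementTree as ET

# Hypothetical recognition result: each block keeps its role on the page,
# not just its characters. The sample content is invented for illustration.
SAMPLE_BLOCKS = [
    {"kind": "heading", "text": "Annual Report"},
    {"kind": "paragraph", "text": "Figures are listed below."},
    {"kind": "table", "rows": [["Item", "Count"], ["Pages", "12"]]},
]


def blocks_to_xml(blocks: list[dict]) -> str:
    """Serialize recognized blocks to XML, one element per block."""
    root = ET.Element("document")
    for block in blocks:
        if block["kind"] == "table":
            # Tables keep their row and cell structure.
            table = ET.SubElement(root, "table")
            for row in block["rows"]:
                row_el = ET.SubElement(table, "row")
                for cell in row:
                    ET.SubElement(row_el, "cell").text = cell
        else:
            # Headings and paragraphs become elements named after their kind.
            ET.SubElement(root, block["kind"]).text = block["text"]
    return ET.tostring(root, encoding="unicode")
```

A plain-text export of the same page would keep the words but lose the distinction between the heading, the paragraph and the table cells, which is the information that structured targets such as XML or HTML are meant to retain.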
Paper document conversion applications
| Company | Product | Import formats | Export formats |
| --- | --- | --- | --- |
| Expervision | TypeReader 2008 | BMP, PCX, DCX, JPEG, PNG, TIFF, PDF | DOC, XLS, DOCX, XLSX, RTF, TXT, HTML, DBF, CSV, PDF, ASCII (comma delimited or tab delimited), WordPerfect, TypeReader Native Format, TypeReader Text Only |
| ABBYY | FineReader 9.0 | BMP, PCX, DCX, JPEG, JPEG 2000, PNG, TIFF, PDF, GIF, XPS, DjVu | DOC, XLS, DOCX, XLSX, PPT, RTF, TXT, HTML, DBF, CSV, PDF/A, PDF, MRC-PDF, LIT, WordML |
| Coextant Systems | Hyper.Net Version 6 | TXT, TIFF, JPEG, BMP, PCX, GIF, PDF | PDF, PDF/A, Flash, HTML |
| I.R.I.S. Group | Readiris 12 | JPEG, BMP, TIFF, PDF, DjVu, JPEG 2000 | DOC, DOCX, XLS, XLSX, PDF, ODT, XPS, PDF/A, HTML, RTF, WPD |
| Nuance Communications | OmniPage Professional 17 | TXT, TIFF, JPEG, BMP, PCX, GIF, PDF, MAX | DOC, DOCX, XML, XLS, XLSX, PPTX, PDF, RTF, HTML, XSN, XPS, WordML |

Special PDF conversion applications:

| Company | Product | Converts to PDF from | Converts PDF to |
| --- | --- | --- | --- |
| ABBYY | PDF Transformer 3.0 | DOC, XLS, DOCX, XLSX, PPT, RTF, PPTX, VSD, VSDX and any application via printing function | DOC, XLS, DOCX, XLSX, PPT, RTF, TXT, HTML, DBF, searchable PDF/A, searchable PDF |
| Ascertia | PDF Sign&Seal | DOC, DOCX, XLS, XLSX, PPT, RTF, and any file using File > Print | JPG files |
| Coextant Systems | Hyper.Net Version 6 | DOC, XLS, DOCX, PPT, RTF, PPTX, VSD, VSDX and any application via printing function | searchable PDF/A, searchable PDF, Flash, MP3, Combined PDF |
| Nitro PDF Software | Nitro PDF Professional | DOC, XLS, DOCX, XLSX, PPT, RTF and others | DOC, DOCX, RTF, image files |
| Software Depot Online | Docsmartz PDF Converter Professional | - | DOC, RTF, image files, XLSX, Postscript, Text |
| Software Depot Online | Docsmartz PDF Creator | DOC, XLS, DOCX, XLSX, PPT, RTF, PPTX, VSD, VSDX and any application via printing function or right click | - |
| Nuance Communications | PDF Converter 6 | - | DOC, DOCX, XML, XLS, XLSX, PPTX, WDP, XPS, PDF, MRC-PDF |

Consumer format conversion applications:

| Company | Product | Input formats | Output formats |
| --- | --- | --- | --- |
| Coextant Systems | Hyper.Net Version 6 | DOC, XLM, DOCX, PPT, RTF, PPTX, VSD, VSDX, DWG, XLS, XLSX, OpenOffice.org formats | Hypertext, HTML, Flash, MP3, PDF, PDF/A, Combined PDF, XLM |
Wikimedia Foundation. 2010.