DataparkSearch

Developer(s) Maxim Zakharov

Initial release 27 November 2003

Stable release 4.53 / January 24, 2010

Written in C

Operating system FreeBSD, Linux, Solaris

Type Open-source search engine

License GNU General Public License

Website http://www.dataparksearch.org/

DataparkSearch is a search engine designed to organize search within a website, group of websites, intranet or local system.

DataparkSearch is written in C and is free software, distributed under the terms of the GNU General Public License.

In 2005, DataparkSearch participated in the Text Retrieval Conference (TREC) run by the US National Institute of Standards and Technology. Its submission is available in PDF, and results were published for two runs, dpsearch1 and dpsearch2.

Key features

Support for http, https, ftp, nntp and news URL schemes.

htdb virtual URL scheme for indexing SQL databases.

Indexes text/html, text/xml, text/plain, audio/mpeg (mp3) and image/gif mime types natively.

External parser support for other document types, including Microsoft Word, Excel, RTF, PowerPoint, Adobe Acrobat PDF and Flash.

Can index multilingual sites using content negotiation.

Can search all word forms using ispell affixes and dictionaries.

Synonym, acronym and abbreviation query expansion based on editable dictionaries, specified by language and charset.

Stop-word, synonym and acronym lists.

Options to query with all words, all words near each other, any words, or Boolean queries. A subset of VQL (Verity Query Language) is supported.

Popularity Rank based on a neural network model.

Results can be sorted by relevancy (using vector calculation), by popularity rank, either "Goo" (adding weight for incoming links) or "Neo" (neural network model), by last modified time, or by "importance" (a combination of relevancy and popularity rank).

Supports a wide range of character sets, with automated character set and language detection.

Offers an accent-insensitive search option.

Provides phrase segmenting (tokenizing) for Chinese, Japanese, Korean and Thai.

Includes an indexer and a web CGI front-end, as well as a search module for the Apache web server (mod_dpsearch).

Handles Internationalized Domain Names (IDN).

A summary extraction algorithm automatically summarizes each document in several sentences.

Uses If-Modified-Since for efficient transfer of only changed files.

Can rewrite URLs that contain session IDs and other unusual formats, including some JavaScript link decoding.

Can perform parallel and multi-threaded indexing for faster updating.

Flexible update scheduling, including options for checking some sections of a site more frequently.

Handles basic authentication (user name and password) and cookies.

Stores a compressed text version of the documents for extracting and viewing.

Can specify a default character set and language for a server or subdirectory, or a list of possible languages.

Noindex tags: <NOINDEX>, comment-based markers and Google's special comments are treated as tags that include or exclude content.

Can specify a content body tag.

Spellchecking for query words with aspell.

Flexible options and commands to customize search result pages.

Effective caching significantly reduces search times.

Query logging stores the query, query parameters and the number of results found.
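
Several of the features listed above (accent-insensitive search, session ID handling in URLs, Internationalized Domain Names and If-Modified-Since requests) rest on general techniques that can be illustrated briefly. The following Python sketch is not taken from DataparkSearch, which is written in C; the function names and the list of session parameter names are assumptions made for illustration only.

```python
import unicodedata
from email.utils import formatdate
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
from urllib.request import Request

# Hypothetical session parameter names, chosen for this example only.
SESSION_PARAMS = {"phpsessid", "sid", "jsessionid", "sessionid"}


def fold_accents(text):
    """Case-fold text and drop combining marks so 'Café' matches 'cafe'."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


def strip_session_ids(url):
    """Remove query parameters whose names look like session identifiers."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def idn_to_ascii(hostname):
    """Convert an internationalized host name to its ASCII (Punycode) form."""
    return hostname.encode("idna").decode("ascii")


def conditional_request(url, last_fetch_epoch):
    """Build a request that asks the server to send the page only if it changed."""
    request = Request(url)
    request.add_header(
        "If-Modified-Since", formatdate(last_fetch_epoch, usegmt=True)
    )
    return request
```

A server that honours the If-Modified-Since header answers an unchanged page with status 304 and no body, which is what lets an indexer skip re-downloading it. Folding accents on both the indexed text and the query is one common way to make the two comparable.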

External links

Official page of the project

Home at Google Code

FreeBSD's port

Search Tools Product Report: DataparkSearch Engine

Newslookup.com -- A news service using DataparkSearch Engine.
