- Bijankhan Corpus
The Bijankhan corpus is a tagged corpus suitable for natural language processing research on the Persian language. The collection is gathered from daily news and common texts. All documents in it are categorized by subject, such as political or cultural, into about 4300 subject categories. The Bijankhan collection contains about 2.6 million manually tagged words, with a tag set of 550 Persian part-of-speech tags.

The Bijankhan corpus was created by the Data Base Research Group at the University of Tehran. The corpus is non-free in that it is not free for commercial use.

See also
* Hamshahri Corpus
* Persian Today Corpus

External links
* [http://ece.ut.ac.ir/dbrg/Bijankhan Bijankhan corpus]
Wikimedia Foundation. 2010.