Automatic image annotation

Automatic image annotation (also known as automatic image tagging) is the process by which a computer system automatically assigns metadata in the form of captioning or keywords to a digital image. This application of computer vision techniques is used in image retrieval systems to organize and locate images of interest from a database.

This method can be regarded as a type of multi-class image classification with a very large number of classes, as large as the vocabulary size. Typically, image analysis in the form of extracted feature vectors and the training annotation words are used by machine learning techniques to attempt to automatically apply annotations to new images. The first methods learned the correlations between image features and training annotations. Later techniques used machine translation to try to translate between the textual vocabulary and the 'visual vocabulary', that is, clustered regions known as "blobs". Work following these efforts has included classification approaches, relevance models and so on.
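
As an illustration of the correlation-learning idea, the following is a minimal Python sketch and not the method of any particular paper listed in this article. It assumes each image has already been segmented into regions with feature vectors and that the blob centroids are given (in practice they would come from clustering). The function names `nearest_blob`, `train_cooccurrence` and `annotate` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def nearest_blob(feature, centroids):
    """Return the index of the centroid ("blob") closest to a feature vector."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(feature, centroids[i]))


def train_cooccurrence(training_set, centroids):
    """Count how often each annotation word co-occurs with each blob.

    training_set is a list of (region_features, words) pairs, one per image.
    """
    counts = defaultdict(Counter)
    for region_features, words in training_set:
        for feature in region_features:
            blob = nearest_blob(feature, centroids)
            for word in words:
                counts[blob][word] += 1
    return counts


def annotate(region_features, centroids, counts, top_k=2):
    """Rank words by summing the estimate of P(word | blob) over all regions."""
    scores = Counter()
    for feature in region_features:
        blob = nearest_blob(feature, centroids)
        total = sum(counts[blob].values())
        if total == 0:
            continue  # blob never seen in training
        for word, n in counts[blob].items():
            scores[word] += n / total
    return [word for word, _ in scores.most_common(top_k)]
```

With two assumed centroids (one bluish, one greenish) and a handful of training images tagged "sky" and "grass", a new image whose only region is bluish would be ranked towards "sky". The vocabulary here is tiny; the point of the classification view is that the same counting scheme scales, at least in principle, to as many words as the training annotations contain.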

The advantage of automatic image annotation over content-based image retrieval (CBIR) is that queries can be specified more naturally by the user [http://research.nii.ac.jp/~m-inoue/paper/inoue04irix.pdf]. CBIR generally (at present) requires users to search by image concepts such as color and texture, or by finding example queries. Certain image features in example images may override the concept that the user is really focusing on. Traditional methods of image retrieval, such as those used by libraries, have relied on manually annotated images, which are expensive and time-consuming to produce, especially given the large and constantly growing image databases in existence.
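
The difference between the two query styles can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not a description of any real retrieval system: the function names, the tag dictionary and the three-number colour features are all hypothetical.

```python
import math
from collections import defaultdict


def build_tag_index(annotations):
    """Build an inverted index mapping each tag to the set of image ids."""
    index = defaultdict(set)
    for image_id, tags in annotations.items():
        for tag in tags:
            index[tag.lower()].add(image_id)
    return index


def search_by_keywords(index, query):
    """Annotation-based retrieval: images tagged with every query word."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


def search_by_example(features, example, top_k=1):
    """CBIR-style retrieval: rank images by distance to an example vector."""
    ranked = sorted(features, key=lambda i: math.dist(features[i], example))
    return ranked[:top_k]
```

With annotations available, a user can simply type a query such as "beach sunset". Without them, the user has to supply an example feature vector (or an example image from which one is extracted), and the ranking depends on whichever low-level features dominate that example.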

Some annotation engines are online, including the [http://www.alipr.com ALIPR.com] real-time tagging engine developed by Penn State researchers, and [http://photo.beholdsearch.com/search.jsp Behold] - an image search engine that indexes over 1 million Flickr images using automatically generated tags.

Some major work

* Word co-occurrence model: Y Mori, H Takahashi, and R Oka (1999). "Image-to-word transformation based on dividing and vector quantizing images with words". Proceedings of the International Workshop on Multimedia Intelligent Storage and Retrieval Management.
* Annotation as machine translation: P Duygulu, K Barnard, N de Freitas, and D Forsyth (2002). [http://vision.cs.arizona.edu/kobus/research/publications/ECCV-02-1/ "Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary"]. Proceedings of the European Conference on Computer Vision, pp. 97-112.
* Statistical models: J Li and J Z Wang (2006). [http://www-db.stanford.edu/~wangz/project/imsearch/ALIP/ACMMM06/ "Real-time Computerized Annotation of Pictures"]. Proc. ACM Multimedia, pp. 911-920; J Z Wang and J Li (2002). [http://www-db.stanford.edu/~wangz/project/imsearch/ALIP/ACM02/ "Learning-Based Linguistic Indexing of Pictures with 2-D MHMMs"]. Proc. ACM Multimedia, pp. 436-445.
* Automatic linguistic indexing of pictures: J Li and J Z Wang (2008). [http://infolab.stanford.edu/~wangz/project/imsearch/ALIP/PAMI08/ "Real-time Computerized Annotation of Pictures"]. IEEE Trans. on Pattern Analysis and Machine Intelligence; J Li and J Z Wang (2003). [http://www-db.stanford.edu/~wangz/project/imsearch/ALIP/PAMI03/ "Automatic Linguistic Indexing of Pictures by a Statistical Modeling Approach"]. IEEE Trans. on Pattern Analysis and Machine Intelligence, pp. 1075-1088.
* Hierarchical Aspect Cluster Model: K Barnard, D A Forsyth (2001). [http://kobus.ca/research/publications/ICCV-01/ "Learning the Semantics of Words and Pictures"]. Proceedings of International Conference on Computer Vision, pp. 408-415.
* Latent Dirichlet Allocation model: D Blei, A Ng, and M Jordan (2003). [http://www.ics.uci.edu/~liang/seminars/win05/papers/blei03-latent-dirichlet.pdf "Latent Dirichlet allocation"]. Journal of Machine Learning Research, pp. 3:993-1022.
* Supervised Multiclass Labeling: G Carneiro, A B Chan, P Moreno, and N Vasconcelos (2006). [http://www.svcl.ucsd.edu/publications/journal/2007/pami/pami07-semantics.pdf "Supervised Learning of Semantic Classes for Image Annotation and Retrieval"]. IEEE Trans. on Pattern Analysis and Machine Intelligence, pp. 394-410.
* Texture similarity: R W Picard and T P Minka (1995). [http://citeseer.ist.psu.edu/picard95vision.html "Vision Texture for Annotation"]. Multimedia Systems.
* Support Vector Machines: C Cusano, G Ciocca, and R Schettini (2004). [http://adsabs.harvard.edu/cgi-bin/nph-bib_query?bibcode=2003SPIE.5304..330C&db_key=INST "Image Annotation Using SVM"]. Proceedings of Internet Imaging IV.
* Ensemble of Decision Trees and Random Subwindows: R Maree, P Geurts, J Piater, and L Wehenkel (2005). [http://www.montefiore.ulg.ac.be/~maree/#publications "Random Subwindows for Robust Image Classification"]. Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition, pp. 1:34-30.
* Maximum Entropy: J Jeon, R Manmatha (2004). [http://ciir.cs.umass.edu/pubfiles/mm-355.pdf "Using Maximum Entropy for Automatic Image Annotation"]. Int'l Conf on Image and Video Retrieval (CIVR 2004), pp. 24-32.
* Relevance models: J Jeon, V Lavrenko, and R Manmatha (2003). [http://ciir.cs.umass.edu/pubfiles/mm-41.pdf "Automatic image annotation and retrieval using cross-media relevance models"]. Proceedings of the ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 119-126.
* Relevance models using continuous probability density functions: V Lavrenko, R Manmatha, and J Jeon (2003). [http://ciir.cs.umass.edu/pubfiles/mm-46.pdf "A model for learning the semantics of pictures"]. Proceedings of the 16th Conference on Advances in Neural Information Processing Systems NIPS.
* Coherent Language Model: R Jin, J Y Chai, L Si (2004). [http://www.cse.msu.edu/~rongjin/publications/acmmm04.jin.pdf "Effective Automatic Image Annotation via A Coherent Language Model and Active Learning"]. Proceedings of MM'04.
* Inference networks: D Metzler and R Manmatha (2004). [http://ciir.cs.umass.edu/pubfiles/mm-346.pdf "An inference network approach to image retrieval"]. Proceedings of the International Conference on Image and Video Retrieval, pp. 42-50.
* Multiple Bernoulli distribution: S Feng, R Manmatha, and V Lavrenko (2004). [http://ciir.cs.umass.edu/pubfiles/mm-333.pdf "Multiple Bernoulli relevance models for image and video annotation"]. IEEE Conference on Computer Vision and Pattern Recognition, pp. 1002-1009.
* Multiple design alternatives: J Y Pan, H-J Yang, P Duygulu and C Faloutsos (2004). [http://www.informedia.cs.cmu.edu/documents/ICME04AutoICap.pdf "Automatic Image Captioning"]. Proceedings of the 2004 IEEE International Conference on Multimedia and Expo (ICME'04).
* Natural scene annotation: J Fan, Y Gao, H Luo and G Xu (2004). [http://portal.acm.org/ft_gateway.cfm?id=1009055&type=pdf&coll=GUIDE&dl=GUIDE&CFID=1581830&CFTOKEN=99651762 "Automatic Image Annotation by Using Concept-Sensitive Salient Objects for Image Content Representation"]. Proceedings of the 27th annual international conference on Research and development in information retrieval, pp. 361-368.
* Relevant low-level global filters: A Oliva and A Torralba (2001). [http://cvcl.mit.edu/Papers/IJCV01-Oliva-Torralba.pdf "Modeling the shape of the scene: a holistic representation of the spatial envelope"]. International Journal of Computer Vision, pp. 42:145-175.
* Global image features and nonparametric density estimation: A Yavlinsky, E Schofield and S Rüger (2005). [http://km.doc.ic.ac.uk/www-pub/civr05-annotation.pdf "Automated Image Annotation Using Global Features and Robust Nonparametric Density Estimation"]. Int'l Conf on Image and Video Retrieval (CIVR, Singapore, Jul 2005).
* Video semantics: N Vasconcelos and A Lippman (2001). [http://www.svcl.ucsd.edu/publications/journal/2000/ip/ip00.pdf "Statistical Models of Video Structure for Content Analysis and Characterization"]. IEEE Transactions on Image Processing, pp. 1-17.
* Image Annotation Refinement: Yohan Jin, Latifur Khan, Lei Wang, and Mamoun Awad (2005). [http://portal.acm.org/citation.cfm?id=1101305&dl=GUIDE "Image annotations by combining multiple evidence & wordNet"]. 13th Annual ACM International Conference on Multimedia (MM 05), pp. 706-715; Changhu Wang, Feng Jing, Lei Zhang, and Hong-Jiang Zhang (2006). [http://portal.acm.org/citation.cfm?id=1180639.1180774# "Image annotation refinement using random walk with restarts"]. 14th Annual ACM International Conference on Multimedia (MM 06); Changhu Wang, Feng Jing, Lei Zhang, and Hong-Jiang Zhang (2007). [http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4270246 "Content-based image annotation refinement"]. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 07).
* Automatic Image Annotation by Ensemble of Visual Descriptors: Emre Akbas and Fatos Y. Vural (2007). [http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4270482 "Automatic Image Annotation by Ensemble of Visual Descriptors"]. Intl. Conf. on Computer Vision (CVPR) 2007, Workshop on Semantic Learning Applications in Multimedia.

See also

*Pattern recognition
*Image retrieval
*Content-based image retrieval

External links

* [http://www.alipr.com/ ALIPR.com] - Real-time automatic tagging engine developed by Penn State researchers.
* [http://photo.beholdsearch.com/search.jsp Behold Image Search] - An image search engine that indexes over 1 million Flickr images using automatically generated tags.

Wikimedia Foundation. 2010.
