Visual descriptors
Visual descriptors describe the visual features of the contents of images or videos. They describe elementary characteristics such as shape, color, texture or motion, among others.
Introduction
As a result of new communication technologies and the massive use of the Internet in our society, the amount of audio-visual information available in digital format is increasing considerably. It has therefore become necessary to design systems that describe the content of several types of multimedia information so that it can be searched and classified.
Audio-visual descriptors are in charge of this content description. They capture information about the objects and events found in a video, image or audio signal, and they allow quick and efficient searches of audio-visual content.
Such a system can be compared to the search engines for textual content. Although it is relatively easy to find text with a computer, it is much more difficult to find specific audio and video fragments. For instance, imagine somebody searching for a scene of a happy person. Happiness is a feeling, and its description in terms of the shape, color and texture of images is not evident.
Describing audio-visual content is not a trivial task, and it is essential for the effective use of this type of archive. The standard that deals with audio-visual descriptors is MPEG-7, developed by the Moving Picture Experts Group (MPEG).
Types of visual descriptors
Descriptors are the first step towards establishing the connection between the pixels contained in a digital image and what humans recall some minutes after having observed an image or a group of images.
Visual descriptors are divided into two main groups:
# General information descriptors: low-level descriptors that describe color, shape, regions, textures and motion.
# Specific domain information descriptors: they give information about objects and events in the scene. A concrete example is face recognition.
General information descriptors
General information descriptors consist of a set of descriptors that cover basic and elementary features such as color, texture, shape, motion and location, among others. This description is generated automatically by means of signal processing.
* COLOR: the most basic quality of visual content. Five tools are defined to describe color. The first three represent the color distribution, the fourth represents the spatial layout of color, and the last one describes the color relation between sequences or groups of images:
**"Dominant Color Descriptor (DCD)"
**"Scalable Color Descriptor (SCD)"
**"Color Structure Descriptor (CSD)"
**"Color Layout Descriptor (CLD)"
**"Group of Frames (GoF)" or "Group of Pictures (GoP)"
* TEXTURE: also an important quality for describing an image. The texture descriptors characterize image textures or regions. They capture the homogeneity of a region and the histograms of the region's edges. The set of descriptors is formed by:
**"Homogeneous Texture Descriptor (HTD)"
**"Texture Browsing Descriptor (TBD)"
**"Edge Histogram Descriptor (EHD)"
* SHAPE: contains important semantic information, due to the human ability to recognize objects by their shape. However, this information can only be extracted by means of a segmentation similar to the one that the human visual system performs. Such a segmentation system is not yet available, but there is a series of algorithms that are considered good approximations. These descriptors describe regions, contours and shapes for 2D images and for 3D volumes. The shape descriptors are the following:
**"Region-based Shape Descriptor (RSD)"
**"Contour-based Shape Descriptor (CSD)"
**"3-D Shape Descriptor (3-D SD)"
* MOTION: defined by four different descriptors that describe motion in a video sequence. Motion is related both to the motion of objects in the sequence and to camera motion. Camera motion information is provided by the capture device, whereas the rest is obtained by means of image processing. The descriptor set is the following:
**"Motion Activity Descriptor (MAD)"
**"Camera Motion Descriptor (CMD)"
**"Motion Trajectory Descriptor (MTD)"
**"Warping and Parametric Motion Descriptor (WMD and PMD)"
* LOCATION: the location of elements in the image is used to describe them in the spatial domain. In addition, elements can also be located in the temporal domain:
**"Region Locator Descriptor (RLD)"
**"Spatio Temporal Locator Descriptor (STLD)"
Specific domain information descriptors
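To make the general information descriptors listed above more concrete, the following is a minimal sketch of a color descriptor computed directly from pixel values. It is a simplified stand-in, not the normative MPEG-7 extraction: the image is assumed to be a list of rows of (r, g, b) tuples, and the names `quantize`, `color_histogram` and `dominant_colors`, as well as the choice of four levels per channel, are illustrative assumptions.

```python
from collections import Counter


def quantize(pixel, levels=4):
    """Map an (r, g, b) pixel with 0-255 channels to a coarse color bin."""
    return tuple(channel * levels // 256 for channel in pixel)


def color_histogram(image, levels=4):
    """Return a normalized histogram {bin: fraction of pixels}.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    """
    counts = Counter(quantize(p, levels) for row in image for p in row)
    total = sum(counts.values())
    return {bin_: n / total for bin_, n in counts.items()}


def dominant_colors(image, k=2, levels=4):
    """Return the k most frequent color bins with their pixel fractions."""
    hist = color_histogram(image, levels)
    # Sort by decreasing fraction; ties are broken by bin value.
    return sorted(hist.items(), key=lambda item: (-item[1], item[0]))[:k]


# Toy 2x4 image: six red pixels and two blue ones.
RED, BLUE = (255, 0, 0), (0, 0, 255)
image = [[RED, RED, RED, BLUE],
         [RED, RED, RED, BLUE]]

print(dominant_colors(image))
```

On this toy image the most frequent bin is the red one, covering three quarters of the pixels. The histogram records only how much of each color is present, not where it appears in the image.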
Specific domain information descriptors, which give information about objects and events in the scene, are not easy to extract, even less so when the extraction has to be automatic. Nevertheless, they can be generated manually.
As mentioned before, face recognition is a concrete example of an application that tries to obtain this information automatically.
Descriptors applications
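Many descriptor applications, search engines in particular, come down to comparing the descriptor of a query with the descriptors stored for each item. The following is a minimal sketch of such a search, assuming every item is already described by a fixed-length numeric vector. The file names, the three-bin histograms, the function names `l1_distance` and `search`, and the choice of the L1 distance are all hypothetical and chosen only for illustration.

```python
def l1_distance(a, b):
    """Sum of absolute differences between two equal-length descriptors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def search(query, index, top_k=2):
    """Return the names of the top_k items closest to the query descriptor."""
    ranked = sorted(
        index.items(),
        # Closest first; ties are broken alphabetically by name.
        key=lambda item: (l1_distance(query, item[1]), item[0]),
    )
    return [name for name, _ in ranked[:top_k]]


# Hypothetical index: file name -> share of red, green and blue pixels.
index = {
    "sunset.jpg": [0.75, 0.125, 0.125],
    "forest.jpg": [0.125, 0.75, 0.125],
    "ocean.jpg": [0.125, 0.125, 0.75],
    "poppies.jpg": [0.625, 0.25, 0.125],
}

query = [0.5, 0.25, 0.25]  # descriptor of a mostly red query image
print(search(query, index))
```

Here `poppies.jpg` is ranked first and `sunset.jpg` second, because their descriptors are the closest to the query. Classification and filtering can be sketched in the same way, by comparing each descriptor against labeled examples instead of a single query.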
Among all applications, the most important ones are:
* Multimedia document search engines and classifiers.
* Digital libraries: visual descriptors allow a very detailed and specific search of any video or image by means of different search parameters, for instance the search for films in which a known actor appears, or for videos containing Mount Everest.
* Personalized electronic news services.
* Automatic connection to a TV channel broadcasting a soccer match, for example, whenever a player approaches the goal area.
* Control and filtering of specific audio-visual content, such as violent or pornographic material, as well as authorization for some multimedia content.
See also
* MPEG-7
* DSpace
* Feature detection
References
B.S. Manjunath (Editor), Philippe Salembier (Editor), and Thomas Sikora (Editor): "Introduction to MPEG-7: Multimedia Content Description Interface". Wiley & Sons, April 2002 - ISBN 0-471-48678-7
External links
* Multimedia Content Analysis Using both Audio and Video Clues [http://vision.poly.edu:8080/~jhuang/Publication/Content_Analysis_Wang2000SP.pdf]
* Relating Visual and Semantic Image Descriptors [http://www.acemedia.org/aceMedia/files/document/wp7/2004/ewimt04-dcuThom.pdf]
* Fusing MPEG-7 visual descriptors for image classification [http://www.acemedia.org/aceMedia/files/document/wp7/2005/icann05-iti.pdf]
* MPEG-7 Quick Reference [http://gondolin.rutgers.edu/MIC/text/how/mpeg7ref.pdf]
Wikimedia Foundation. 2010.