Motion estimation

Motion estimation is the process of determining motion vectors that describe the transformation from one 2D image to another, usually between adjacent frames in a video sequence. It is an ill-posed problem because the motion occurs in three dimensions, whereas the images are projections of the 3D scene onto a 2D plane. The motion vectors may relate to the whole image (global motion estimation) or to specific parts, such as rectangular blocks, arbitrarily shaped patches or even individual pixels. The vectors may be represented by a translational model or by other models that can approximate the motion of a real video camera, such as rotation and translation in all three dimensions and zoom.

Closely related to motion estimation is optical flow, where the vectors correspond to the perceived movement of pixels. In motion estimation, by contrast, an exact 1:1 correspondence of pixel positions is not required.

Applying the motion vectors to an image to synthesize the transformation to the next image is called motion compensation. The combination of motion estimation and motion compensation is a key part of video compression as used by MPEG-1, MPEG-2 and MPEG-4 as well as many other video codecs.
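As a rough illustration of motion compensation, the following Python sketch builds a predicted frame by copying each block from a displaced position in a reference frame. The function names, the list-of-lists frame layout, the per-block (dy, dx) vector convention and the border clamping are all assumptions made for this example; real codecs differ in many details.

```python
def motion_compensate(reference, vectors, block=2):
    """Predict the current frame from `reference` (hypothetical helper).

    Assumed convention: vectors[by][bx] is the (dy, dx) displacement of
    block (by, bx), so the predicted pixel at (y, x) is copied from
    (y + dy, x + dx) in the reference. Source coordinates are clamped
    to the frame border.
    """
    height, width = len(reference), len(reference[0])
    predicted = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dy, dx = vectors[y // block][x // block]
            sy = min(max(y + dy, 0), height - 1)
            sx = min(max(x + dx, 0), width - 1)
            predicted[y][x] = reference[sy][sx]
    return predicted


def residual(current, predicted):
    """Per-pixel difference between the actual frame and the prediction."""
    return [[c - p for c, p in zip(crow, prow)]
            for crow, prow in zip(current, predicted)]
```

In a compression setting, an encoder would typically transmit the vectors together with the residual, which tends to be cheap to code when the prediction is good.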



Methods for finding motion vectors can be categorised into pixel-based ("direct") methods and feature-based ("indirect") methods. A well-known debate between proponents of the two approaches produced two papers, one from each side, that attempted to settle the question.[1][2]

Direct Methods

Evaluation Metrics

Direct methods compare pixel values directly, so several evaluation metrics can be used to score how well a candidate motion vector matches the image data.
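The text does not name specific metrics. As an assumption for illustration, the sketch below uses two that are commonly seen in block matching, the sum of absolute differences (SAD) and the mean squared error (MSE), inside a small exhaustive search. All function names, the block size and the search range are hypothetical choices for this example.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def mse(block_a, block_b):
    """Mean squared error between two equally sized blocks."""
    count = len(block_a) * len(block_a[0])
    total = sum((a - b) ** 2
                for row_a, row_b in zip(block_a, block_b)
                for a, b in zip(row_a, row_b))
    return total / count


def extract(frame, top, left, size):
    """Return the size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def best_vector(current, reference, top, left, size=2, search=1, metric=sad):
    """Exhaustively test every (dy, dx) within +/- `search` pixels.

    Returns ((dy, dx), cost) for the candidate with the lowest cost.
    Candidates that would fall outside the reference frame are skipped.
    """
    height, width = len(reference), len(reference[0])
    target = extract(current, top, left, size)
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = top + dy, left + dx
            if ry < 0 or rx < 0 or ry + size > height or rx + size > width:
                continue
            cost = metric(target, extract(reference, ry, rx, size))
            if best is None or cost < best[0]:
                best = (cost, (dy, dx))
    return best[1], best[0]
```

Swapping `metric=mse` for the default changes only how candidates are scored, not how the search proceeds. SAD avoids multiplications, while MSE penalises large differences more heavily.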

Indirect Methods

Indirect methods use features, such as Harris corners, and match corresponding features between frames, usually with a statistical function applied over a local or global area. The purpose of the statistical function is to remove matches that do not correspond to the actual motion.

Statistical functions that have been used successfully include RANSAC (random sample consensus).
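To illustrate how such a statistical function can reject bad matches, here is a minimal RANSAC-style sketch in Python. It assumes, purely for the example, that the global motion is a 2D translation and that feature matching has already produced a list of point pairs. The function name, threshold, iteration count and fixed seed are hypothetical.

```python
import random


def ransac_translation(matches, threshold=1.0, iterations=100, seed=0):
    """Estimate a global (dx, dy) translation from noisy feature matches.

    `matches` is a non-empty list of ((x0, y0), (x1, y1)) pairs. Each
    iteration hypothesises the translation implied by one random match and
    counts the matches that agree with it to within `threshold` pixels on
    each axis. The largest consensus set wins, and the final estimate is
    the mean displacement over that set.
    """
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        (x0, y0), (x1, y1) = rng.choice(matches)
        dx, dy = x1 - x0, y1 - y0
        inliers = [
            m for m in matches
            if abs((m[1][0] - m[0][0]) - dx) <= threshold
            and abs((m[1][1] - m[0][1]) - dy) <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    count = len(best_inliers)
    mean_dx = sum(m[1][0] - m[0][0] for m in best_inliers) / count
    mean_dy = sum(m[1][1] - m[0][1] for m in best_inliers) / count
    return (mean_dx, mean_dy), best_inliers
```

A single match is enough to hypothesise a translation. Richer motion models, such as affine or projective transforms, would need more matches per sample and a model-fitting step in place of the simple subtraction.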


  1. Philip H. S. Torr and Andrew Zisserman: Feature Based Methods for Structure and Motion Estimation, ICCV Workshop on Vision Algorithms, pages 278-294, 1999.
  2. Michal Irani and P. Anandan: About Direct Methods, ICCV Workshop on Vision Algorithms, pages 267-277, 1999.

Wikimedia Foundation. 2010.
