Eight-point algorithm

Eight-point algorithm

The eight-point algorithm is an algorithm used in computer vision to estimate the essential matrix or the fundamental matrix related to a stereo camera pair from a set of corresponding image points. It was introduced by Christopher Longuet-Higgins in 1981 for the case of the essential matrix. In theory, this algorithm can be used also for the fundamental matrix, but in practice the normalized eight-point algorithm, described by Richard Hartley in 1997, is better suited for this case.

The algorithm's name derives from the fact that it estimates the essential matrix or the fundamental matrix from a set of eight (or more) corresponding image points. However, variations of the algorithm can be used for fewer than eight points.

The basic algorithm

The basic eight-point algorithm is here described for the case of estimating the essential matrix mathbf{E} . It consists of three steps. First, it formulates a homogeneous linear equation, where the solution is directly related to mathbf{E} , and then solves the equation, taking into account that it may not have an exact solution. Finally, the internal constraints of the resulting matrix are managed. The first step is described in Longuet-Higgins' paper, the second and third steps are standard approaches in estimation theory.

The constraint defined by the essential matrix mathbf{E} is

: (mathbf{y}')^{T} , mathbf{E} , mathbf{y} = 0

for corresponding image points represented in normalized image coordinates mathbf{y}, mathbf{y}' . The problem which the algorithm solves is to determine mathbf{E} for a set of matching image points. In practice, the image coordinates of the image points are affected by noise and the solution may also be over-determined which means that it may not be possible to find mathbf{E} which satisfies the above constraint exactly for all points. This issue is addressed in the second step of the algorithm.

Step 1: Formulating a homogeneous linear equation

With

: mathbf{y} = egin{pmatrix} y_{1} \ y_{2} \ 1 end{pmatrix} and mathbf{y}' = egin{pmatrix} y'_{1} \ y'_{2} \ 1 end{pmatrix} and mathbf{E} = egin{pmatrix} e_{11} & e_{12} & e_{13} \ e_{21} & e_{22} & e_{23} \ e_{31} & e_{32} & e_{33} end{pmatrix}

the constraint can also be rewritten as

: y'_1 y_1 e_{11} + y'_1 y_2 e_{12} + y'_1 e_{13} + y'_2 y_1 e_{21} + y'_2 y_2 e_{22} + y'_2 e_{23} + y_1 e_{31} + y_2 e_{32} + e_{33} = 0 ,

or

: mathbf{e} cdot ildemathbf{y} = 0

where

: ildemathbf{y} = egin{pmatrix} y'_1 y_1 \ y'_1 y_2 \ y'_1 \ y'_2 y_1 \ y'_2 y_2 \ y'_2 \ y_1 \ y_2 \ 1 end{pmatrix} and mathbf{e} = egin{pmatrix} e_{11} \ e_{12} \ e_{13} \ e_{21} \ e_{22} \ e_{23} \ e_{31} \ e_{32} \ e_{33} end{pmatrix}

that is, mathbf{e} represents the essential matrix in the form of a 9-dimensional vector and this vector must be orthogonal to the vector ildemathbf{y} which can be seen as a vector representation of the 3 imes 3 matrix mathbf{y}' , mathbf{y}^{T} .

Each pair of corresponding image points produces a vector ildemathbf{y} . Given a set of 3D points mathbf{P}_k this corresponds to a set of vectors ildemathbf{y}_{k} and all of them must satisfy

: mathbf{e} cdot ildemathbf{y}_{k} = 0

for the vector mathbf{e} . Given sufficiently many (at least eight) and linearly independent vectors ildemathbf{y}_{k} it is then possible to determine mathbf{e} in a straight-forward way. Collect all vectors ildemathbf{y}_{k} as the columns of a matrix mathbf{Y} and it must then be the case that

: mathbf{e}^{T} , mathbf{Y} = mathbf{0}

This means that mathbf{e} is the solution to a homogeneous linear equation.

Step 2: Solving the equation

A standard approach to solving this equation implies that mathbf{e} is a left singular vector of mathbf{Y} corresponding to a singular value that equals zero. Provided that at eight vectors ildemathbf{y}_{k} , which in addition are linearly independent, are used to construct mathbf{Y} it follows that this singular vector is unique (disregarding scalar multiplication) and, consequently, mathbf{e} and then mathbf{E} can be determined.

In the case that more than eight corresponding points are used to construct mathbf{Y} it is possible that it does not have any singular value equal to zero. This case occurs in practice when the image coordinates are affected by various types of noise. A common approach to deal with this situation is to describe it as a total least squares problem; find mathbf{e} which minimizes

: | mathbf{e}^{T} , mathbf{Y} |

when | mathbf{e} | = 1 . The solution is to choose mathbf{e} as the left singular vector corresponding to the "smallest" singular value of mathbf{Y} . A reordering of this mathbf{e} back into a 3 imes 3 matrix gives the result of this step, here referred to as mathbf{E}_{ m est} .

Step 3: Enforcing the internal constraint

Another consequence of dealing with noisy image coordinates is that the resulting matrix may not satisfy the internal constraint of the essential matrix, that is, it has two equal and non-zero singular values and one singular values which is zero. Depending on the application, smaller or larger deviations from the internal constraint may or may not be a problem. If it is critical that the estimated matrix satisfies the internal constraints, this can be accomplished by finding the matrix mathbf{E}' of rank 2 which minimizes

: | mathbf{E}' - mathbf{E}_{ m est} |

where mathbf{E}_{ m est} is the resulting matrix from Step 2 and the Frobenius matrix norm is used. The solution to the problem is given by first computing a singular value decomposition of mathbf{E}_{ m est} :

: mathbf{E}_{ m est} = mathbf{U} , mathbf{S} , mathbf{V}^{T}

where mathbf{U}, mathbf{V} are orthogonal matrices and mathbf{S} is a diagonal matrix which contains the singular values of mathbf{E}_{ m est} . In the ideal case, one of the diagonal elements of mathbf{S} should be zero, or at least small compared to the other two which should be equal. In any case, set

: mathbf{S}' = egin{pmatrix} 1 & 0 & 0 \ 0 & 1 & 0 \ 0 & 0 & 0 end{pmatrix}

Finally, mathbf{E}' is given by

: mathbf{E}' = mathbf{U} , mathbf{S}' , mathbf{V}^{T}

The matrix mathbf{E}' is the resulting estimate of the essential matrix provided by the algorithm.

The normalized eight-point algorithm

The basic eight-point algorithm can in principle be used also for estimating the fundamental matrix mathbf{F} . The defining constraint for mathbf{F} is

: (mathbf{y}')^{T} , mathbf{F} , mathbf{y} = 0

where mathbf{y}, mathbf{y}' are the homogeneous representations of corresponding image coordinates (not necessary normalized). This means that it is possible to form a matrix mathbf{Y} in a similar way as for the essential matrix and solve the equation

: mathbf{f}^{T} , mathbf{Y} = mathbf{0}

for mathbf{f} which is a reshaped version of mathbf{F} . By followíng the procedure outlines above, it is then possible to determine mathbf{F} from a set of eight matching points. In practice, however, the resulting fundamental matrix may not be useful for determining epipolar constraints.

The problem

The problem is that the resulting mathbf{Y} often is ill-conditioned. In theory, mathbf{Y} should have one singular value equal to zero and the rest are non-zero. In practice, however, some of the non-zero singular values can become small relative to the larger ones. If more than eight corresponding points are used to construct mathbf{Y} , where the coordinates are only approximately correct, there may not be a well-defined singular value which can be identified as approximately zero. Consequently, the solution of the homogeneous linear system of equations may not be sufficiently accurate to be useful.

What's causing the problem

Hartley addressed this estimation problem in his 1997 article. His analysis of the problem shows that the problem is caused by the homogeneous image coordinates, vectors in mathbb{R}^{3} , can have a poor distribution in that space. A typical homogeneous representation of the 2D image coordinate (y_{1}, y_{2}) , is

: mathbf{y} = egin{pmatrix} y_{1} \ y_{2} \ 1 end{pmatrix}

where both y_{1}, y_{2} , lie in the range 0 to 1000-2000 for a modern digital camera. This means that the first two coordinates in mathbf{y} vary over a much larger range than the third coordinate. Furthermore, if the image points which are used to construct mathbf{Y} lie in a relatively small region of the image, for example at (700,700) pm (100,100) , , again the vector mathbf{y} points in more or less the same direction for all points. As a consequence, mathbf{Y} will have one large singular value and the remaining are small.

How it can be solved

As a solution to this problem, Hartley proposed that the coordinate system of each of the two images should be transformed, independently, into a new coordinate system according to the following principle.

* The origin of the new coordinate system should be centered (have its origin) at the centroid (center of gravity) of the image points. This is accomplished by a translation of the original origin to the new one.
* After the translation the coordinates are uniformly scaled so that the mean distance from the origin to a point equals sqrt{2} .

This principle results, normally, in a distinct coordinate transformation for each of the two images. As a result, new homogeneous image coordinates mathbf{ar y}, mathbf{ar y}' are given by

: mathbf{ar y} = mathbf{T} , mathbf{y} : mathbf{ar y}' = mathbf{T}' , mathbf{y}'

where mathbf{T}, mathbf{T}' are the transformations (translation and scaling) from the old to the new "normalized image coordinates". This normalization is only dependent on the image points which are used in a single image and is, in general, distinct from normalized image coordinates produced by a normalized camera.

The epipolar constraint based on the fundamental matrix can now be rewritten as

: 0 = (mathbf{ar y}')^{T} , ((mathbf{T}')^{T})^{-1} , mathbf{F} , mathbf{T}^{-1}, mathbf{ar y} = (mathbf{ar y}')^{T} , mathbf{ar F} , mathbf{ar y}

where mathbf{ar F} = ((mathbf{T}')^{T})^{-1} , mathbf{F} , mathbf{T}^{-1} . This means that it is possible to use the normalized homogeneous image coordinates mathbf{ar y}, mathbf{ar y}' to estimate the transformed fundamental matrix mathbf{ar F} using the basic eight-point algorithm described above.

The purpose of the normalization transformations is that the matrix mathbf{ar Y} , constructed from the normalized image coordinates, in general has a better condition number than mathbf{Y} has. This means that the solution mathbf{ar f} is more well-defined as a solution of the homogeneous equation mathbf{ar Y} , mathbf{ar f} than mathbf{f} is relative to mathbf{Y} . Once mathbf{ar f} has been determined and reshaped into mathbf{ar F} the latter can be "de-normalized" to give mathbf{F} according to

: mathbf{F} = (mathbf{T}')^{T} , mathbf{ar F} , mathbf{T}

In general, this estimate of the fundamental matrix is a better one than would have been obtained by estimating from the un-normalized coordinates.

Using fewer than eight points

Each point pair contributes with one constraining equation on the element in mathbf{E} . Since mathbf{E} has five degrees of freedom it should therefore be sufficient with only five point pairs to determined mathbf{E} . Though possible from a theoretical point of view, the practical implementation of this is not straight-forward and have to rely on solving various non-linear equations.

References

* cite journal
author=Richard I. Hartley
title=In Defense of the Eight-Point Algorithm
year=1997
month=June
journal=IEEE Transaction on Pattern Recognition and Machine Intelligence
volume=19
issue=6
pages=580–593
doi=10.1109/34.601246

* cite book
author=Richard Hartley and Andrew Zisserman
title=Multiple View Geometry in computer vision
publisher=Cambridge University Press
year=2003
id=ISBN 978-0-521-54051-3

* cite journal
author=H. Christopher Longuet-Higgins
title=A computer algorithm for reconstructing a scene from two projections
journal=Nature
year=1981
month=Sep
volume=293
pages=133–135
doi=10.1038/293133a0


Wikimedia Foundation. 2010.

Игры ⚽ Нужен реферат?

Look at other dictionaries:

  • Eight queens puzzle — a b c d e f g h …   Wikipedia

  • Difference-map algorithm — Iterations 0, 100, 200, 300 and 400 in the difference map reconstruction of a grayscale image from its Fourier transform modulus The difference map algorithm is a search algorithm for general constraint satisfaction problems. It is a meta… …   Wikipedia

  • Difference map algorithm — The difference map algorithm is a search algorithm for general constraint satisfaction problems. It is a meta algorithm in the sense that it is built from more basic algorithms that perform projections onto constraint sets. From a mathematical… …   Wikipedia

  • Shor's algorithm — Shor s algorithm, first introduced by mathematician Peter Shor, is a quantum algorithm for integer factorization. On a quantum computer, to factor an integer N, Shor s algorithm takes polynomial time in log{N}, specifically O((log{N})^3),… …   Wikipedia

  • Decimal floating point — Floating point precisions IEEE 754: 16 bit: Half (binary16) 32 bit: Single (binary32), decimal32 64 bit: Double (binary64), decimal64 128 bit: Quadruple (binary128), decimal128 Other: Minifloat · Extended precision Arbitrary precision… …   Wikipedia

  • Meter Point Administration Number — A Meter Point Administration Number, also known as MPAN, Supply Number or S Number, is a 21 digit reference used in Great Britain to uniquely identify electricity supply points such as individual domestic residences. The gas equivalent is the… …   Wikipedia

  • Floating point — In computing, floating point describes a method of representing real numbers in a way that can support a wide range of values. Numbers are, in general, represented approximately to a fixed number of significant digits and scaled using an exponent …   Wikipedia

  • Essential matrix — In computer vision, the essential matrix is a 3 imes 3 matrix mathbf{E} , with some additional properties, which relates corresponding points in stereo images assuming that the cameras satisfy the pinhole camera model.FunctionMore specifically,… …   Wikipedia

  • Maximally stable extremal regions — Feature detection Output of a typical corner detection algorithm …   Wikipedia

  • Fundamental matrix (computer vision) — In computer vision, the fundamental matrix mathbf{F} is a 3 imes 3 matrix of rank 2 which relates corresponding points in stereo images. In epipolar geometry, with homogeneous image coordinates mathbf{y 1} and mathbf{y 2} of corresponding points… …   Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”