Generalized minimal residual method

Generalized minimal residual method

In mathematics, the generalized minimal residual method (usually abbreviated GMRES) is an iterative method for the numerical solution of a system of linear equations. The method approximates the solution by the vector in a Krylov subspace with minimal residual. The Arnoldi iteration is used to find this vector.

The GMRES method was developed by Yousef Saad and Martin H. Schultz in 1986.[1] GMRES is a special case of the DIIS method developed by Peter Pulay in 1980. The latter method is also applicable to non-linear systems.

Contents

The method

Denote the Euclidean norm of any vector v by ||v||. Denote the system of linear equations to be solved by

 Ax = b. \,

The matrix A is assumed to be invertible of size m-by-m. Furthermore, it is assumed that b is normalized, i.e., that ||b|| = 1.

The nth Krylov subspace for this problem is

 K_n = \operatorname{span} \, \{ r, Ar, A^2r, \ldots, A^{n-1}r \}. \,

GMRES approximates the exact solution of Ax = b by the vector xnKn that minimizes the Euclidean norm of the residual Axnb.

The vectors r, Ar, …, An−1r might be almost linearly dependent, so instead of this basis, the Arnoldi iteration is used to find orthonormal vectors

 q_1, q_2, \ldots, q_n \,

which form a basis for Kn. Hence, the vector xnKn can be written as xn = Qnyn with ynRn, where Qn is the m-by-n matrix formed by q1, …, qn.

The Arnoldi process also produces an (n+1)-by-n upper Hessenberg matrix \tilde{H}_n with

 AQ_n = Q_{n+1} \tilde{H}_n. \,

Because Qn is orthogonal, we have

 \| Ax_n - b \| = \| \tilde{H}_ny_n - \beta e_1 \|, \,

where

 e_1 = (1,0,0,\ldots,0) \,

is the first vector in the standard basis of Rn+1, and

 \beta = \|b-Ax_0\| \, ,

x0 being the first trial vector (usually zero). Hence, xn can be found by minimizing the Euclidean norm of the residual

 r_n = \tilde{H}_n y_n - \beta e_1.

This is a linear least squares problem of size n.

This yields the GMRES method. At every step of the iteration:

  1. do one step of the Arnoldi method;
  2. find the yn which minimizes ||rn||;
  3. compute xn = Qnyn;
  4. repeat if the residual is not yet small enough.

At every iteration, a matrix-vector product Aqn must be computed. This costs about 2m2 floating-point operations for general dense matrices of size m, but the cost can decrease to O(m) for sparse matrices. In addition to the matrix-vector product, O(n m) floating-point operations must be computed at the nth iteration.

Convergence

The nth iterate minimizes the residual in the Krylov subspace Kn. Since every subspace is contained in the next subspace, the residual decreases monotonically. After m iterations, where m is the size of the matrix A, the Krylov space Km is the whole of Rm and hence the GMRES method arrives at the exact solution. However, the idea is that after a small number of iterations (relative to m), the vector xn is already a good approximation to the exact solution.

This does not happen in general. Indeed, a theorem of Greenbaum, Pták and Strakoš states that for every monotonically decreasing sequence a1, …, am−1, am = 0, one can find a matrix A such that the ||rn|| = an for all n, where rn is the residual defined above. In particular, it is possible to find a matrix for which the residual stays constant for m − 1 iterations, and only drops to zero at the last iteration.

In practice, though, GMRES often performs well. This can be proven in specific situations. If A is positive definite, then

 \|r_n\| \leq \left( 1-\frac{\lambda_{\mathrm{min}}(A^T + A)}{2 \lambda_{\mathrm{max}}(A^T + A)} \right)^{n/2} \|r_0\|,

where λmin(M) and λmax(M) denote the smallest and largest eigenvalue of the matrix M, respectively.

If A is symmetric and positive definite, then we even have

 \|r_n\| \leq \left( \frac{\kappa_2^2(A)-1}{\kappa_2^2(A)} \right)^{n/2} \|r_0\|.

where κ2(A) denotes the condition number of A in the Euclidean norm.

In the general case, where A is not positive definite, we have

 \|r_n\| \le \inf_{p \in P_n} \|p_n(A)\| \le \kappa_2(V) \inf_{p \in P_n} \max_{\lambda \in \sigma(A)} |p(\lambda)|, \,

where Pn denotes the set of polynomials of degree at most n with p(0) = 1, V is the matrix appearing in the spectral decomposition of A, and σ(A) is the spectrum of A. Roughly speaking, this says that fast convergence occurs when the eigenvalues of A are clustered away from the origin and A is not too far from normality.[2]

All these inequalities bound only the residuals instead of the actual error, that is, the distance between the current iterate xn and the exact solution.

Extensions of the method

Like other iterative methods, GMRES is usually combined with a preconditioning method in order to speed up convergence.

The cost of the iterations grow like O(n2), where n is the iteration number. Therefore, the method is sometimes restarted after a number, say k, of iterations, with xk as initial guess. The resulting method is called GMRES(k).

Comparison with other solvers

The Arnoldi iteration reduces to the Lanczos iteration for symmetric matrices. The corresponding Krylov subspace method is the minimal residual method (MinRes) of Paige and Saunders. Unlike the unsymmetric case, the MinRes method is given by a three-term recurrence relation. It can be shown that there is no Krylov subspace method for general matrices, which is given by a short recurrence relation and yet minimizes the norms of the residuals, as GMRES does.

Another class of methods builds on the unsymmetric Lanczos iteration, in particular the BiCG method. These use a three-term recurrence relation, but they do not attain the minimum residual, and hence the residual does not decrease monotonically for these methods. Convergence is not even guaranteed.

The third class is formed by methods like CGS and BiCGSTAB. These also work with a three-term recurrence relation (hence, without optimality) and they can even terminate prematurely without achieving convergence. The idea behind these methods is to choose the generating polynomials of the iteration sequence suitably.

None of these three classes is the best for all matrices; there are always examples in which one class outperforms the other. Therefore, multiple solvers are tried in practice to see which one is the best for a given problem.

Solving the least squares problem

One part of the GMRES method is to find the vector yn which minimizes

 \| \tilde{H}_n y_n - e_1 \|. \,

Note that \tilde{H}_n is an (n+1)-by-n matrix, hence it gives an over-constrained linear system of n+1 equations for n unknowns.

The minimum can be computed using a QR decomposition: find an (n+1)-by-(n+1) orthogonal matrix Ωn and an (n+1)-by-n upper triangular matrix \tilde{R}_n such that

 \Omega_n \tilde{H}_n = \tilde{R}_n.

The triangular matrix has one more row than it has columns, so its bottom row consists of zero. Hence, it can be decomposed as

 \tilde{R}_n = \begin{bmatrix} R_n \\ 0 \end{bmatrix},

where Rn is an n-by-n (thus square) triangular matrix.

The QR decomposition can be updated cheaply from one iteration to the next, because the Hessenberg matrices differ only by a row of zeros and a column:

\tilde{H}_{n+1} = \begin{bmatrix} \tilde{H}_n & h_n \\ 0 & h_{n+1,n} \end{bmatrix},

where hn = (h1n, … hnn)T. This implies that premultiplying the Hessenberg matrix with Ωn, augmented with zeroes and a row with multiplicative identity, yields almost a triangular matrix:

 \begin{bmatrix} \Omega_n & 0 \\ 0 & 1 \end{bmatrix} \tilde{H}_{n+1} = \begin{bmatrix} R_n & r_k \\ 0 & \rho \\ 0 & \sigma \end{bmatrix}

This would be triangular if σ is zero. To remedy this, one needs the Givens rotation

 G_n = \begin{bmatrix} I_{n-1} & 0 & 0 \\ 0 & c_n & s_n \\ 0 & -s_n & c_n \end{bmatrix}

where

 c_n = \frac{\rho}{\sqrt{\rho^2+\sigma^2}} \quad\mbox{and}\quad s_n = \frac{\sigma}{\sqrt{\rho^2+\sigma^2}}.

With this Givens rotation, we form

 \Omega_{n+1} = G_n \begin{bmatrix} \Omega_n & 0 \\ 0 & 1 \end{bmatrix}.

Indeed,

 \Omega_{n+1} \tilde{H}_{n+1} = \begin{bmatrix} R_n & r_n \\ 0 & r_{nn} \\ 0 & 0 \end{bmatrix} \quad\text{with}\quad r_{nn} = \sqrt{\rho^2+\sigma^2}

is a triangular matrix.

Given the QR decomposition, the minimization problem is easily solved by noting that

 \| \tilde{H}_n y_n - e_1 \| = \| \Omega_n (\tilde{H}_n y_n - e_1) \| = \| \tilde{R}_n y_n - \Omega_n e_1 \|.

Denoting the vector Ωne1 by

 \tilde{g}_n = \begin{bmatrix} g_n \\ \gamma_n \end{bmatrix}

with gnRn and γnR, this is

 \| \tilde{H}_n y_n - e_1 \| = \| \tilde{R}_n y_n - \Omega_n e_1 \| = \left\| \begin{bmatrix} R_n \\ 0 \end{bmatrix} y - \begin{bmatrix} g_n \\ \gamma_n \end{bmatrix} \right\|.

The vector y that minimizes this expression is given by

 y_n = R_n^{-1} g_n.

Again, the vectors gn are easy to update.[3]

See also

Notes

  1. ^ Saad and Schultz
  2. ^ Trefethen & Bau, Thm 35.2
  3. ^ Stoer and Bulirsch, §8.7.2

References


Wikimedia Foundation. 2010.

Игры ⚽ Поможем сделать НИР

Look at other dictionaries:

  • Residual (mathematics) — The word residual is used in a number of different senses in mathematics. See:* Errors and residuals in statistics * Residual (numerical analysis) ** Minimal residual method (MINRES) ** Generalized minimal residual method (GMRES) * Residual set,… …   Wikipedia

  • Residual (numerical analysis) — Loosely speaking, a residual is the error in a result. To be precise, suppose we want to find x such that : f(x)=b.,Given an approximation of x 0 of x , the residual is : b f(x 0),whereas the error is: x 0 x.,If we do not know x , we cannot… …   Wikipedia

  • Iterative method — In computational mathematics, an iterative method is a mathematical procedure that generates a sequence of improving approximate solutions for a class of problems. A specific implementation of an iterative method, including the termination… …   Wikipedia

  • List of numerical analysis topics — This is a list of numerical analysis topics, by Wikipedia page. Contents 1 General 2 Error 3 Elementary and special functions 4 Numerical linear algebra …   Wikipedia

  • List of mathematics articles (G) — NOTOC G G₂ G delta space G networks Gδ set G structure G test G127 G2 manifold G2 structure Gabor atom Gabor filter Gabor transform Gabor Wigner transform Gabow s algorithm Gabriel graph Gabriel s Horn Gain graph Gain group Galerkin method… …   Wikipedia

  • GMRES — Das GMRES Verfahren (für Generalized minimal residual method) ist ein iteratives numerisches Verfahren zur Lösung großer, dünnbesetzter linearer Gleichungssysteme. Das Verfahren ist aus der Klasse der Krylow Unterraum Verfahren und insbesondere… …   Deutsch Wikipedia

  • GMRES-Verfahren — Das GMRES Verfahren (für Generalized minimal residual method) ist ein iteratives numerisches Verfahren zur Lösung großer, dünnbesetzter linearer Gleichungssysteme. Das Verfahren ist aus der Klasse der Krylow Unterraum Verfahren und insbesondere… …   Deutsch Wikipedia

  • GMRES — En mathématique, la généralisation de la Méthode de Minimisation du Résidu (ou GMRES) est une méthode itérative pour déterminer une solution numérique d un système d équations linéaires. La méthode donne un approximation de la solution par un… …   Wikipédia en Français

  • Arnoldi iteration — In numerical linear algebra, the Arnoldi iteration is an eigenvalue algorithm and an important example of iterative methods. Arnoldi finds the eigenvalues of general (possibly non Hermitian) matrices; an analogous method for Hermitian matrices is …   Wikipedia

  • Chebyshev iteration — In numerical linear algebra, the Chebyshev iteration is an iterative method for determining the solutions of a system of linear equations. The method is named after Russian mathematician Pafnuty Chebyshev. Chebyshev iteration avoids the… …   Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”