Smith normal form
The Smith normal form is a normal form that can be defined for any matrix (not necessarily square) with entries in a principal ideal domain (PID). The Smith normal form of a matrix is diagonal, and can be obtained from the original matrix by multiplying on the left and right by invertible square matrices. In particular, the integers are a PID, so one can always calculate the Smith normal form of an integer matrix. The Smith normal form is very useful for working with finitely generated modules over a PID, and in particular for deducing the structure of a quotient of a free module.

Let "A" be a nonzero "m"×"n" matrix over a principal ideal domain "R", and for "a" in "R" \ {0}, write δ("a") for the number of prime factors of "a" (these exist and are unique up to units since any PID is also a unique factorization domain). In particular, "R" is also a Bézout domain, so it is a gcd domain and the gcd of any two elements satisfies Bézout's identity.

Algorithm
Our goal will be to find invertible square matrices "S" and "T" such that the product "S A T" is diagonal. This is the hardest part of the algorithm; once we have achieved diagonality it becomes relatively easy to put the matrix in Smith normal form. (Note that invertibility of a matrix with entries in "R" is the same as saying that its determinant is a unit.) Phrased more abstractly, the goal is to show that, thinking of "A" as a map from R^"n" (the free "R"-module of rank "n") to R^"m" (the free "R"-module of rank "m"), there are isomorphisms "S" of R^"m" and "T" of R^"n" such that S·A·T has the simple form of a diagonal matrix. The matrices "S" and "T" will be found by repeatedly applying elementary transformations that replace a row (column) with a linear combination of itself and another row (column).

Set "t" = 1 and choose j_t to be the smallest column index of "A" with a non-zero entry. One can repeatedly apply the following three cases to put the matrix into Smith normal form.
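The two ingredients used throughout, the δ-function and Bézout coefficients, can be made concrete for the case "R" = Z. A minimal sketch (delta and bezout are illustrative helper names):

```python
def delta(a: int) -> int:
    """delta(a): number of prime factors of a, counted with multiplicity
    (well defined up to units, since Z is a unique factorization domain)."""
    a = abs(a)
    assert a != 0, "delta is only defined for non-zero elements"
    count, p = 0, 2
    while p * p <= a:
        while a % p == 0:
            a //= p
            count += 1
        p += 1
    if a > 1:            # one remaining prime factor
        count += 1
    return count

def bezout(a: int, b: int) -> tuple[int, int, int]:
    """Extended Euclidean algorithm: return (g, s, t) with
    s*a + t*b == g, where g is gcd(a, b) up to sign (units in Z are +-1)."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t
```

For example, delta(12) = 3 since 12 = 2·2·3, and bezout(6, 4) returns coefficients witnessing gcd(6, 4) = 2.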
Case I
If a_{t,j_t} = 0 and a_{k,j_t} ≠ 0 for some "k" > "t", exchange rows "t" and "k".
Case II
If there is an entry at position ("k", j_t) such that a_{t,j_t} does not divide a_{k,j_t}, then, letting β = gcd(a_{t,j_t}, a_{k,j_t}), we know by the Bézout property that there exist σ, τ in "R" such that

    a_{t,j_t}·σ − a_{k,j_t}·τ = β.
By left-multiplication with an appropriate matrix "L" it can be achieved that row "t" of the matrix product is the sum of row "t" multiplied by σ and row "k" multiplied by −τ. (If σ and τ satisfy the above equation, they must be relatively prime; so there exist α and γ such that

    σ·α − τ·γ = 1

(for instance α = a_{t,j_t}/β and γ = a_{k,j_t}/β), or in other words, the determinant of the matrix

    ( σ   −τ )
    ( −γ   α )

equals one. "L" can be obtained by fitting this matrix into the diagonal of the identity matrix at the appropriate positions, depending on the values of "t" and "k". That "L" has determinant one guarantees that "L" is invertible over "R".) After left-multiplying by "L" we get β at position ("t", j_t), where δ(β) < δ(a_{t,j_t}) (β is a proper divisor of a_{t,j_t}, since a_{t,j_t} does not divide a_{k,j_t}), and β divides a_{k,j_t}. Repeating these steps, one ends up with a matrix having an entry at position ("t", j_t) that divides all entries in column j_t.
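For "R" = Z this step can be sketched as follows: the 2×2 block from Case II is fitted into an identity matrix and used as a left multiplier (improve_pivot and ext_gcd are illustrative helper names):

```python
def ext_gcd(a, b):
    """Extended Euclid: return (g, x, y) with x*a + y*b == g == gcd(a, b)."""
    if b == 0:
        return (abs(a), 1 if a >= 0 else -1, 0)
    g, x, y = ext_gcd(b, a % b)
    return (g, y, x - (a // b) * y)

def improve_pivot(A, t, k, j):
    """Case II over the integers: left-multiply A by an invertible L so that
    beta = gcd(A[t][j], A[k][j]) lands at position (t, j) and position
    (k, j) becomes zero."""
    a, b = A[t][j], A[k][j]
    beta, s, u = ext_gcd(a, b)           # s*a + u*b == beta
    sigma, tau = s, -u                   # sigma*a - tau*b == beta
    alpha, gamma = a // beta, b // beta  # so sigma*alpha - tau*gamma == 1
    m = len(A)
    # fit the 2x2 block [[sigma, -tau], [-gamma, alpha]] into the identity
    L = [[int(r == c) for c in range(m)] for r in range(m)]
    L[t][t], L[t][k] = sigma, -tau
    L[k][t], L[k][k] = -gamma, alpha     # det(L) == 1, so L is invertible
    # left-multiply: rows t and k become the stated linear combinations
    return [[sum(L[r][p] * A[p][c] for p in range(m)) for c in range(len(A[0]))]
            for r in range(m)]
```

For instance, applied to [[4, 1], [6, 1]] with t = 0, k = 1, j = 0, the pivot becomes gcd(4, 6) = 2 and the entry below it becomes zero.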
Case III
Finally, adding appropriate multiples of row "t" to the other rows, it can be achieved that all entries in column j_t except for the one at position ("t", j_t) are zero. This can be achieved by left-multiplication with an appropriate matrix. However, to make the matrix fully diagonal we need to eliminate nonzero entries in the row of position ("t", j_t) as well. This can be achieved by repeating the steps in Case II for columns instead of rows, using multiplication on the right. In general this will cause the zero entries from the prior application of Case II to become nonzero again.
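Over the integers, this elimination step can be sketched as follows (eliminate_column is an illustrative helper, assuming the pivot already divides every entry in its column):

```python
def eliminate_column(A, t, j):
    """Case III sketch over the integers: assuming the pivot A[t][j] divides
    every entry in column j, subtract suitable multiples of row t from the
    other rows so that column j is zero outside position (t, j)."""
    for k in range(len(A)):
        if k != t and A[k][j] != 0:
            q = A[k][j] // A[t][j]          # exact division by assumption
            A[k] = [x - q * y for x, y in zip(A[k], A[t])]
    return A
```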
However, notice that the ideals generated by the elements at position ("t", j_t) form an ascending chain, because entries from a later step always divide entries from a previous step. Therefore, since "R" is a Noetherian ring (it is a PID), the ideals eventually become stationary and do not change. This means that at some stage after Case II has been applied, the entry at ("t", j_t) will divide all nonzero row or column entries before applying any more steps in Case II. Then we can eliminate entries in the row or column with nonzero entries while preserving the zeros in the already-zero row or column. At this point, only the block of "A" to the lower right of ("t", j_t) needs to be diagonalized, and the algorithm can be applied recursively, treating this block as a separate matrix.

Results
Applying the steps described above to the remaining non-zero columns of the resulting matrix (if any), we get an "m"×"n" matrix with column indices j_1 < … < j_r, where r ≤ min("m", "n"), each of which satisfies the following:

* the entry at position ("l", j_l) is non-zero;
* all entries below and above position ("l", j_l), as well as entries left of column j_l, are zero.

Furthermore, all rows below the "r"-th row are zero.
This is a version of the Gauss algorithm for principal ideal domains, which is usually described only for commutative fields.

Now we can re-order the columns of this matrix so that the elements at positions ("i", "i") for 1 ≤ "i" ≤ "r" are nonzero and δ(a_{i,i}) ≤ δ(a_{i+1,i+1}) for 1 ≤ "i" < "r", and all columns right of the "r"-th column (if present) are zero. For short, write α_i for the element at position ("i", "i"). δ takes non-negative integer values, so δ(α_1) = 0 is equivalent to α_1 being a unit of "R".

δ(α_i) = δ(α_{i+1}) can happen either if α_i and α_{i+1} differ by a unit factor, or if they are relatively prime. In the latter case one can add column "i"+1 to column "i" (which doesn't change α_i) and then apply appropriate row manipulations to get α_i = 1. And for δ(α_i) < δ(α_{i+1}) with α_i not dividing α_{i+1}, one can likewise add column "i"+1 to column "i" and apply the steps of Case II to replace α_i by gcd(α_i, α_{i+1}), which strictly lowers δ(α_i). Repeating this process until each α_i divides α_{i+1} yields the Smith normal form.
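Putting the cases together, here is an illustrative end-to-end sketch for integer matrices. It does not track the invertible transforms "S" and "T", and it realizes the gcd steps of Case II by Euclidean division, which is available for "R" = Z:

```python
def smith_normal_form(A):
    """Diagonalize an integer matrix and enforce the divisibility chain
    d1 | d2 | ... (a sketch; assumes A is a non-empty list of
    equal-length rows)."""
    A = [row[:] for row in A]
    m, n = len(A), len(A[0])
    for t in range(min(m, n)):
        # Case I: locate a non-zero entry and move it to position (t, t)
        pivot = next(((i, j) for i in range(t, m) for j in range(t, n)
                      if A[i][j] != 0), None)
        if pivot is None:
            break
        i, j = pivot
        A[t], A[i] = A[i], A[t]
        for row in A:
            row[t], row[j] = row[j], row[t]
        while True:
            # Cases II/III on rows: Euclidean steps clear column t
            for i in range(t + 1, m):
                while A[i][t] != 0:
                    q = A[t][t] // A[i][t]
                    A[t] = [a - q * b for a, b in zip(A[t], A[i])]
                    A[t], A[i] = A[i], A[t]
            # Cases II/III on columns: Euclidean steps clear row t
            for j in range(t + 1, n):
                while A[t][j] != 0:
                    q = A[t][t] // A[t][j]
                    for row in A:
                        new_t = row[t] - q * row[j]
                        row[t], row[j] = row[j], new_t
            # column t may have been disturbed by the column operations
            if any(A[i][t] != 0 for i in range(t + 1, m)):
                continue
            # divisibility: the pivot must divide the remaining block
            bad = next(((i, j) for i in range(t + 1, m)
                        for j in range(t + 1, n)
                        if A[i][j] % A[t][t] != 0), None)
            if bad is None:
                break
            # add an offending row to row t and repeat; the pivot shrinks
            A[t] = [a + b for a, b in zip(A[t], A[bad[0]])]
        if A[t][t] < 0:          # normalize by a unit (units in Z are +-1)
            A[t] = [-a for a in A[t]]
    return A
```

The outer loop terminates for the reason given above: each pass either leaves the pivot alone or replaces it by a proper divisor, and the corresponding chain of ideals must become stationary.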
Wikimedia Foundation. 2010.