- Index calculus algorithm
In group theory, the index calculus algorithm is an algorithm for computing discrete logarithms. It is the best known algorithm for certain groups, such as $\mathbb{Z}_m^*$ (the multiplicative group modulo "m").
Description
Roughly speaking, the discrete log problem asks us to find an "x" such that $g^x \equiv h \pmod{n}$, where "g", "h", and the modulus "n" are given.
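To make the problem statement concrete, the following minimal sketch finds a discrete logarithm by exhaustive search. It is not the index calculus algorithm, only a baseline that is feasible for tiny moduli; the function name `discrete_log_bruteforce` is chosen here for illustration.

```python
def discrete_log_bruteforce(g, h, n):
    """Return the smallest x >= 0 with g**x % n == h % n, or None if none exists."""
    h %= n
    value = 1 % n
    for x in range(n):              # the powers of g start repeating within n steps
        if value == h:
            return x
        value = (value * g) % n
    return None

# Example: 3 generates the multiplicative group modulo 17.
x = discrete_log_bruteforce(3, 13, 17)
print(x)                            # 4, since 3**4 = 81 = 4*17 + 13
```

The running time grows linearly with the modulus, which is why faster methods such as index calculus are needed for groups of cryptographic size.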
The algorithm (described in detail below) applies to the group $\mathbb{Z}_q^*$ where "q" is prime. It requires a "factor base" as input, usually chosen to be the number −1 and the first "r" primes starting with 2. For efficiency we want this factor base to be small, but solving the discrete log in a large group requires it to be (relatively) large. Practical implementations must strike a compromise between these conflicting objectives.
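As an illustration of what a factor base is used for, the sketch below builds the first "r" primes and tests whether an integer factors completely over −1 and those primes. The helper names `first_primes` and `factor_over_base` are hypothetical, and plain trial division is assumed, which is only adequate for small examples.

```python
def first_primes(r):
    """Return the first r primes by trial division (fine for small r)."""
    primes = []
    candidate = 2
    while len(primes) < r:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

def factor_over_base(value, primes):
    """Return exponents [e_0, e_1, ..., e_r] such that
    value == (-1)**e_0 * primes[0]**e_1 * ... * primes[r-1]**e_r,
    or None if value does not factor completely over the base."""
    if value == 0:
        return None
    exponents = [1 if value < 0 else 0]   # e_0 records the sign
    value = abs(value)
    for p in primes:
        e = 0
        while value % p == 0:
            value //= p
            e += 1
        exponents.append(e)
    return exponents if value == 1 else None

primes = first_primes(4)                  # [2, 3, 5, 7]
print(factor_over_base(-360, primes))     # [1, 3, 2, 1, 0]
print(factor_over_base(22, primes))       # None, because 11 is not in the base
```

A larger base makes it more likely that a given value factors completely, at the cost of more unknown logarithms to solve for.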
It is noteworthy that the lack of a notion of "prime elements" in the group of points on an elliptic curve makes it hard to find an efficient "factor base" for running index calculus in these groups. As a result, the algorithm is not known to solve discrete logarithms efficiently in general elliptic curve groups.
The algorithm is performed in three stages. The first two stages depend only on the generator "g" and prime modulus "q", and find the discrete logarithms of a "factor base" of "r" small primes. The third stage finds the discrete log of the desired number "h" in terms of the discrete logs of the factor base.
The first stage consists of searching for a set of "r" linearly independent "relations" between the factor base and powers of the generator "g". Each relation contributes one equation to a system of linear equations in "r" unknowns, namely the discrete logarithms of the "r" primes in the factor base. This stage is embarrassingly parallel and easy to divide among many computers.
The second stage solves the system of linear equations to compute the discrete logs of the factor base. Although a relatively minor computation compared to the other stages, a system of hundreds of thousands or millions of equations is a significant computation requiring large amounts of memory, and it is "not" embarrassingly parallel, so a supercomputer is typically used.
The third stage searches for a power "s" of the generator "g" which, when multiplied by the argument "h", may be factored in terms of the factor base: $g^s h \bmod q = (-1)^{f_0} 2^{f_1} 3^{f_2} \cdots p_r^{f_r}$.
Finally, in an operation too simple to really be called a fourth stage, the results of the second and third stages can be rearranged by simple algebraic manipulation to work out the desired discrete logarithm $x = f_0 \log_g(-1) + f_1 \log_g 2 + f_2 \log_g 3 + \cdots + f_r \log_g p_r - s$.
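The algebraic manipulation can be spelled out as follows. This derivation assumes that "g" generates the whole group, so that its order is "q" − 1 and exponents may be compared modulo "q" − 1; the reduction modulo "q" − 1 is not stated explicitly above and is added here as that assumption.

```latex
\begin{align*}
g^{s} h &\equiv (-1)^{f_0}\, 2^{f_1}\, 3^{f_2} \cdots p_r^{f_r} \pmod{q} \\
g^{s+x} &\equiv g^{\,f_0 \log_g(-1) + f_1 \log_g 2 + f_2 \log_g 3 + \cdots + f_r \log_g p_r} \pmod{q} \\
s + x &\equiv f_0 \log_g(-1) + f_1 \log_g 2 + f_2 \log_g 3 + \cdots + f_r \log_g p_r \pmod{q-1} \\
x &\equiv f_0 \log_g(-1) + f_1 \log_g 2 + f_2 \log_g 3 + \cdots + f_r \log_g p_r - s \pmod{q-1}
\end{align*}
```

The second line substitutes $h \equiv g^x$ on the left and writes each factor base element as a power of "g" on the right.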
The first and third stages are both embarrassingly parallel, and in fact the third stage does not depend on the results of the first two stages, so it may be done in parallel with them.
The choice of the factor base size "r" is critical, and the details are too intricate to explain here. The larger the factor base, the easier it is to find relations in stage 1, and the easier it is to complete stage 3, but the more relations you need before you can proceed to stage 2, and the more difficult stage 2 is. The relative availability of computers suitable for the different types of computation required for stages 1 and 2 is also important.
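The three stages described in this section can be sketched end to end for a toy modulus. This is an illustrative sketch under several assumptions, not a reference implementation: "q" is prime and "g" generates the whole group; residues are taken in the range (−q/2, q/2] so that −1 can occur as a factor; smoothness is tested by trial division; and instead of checking each relation for linear independence, the sketch simply retries the linear solve as relations accumulate, accepting only pivots that are invertible modulo "q" − 1. All function names are hypothetical.

```python
from math import gcd

def factor_over_base(value, primes):
    """Exponents [e_0, ..., e_r] with value == (-1)**e_0 * prod(p_i**e_i), else None."""
    if value == 0:
        return None
    exponents = [1 if value < 0 else 0]
    value = abs(value)
    for p in primes:
        e = 0
        while value % p == 0:
            value //= p
            e += 1
        exponents.append(e)
    return exponents if value == 1 else None

def centered(value, q):
    """Representative of value mod q in (-q/2, q/2]."""
    value %= q
    return value - q if value > q // 2 else value

def solve_mod(relations, n_unknowns, N):
    """Gauss-Jordan elimination modulo N using only pivots invertible mod N.
    Returns the unique solution, or None if these relations do not suffice yet."""
    rows = [row[:] for row in relations]
    for col in range(n_unknowns):
        pivot = next((i for i in range(col, len(rows))
                      if gcd(rows[i][col], N) == 1), None)
        if pivot is None:
            return None
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], -1, N)          # modular inverse (Python 3.8+)
        rows[col] = [(v * inv) % N for v in rows[col]]
        for i in range(len(rows)):
            if i != col and rows[i][col]:
                f = rows[i][col]
                rows[i] = [(a - f * b) % N for a, b in zip(rows[i], rows[col])]
    return [rows[i][-1] for i in range(n_unknowns)]

def index_calculus(g, h, q, primes):
    """Return x with pow(g, x, q) == h % q, assuming q prime and g a generator."""
    N = q - 1                        # order of g; exponents live modulo q - 1
    n_unknowns = len(primes) + 1     # logs of -1 and of each prime in the base
    # Stages 1 and 2: gather relations until the system has a unique solution.
    relations, logs = [], None
    for k in range(1, N):
        exps = factor_over_base(centered(pow(g, k, q), q), primes)
        if exps is None:
            continue
        relations.append(exps + [k])
        if len(relations) >= n_unknowns:
            logs = solve_mod(relations, n_unknowns, N)
            if logs is not None:
                break
    if logs is None:
        raise ValueError("factor base logarithms could not be determined")
    # Stage 3: find s such that g**s * h is smooth, then combine the logs.
    for s in range(N):
        f = factor_over_base(centered(pow(g, s, q) * h, q), primes)
        if f is not None:
            return (sum(fi * li for fi, li in zip(f, logs)) - s) % N
    raise ValueError("no smooth value of g**s * h found")

q, g, h = 1019, 2, 777               # toy parameters; 2 generates Z_1019^*
x = index_calculus(g, h, q, [2, 3, 5, 7, 11])
print(x, pow(g, x, q) == h)
```

Re-solving the system after every new relation is wasteful and would not scale; it is used here only to keep the sketch short. Realistic implementations use sparse linear algebra and far larger factor bases.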
The algorithm
Input: Discrete logarithm generator "g", modulus "q" and argument "h". Factor base {−1, 2, 3, 5, 7, 11, ..., $p_r$}, of length "r"+1.
Output: "x" such that $g^x \equiv h \pmod{q}$.
* relations ← empty_list
* for "k" = 1, 2, ...
** Using an integer factorization algorithm optimized for smooth numbers, try to factor $g^k \bmod q$ over the factor base, i.e. find $e_i$ such that $g^k \bmod q = (-1)^{e_0} 2^{e_1} 3^{e_2} \cdots p_r^{e_r}$
** Each time a factorization is found:
*** Store "k" and the computed $e_i$ as a vector $(e_0, e_1, e_2, \ldots, e_r, k)$ (this is called a relation)
*** If this relation is linearly independent of the other relations:
**** Add it to the list of relations
**** If there are at least "r"+1 relations, exit loop
* Form a matrix whose rows are the relations
* Obtain the reduced echelon form of the matrix
** The first element in the last column is the discrete log of −1 and the second element is the discrete log of 2 and so on
* for "s" = 0, 1, 2, ...
** Try to factor $g^s h \bmod q = (-1)^{f_0} 2^{f_1} 3^{f_2} \cdots p_r^{f_r}$ over the factor base
** When a factorization is found:
*** Output $x = f_0 \log_g(-1) + f_1 \log_g 2 + \cdots + f_r \log_g p_r - s$.
External links
* [http://www.dtc.umn.edu/~odlyzko/doc/arch/discrete.logs.pdf Discrete logarithms in finite fields and their cryptographic significance], by Andrew Odlyzko
* [http://www.cs.toronto.edu/~cvs/dlog/ Discrete Logarithm Problem], by Chris Studholme, including the June 21, 2002 paper "The Discrete Log Problem".
Wikimedia Foundation. 2010.