Luhn mod N algorithm

Luhn mod N algorithm

The Luhn mod N algorithm is an extension to the Luhn algorithm (also known as mod 10 algorithm) that allows it to work with sequences of non-numeric characters. This can be useful when a check digit is required to validate an identification string composed of letters, a combination of letters and digits or even any arbitrary set of characters.

Informal explanation

The Luhn mod N algorithm generates a check digit (more precisely, a check character) within the same range of valid characters as the input string. For example, if the algorithm is applied to a string of lower-case letters ("a" to "z"), the check character will also be a lower-case letter. Apart from this distinction, it resembles very closely the original algorithm.

The main idea behind the extension is that the full set of valid input characters is mapped to a list of code-points (i.e., sequential integers beginning with zero). The algorithm processes the input string by converting each character to its associated code-point and then performing the computations in mod N (where N is the number of valid input characters). Finally, the resulting check code-point is mapped back to obtain its corresponding check character.

Mapping characters to code-points

Initially, a mapping between valid input characters and code-points must be created. For example, consider that the valid characters are the lower-case letters from "a" to "f". Therefore, a suitable mapping would be:

Algorithm

Assuming the following functions are defined:

function int CodePointFromCharacter(char character) {...}

function char CharacterFromCodePoint(int codePoint) {...}

function int NumberOfValidInputCharacters() {...}

The function to generate a check character is:

function char GenerateCheckCharacter(string input) {

int factor = 2; int sum = 0; int n = NumberOfValidInputCharacters();

// Starting from the right and working leftwards is easier since // the initial "factor" will always be "2" for (int i = input.Length - 1; i >= 0; i--) { int codePoint = CodePointFromCharacter(input [i] ); int addend = factor * codePoint;

// Alternate the "factor" that each "codePoint" is multiplied by factor = (factor = 2) ? 1 : 2;

// Sum the digits of the "addend" as expressed in base "n" addend = (addend / n) + (addend % n); sum += addend; }

// Calculate the number that must be added to the "sum" // to make it divisible by "n" int remainder = sum % n; int checkCodePoint = n - remainder; checkCodePoint %= n;

return CharacterFromCodePoint(checkCodePoint);}

And the function to validate a string (with the check character as the last character) is:

function bool ValidateCheckCharacter(string input) {

int factor = 1; int sum = 0; int n = NumberOfValidInputCharacters();

// Starting from the right, work leftwards // Now, the initial "factor" will always be "1" // since the last character is the check character for (int i = input.Length - 1; i >= 0; i--) { int codePoint = CodePointFromCharacter(input [i] ); int addend = factor * codePoint;

// Alternate the "factor" that each "codePoint" is multiplied by factor = (factor = 2) ? 1 : 2;

// Sum the digits of the "addend" as expressed in base "n" addend = (addend / n) + (addend % n); sum += addend; }

int remainder = sum % n;

return (remainder = 0);}

Example

Generation

Consider the above set of valid input characters and the example input string "abcdef". To generate the check character, start with the last character in the string and move left doubling every other code-point. The "digits" of the code-points as written in base 6 (since there are 6 valid input characters) should then be summed up:

The total sum of digits is 14 (0 + 2 + 2 + 1 + 4 + 5). The number that must be added to obtain the next multiple of 6 (in this case, 18) is 4. This is the resulting check code-point. The associated check character is e.

Validation

The resulting string "abcdefe" can then be validated by using a similar procedure:

The total sum of digits is 18. Since it is divisible by 6, the check character is valid.

Implementation

The mapping of characters to code-points and back can be implemented in a number of ways. The simplest approach (akin to the original Luhn algorithm) is to use ASCII code arithmetic. For example, given an input set of "0" to "9", the code-point can be calculated by subtracting the ASCII code for '0' from the ASCII code of the desired character. The reverse operation will provide the reverse mapping. Additional ranges of characters can be dealt with by using conditional statements.

Non-sequential sets can be mapped both ways using a hard-coded "switch/case" statement. A more flexible approach is to use something similar to an Associative Array. For this to work, a pair of arrays is required to provide the two-way mapping.

An additional possibility is to use an array of characters where the array indexes are the code-points associated with each character. The mapping from character to code-point can then be performed with a linear or binary search. In this case, the reverse mapping is just a simple array lookup.

Weakness

This extension shares the same weakness as the original algorithm, namely, that it can't detect the transposition of the sequence "" to "" (or vice-versa). This is equivalent to the transposition of "09" to "90" (assuming a set of valid input characters from "0" to "9" in order). On a positive note, the larger the set of valid input characters, the smaller the impact of the weakness.

External links

* [http://modp.com/release/checkdigits/ Implementation in Java, JavaScript] at modp.com


Wikimedia Foundation. 2010.

Игры ⚽ Поможем сделать НИР

Look at other dictionaries:

  • Luhn algorithm — The Luhn algorithm or Luhn formula, also known as the modulus 10 or mod 10 algorithm, is a simple checksum formula used to validate a variety of identification numbers, such as credit card numbers, IMEI numbers, National Provider Identifier… …   Wikipedia

  • Luhn-Formel — Der Luhn Algorithmus oder die Luhn Formel, auch bekannt als „Modulo 10“ oder „mod 10“ Algorithmus, wurde in den 1960er Jahren von Hans Peter Luhn als eine Methode der Überprüfung von Identifikationsnummern entwickelt. Er ist eine simple… …   Deutsch Wikipedia

  • Error detection and correction — In mathematics, computer science, telecommunication, and information theory, error detection and correction has great practical importance in maintaining data (information) integrity across noisy channels and less than reliable storage… …   Wikipedia

  • List of mathematics articles (L) — NOTOC L L (complexity) L BFGS L² cohomology L function L game L notation L system L theory L Analyse des Infiniment Petits pour l Intelligence des Lignes Courbes L Hôpital s rule L(R) La Géométrie Labeled graph Labelled enumeration theorem Lack… …   Wikipedia

  • List of algebraic coding theory topics — This is a list of algebraic coding theory topics. ARQ[disambiguation needed  ] Adler 32 BCH code BCJR algorithm Berger code Berlekamp Massey algo …   Wikipedia

  • Advanced Audio Coding — AAC redirects here. For other uses, see AAC (disambiguation). Advanced Audio Codings iTunes standard AAC file icon Filename extension .m4a, .m4b, .m4p, .m4v, .m4r, .3gp, .mp4, .aac Internet media type audio/aac, audio/aacp, au …   Wikipedia

  • MSI Barcode — for the number 1234567 with Mod 10 check digit MSI (also known as Modified Plessey) is a barcode symbology developed by the MSI Data Corporation, based on the original Plessey Code symbology. It is a continuous symbology that is not self checking …   Wikipedia

  • Check digit — A check digit is a form of redundancy check used for error detection, the decimal equivalent of a binary checksum. It consists of a single digit computed from the other digits in the message. With a check digit, one can detect simple errors in… …   Wikipedia

  • Hash function — A hash function is any well defined procedure or mathematical function for turning some kind of data into a relatively small integer, that may serve as an index into an array. The values returned by a hash function are called hash values, hash… …   Wikipedia

  • Luhnalgorithmus — Der Luhn Algorithmus oder die Luhn Formel, auch bekannt als „Modulo 10“ oder „mod 10“ Algorithmus, wurde in den 1960er Jahren von Hans Peter Luhn als eine Methode der Überprüfung von Identifikationsnummern entwickelt. Er ist eine simple… …   Deutsch Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”