- Board representation (chess)
computer chess, software developers must choose a data structureto represent chess positions. Several data structures exist, collectively known as board representations. [ [http://www.cis.uab.edu/hyatt/boardrep.html Hyatt-- full article] ] Chess engines often utilize more than one board representation at different times, for efficiency.
A full description of a chess position, i.e. the position "state", should contain the following elements:
* The location of each piece on the board
* Whose turn it is to move
* Status of the 50-move draw rule. The name of this is sometimes a bit confusing, as it is 50 moves by each player, and therefore 100 half-moves, or ply. For example, if the previous 80 half-moves passed without a capture or a pawn move, the fifty-move rule will kick in after another twenty half-moves.
* Whether either player is permanently disqualified to castle, both kingside and queenside.
* If an
en passantcapture is possible.
Board representation typically does not include the status of the
threefold repetitiondraw rule. To determine this rule, a complete history of the game from the last capture or pawn movement needs to be maintained, and so, is generally tracked in separate data structures.
Some of the very earliest chess programs were working with extremely limited amounts of memory, such that even the 64 memory locations required to represent a chess board was too much to spend. These early programs would instead maintain lists of the locations of the up to 16 black and white pieces. Piece lists are still used by many of today's programs in conjunction with a separate board representation structure to speed up access to the pieces. Instead of looping through 64 (or more) squares to find all of the pieces, piece lists give instant access to the pieces.
One of the simplest ways to represent a board is to create an 8x8 two-dimensional
array(or, equivalently, a 64 element one-dimensional array). Each array element would identify what piece or pawn occupied the given square, or alternatively, if the square is empty. A common encoding is to consider 0 as empty, positive as white, and negative as black, e.g., white pawn +1, black pawn -1, white knight +2, black knight -2, white bishop +3, and so on.
A problem with this approach arises during move generation. Each move has to be checked to ensure it is on the board, significantly slowing down the process.
One solution is to use a 12x12 array instead, with the outer edges filled with, say, the value 99. During move generation, the operation to check for a piece on the destination square will also indicate whether the destination square is off the board.
Better memory usage can be achieved with a 10x12 array, which provides the same functionalities as a 12x12 one by overlapping the leftmost and rightmost edge files (which are marked as off-the board). Some chess engines use 16x16 arrays to improve the speed of the rank and file number conversion and allow some special coding tricks for attacks etc.
Machine code programmers have another alternative. Using 4 bits per square, an entire row can be represented in 32 bits, and the board in 8 registers (with an additional one for remaining position information). By use of a jump table, adding the piece value to the program counter can go directly to the code to generate moves for this type of piece on this square. Although the program is longer than for a conventional move generation methods, no checks for the edge of the board are required, and no moves off the board are considered, increasing move generation speed.
The 0x88 method takes advantage of the fact that a chessboard's 8x8 dimensions are an even power of two. The board uses an array of size 16x8 = 128, numbered 0 to 127. It is basically two boards next to each other, the actual board on the left. The binary layout of the rank and file of a legal board coordinate is [0 r r r 0 f f f] . When generating moves from the main board, one can check that a destination square is on the main board simply by ANDing the square number with
hexadecimal0x88 (binary [1 0 0 0 1 0 0 0] ). A non-zero result indicates that the square is off the main board. In addition, the difference between two squares' coordinates uniquely determines whether those two squares are along the same row, column, or diagonal (a common query used for determining check). [ [http://www.seanet.com/~brucemo/topics/0x88.htm 0x88 method. Bruce Moreland] ]
One commonly used board representation is the
bitboard. A bitboard is a 64-bit sequence of bits (0 or 1), which indicates the absence or presence (false or true) of some state about each place on the board. A board position can then be represented using a series of bitboards. For example, a series of bitboards for each piece type or pawn, for each side, can represent the board position.
The advantage to this representation is the ability to use
bit paralleloperations upon the 64-bitentities instead of iteration to manipulate and derive information about the state of the board. This makes maximal use of the hardware available, especially as once exotic 64-bitprocessors become mainstream.
As in array encoding, four bits suffice to encode the twelve different pieces. This leaves three combinations for encoding one to three empty squares and one combination for saying that the next four bits encode four to 19 empty squares. This allows encoding the starting position in 18 bytes. As pieces get taken, the encoding becomes ever shorter. The maximal penalty comes for four square gaps, which require eight bits. But there can be at most 13 of them, taking 20 bytes for such a board. A hypothetical board alternating a piece and a gap takes the maximum length of 32 bytes. To this must be added two bytes for the 50-move rule and such things.
Huffman coding, chess positions can be represented with bit patterns in such a way that more common board elements (empty squares and pawns) are stored with less bits than less common board elements:
Empty square = 0 Pawn = 10c Bishop = 1100c Knight = 1101c Rook = 1110c Queen = 11110c King = 11111c Where c is a bit representing the color of the piece (1 = LIGHT, 0 = DARK). Additional bits are needed for: 50-move rule (7 bits) en-passant column (4 bits) color to move (1 bit) castling rights (4 bits).
Trailing empty squares can be omitted. If the last piece is a king, it can by definition be omitted without loss of information, saving six bits. The
en passantcolumn is only needed when an opportunity seems to exist. Castlingrights need only be stored for those rooks for whom it seems permissible.
Huffman encoding schemes allow a complete board state (starting position) to be represented in just 23 bytes (actually 22 and 4 bits extra for openings with en passant chances).
Huffman encodings are fairly CPU intensive, in comparison to other board representations which seek to minimize required processor and memory cycles. However, the small size of the final representation makes this approach well suited to storage of long term knowledge, for instance in storing positions in an
opening book, where minimizing the board representation's size is more important than minimizing CPU cycles. Huffman encoded boards are also sometimes used in transposition tables for shallow entries.
An even more compact variant sacrifices the two to four bit representation of as many empty squares for encoding two to nine empty squares in five bits. If another zero follows, the number is extended by two bits, allowing two to 33 empty squares to be encoded in eight bits.
Empty square = 0 n+2 Empty squares = 00nnn n+2 Empty squares = 00nnn0nn
For up to 33-length gaps, like the initial board, this saves up to 25 bits. For all sparse boards where there are pieces or pawns at near nine-square intervals or much longer gaps, there are also nice savings, e.g. 16 bits for four gaps of length nine. This is offset by the extra bits for short gaps. To always have the optimal encoding, one extra bit can mark whether straightforward or compact empty square encoding is used.
In addition the coordinates of the kings can be stored and their fields ignored in the bit sequence instead. This also takes 6 bits each. But the queen can then be coded in five bits. And gaps to the right and left can then be coalesced, making for one longer gap which is often a candidate for better compacting.
Four extra bits can store the number of pieces of the first color encountered. After the last piece of that color, the color bit can be skipped. For the initial boards, this saves 12 bits.
Another extra bit appears when the number of pieces of the first color is eight or less. It can mark the fact that no pawns are left. In this case the first bit of all pieces can be skipped:
Bishop = 100c Knight = 101c Rook = 110c Queen = 1110c King = 1111c
Forsyth-Edwards Notation (FEN)
FEN notation is a board representation that was popularized before the advent of computers and is a
human-readablemethod of recording a chess game position in a single line of text. FEN is still used frequently by chess programs for saving board positions to external storage, whenever a human may view that storage. It is also sometimes used as a communication standard between chess software packages; the Universal Chess Interface, for example, specifies the FEN notation for transmitting board positions to a chess engine.
Here is the FEN for the starting position:
rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1
Here is the FEN after the move 1. e4:
rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 0 1
And then after 1. ... c5:
rnbqkbnr/pp1ppppp/8/2p5/4P3/8/PPPP1PPP/RNBQKBNR w KQkq c6 0 2
Wikimedia Foundation. 2010.