- Schlemiel the painter's Algorithm
In software development, a Schlemiel the Painter ['s] algorithm denotes any methodology that is inefficient because the programmer has overlooked some fundamental issues at the very lowest levels of
software design. The term was coined by software engineer and essayist Joel Spolsky.
Spolsky used a
Yiddishjoke to illustrate a certain poor programming practice. In the joke, Schlemiel (also rendered Shlemiel) has a job painting the dotted lines down the middle of a road. Each day, Schlemiel paints less than he painted the day before. When asked how that could possibly be, Schlemiel complains that it is because each day he gets further away from the paint can.citation|last=Spolsky|first=Joel|title=Back to Basics|date=December 11, 2001|series=Joel on Software|url=http://www.joelonsoftware.com/articles/fog0000000319.html|publisher=joelonsoftware.com.]
The inefficiency Spolsky was drawing an analogy to was the poor programming practice of repeated
concatenationof C-style null terminated character arrays (in general computing parlance known as "strings")— in which the length of the destination string has to be recomputed each time because it is not carried over from a previous concatenation.
Spolsky condemned such inefficiencies as typical for programmers who had not been taught basic programming techniques before they began programming using higher level languages: "Generations of graduates are descending on us and creating "Shlemiel The Painter algorithms" right and left and they don't even realize it, since they fundamentally have no idea that strings are, at a very deep level, difficult."
Coined in 2001, the term has since becoming part of the vernacular to denote inefficient programming techniques."cf." [citation|title=Programming interview questions|last=Cox|first=William|date=November 19, 2005|url=http://discuss.techinterview.org/default.asp?interview.11.246942.7|publisher=techinterview.org.] [citation|last=Atwood|first=Jeff|date=September 19, 2007|title=Everything Is Fast For Small n|url=http://www.codinghorror.com/blog/archives/000957.html|publisher=codinghorror.com.] Spolsky's essays have been cited as examples of good writing "about their insular world in a way that wins the respect of their colleagues and the attention of outsiders." [citation|last=Rosenberg|first=Scott|title=The Shlemiel way of software|date=December 9, 2004|url=http://dir.salon.com/story/tech/feature/2004/12/09/spolsky/|publisher=salon.com.]
The programming practice that Spolsky used to make his point was repeated
concatenationof C-style character arrays ("strings").
The first step in every implementation of the standard C library function for concatenating strings is a determination of the length of the string being appended to. In subsequent steps, another string is then copied to the end of the first string, so effectively concatenating the two. At the end of the concatenation the combined length will be known, but will be discarded upon return to the calling code.
In Spolsky's example the "Schlemiels" occur when multiple strings are being concatenated together:
/* Here, the string "John" is appended to the buffer */
strcat( buffer, "John" );
strcat( buffer, "Paul" );/* Now the string "Paul" is appended to "that" */
strcat( buffer, "George" );/* ... and the string "George" is appended to "that" */
strcat( buffer, "Ringo" );/* ... and the string "Ringo" is appended to "that" */By the end of the first invocation of
strcat()the length of
bufferwill be known but not preserved upon return to the point just after step 1 and just before step 2. Subsequent calls to
strcat()have to compute that length again before they concatenating another name to the
As such, and analogous to Schmiel's not carrying the paint-bucket with him, all the subsequent
strcat()s have to again "walk" the length of the string to determine where the second string should be copied. As more data is added to it, the string in
bufferalso gets longer with each call to
strcat(), and with increasing length, the determination of that length takes longer, which means that subsequent calls are increasingly slower. Just as "Schlemiel's" path to his bucket keeps getting longer.
The problems illustrated by Spolsky's example remain hidden to a programmer with little or no knowledge of the underlying principles and functions, which every higher-level language and library will still be using even when the low-level manipulation is not immediately obvious at the higher level. "Some of the biggest mistakes people make even at the highest architectural levels come from having a weak or broken understanding of a few simple things at the very lowest levels."
Wikimedia Foundation. 2010.