- Ugly duckling theorem
The Ugly Duckling theorem is an
argument asserting that classification is impossible without some sort of bias. It is named forHans Christian Andersen 's famous story of "The Ugly Duckling ." It gets its name because it shows that, all things being equal, an ugly duckling is just as similar to aswan as two swans are to each other, although it is only atheorem in a very informal sense. It was proposed bySatoshi Watanabe in 1969. [cite book
last = Watanabe
first = Satosi
title = Knowing and Guessing: A Quantitative Study of Inference and Information
publisher = Wiley
year = 1969
location = New York
url=http://www.igm.hokudai.ac.jp/crg/download_files/watanabe.pdf
format=page scan
pages=pp. 376-377]Basic idea
Suppose there are "n" things in the universe, and we want to put them into classes or categories. We have no preconceived ideas or
bias es about what sorts of categories are "natural" or "normal" and what are not. So we have to consider all the possible classes that could be, all the possible ways of making sets out of the "n" objects. There are such ways, the size of the power set of "n" objects. Maybe we can use that to measure the similarity between two objects: we'll see how many sets they have in common. Whoops. We can't. Any two objects have exactly the same number of classes in common with one another, namely (half the total number of classes there are). To see this is so, you can imagine each class is a represented by an n-bit binary number , with a zero for each element not in the class and a one for each element in the class. As you can see, there are such numbers, which makes sense, of course.As all possible choices of zeros and ones are there, any two bit-positions will agree exactly half the time. If you have trouble seeing that, pick two elements and reorder the bits so they are the first two, and imagine the numbers sorted lexicographically. If you think about it, the first numbers will have bit #1 set to zero, and the second will have it set to one. Within each of those blocks, the top will have bit #2 set to zero and the other will have it as one... so they agree on two blocks of or on half of all the cases. No matter which two elements we pick! So if we have no preconceived bias about which categories are better, then by perfectly fair means of measuring, everything is equally similar (or equally dissimilar). The number of predicates simultaneously satisfied by two non-identical elements is constant over all such pairs. And is the same as the number of those satisfied by one. Thus, we need some kind of inductive bias to make such judgments; some way to prefer certain categories over others.
ee also
Notes
References
*
Wikimedia Foundation. 2010.