Interesting article in the LA Times about Wikipedia and the battle over quality versus quantity.  The heart of the story is the frenzy over Jimmy Wales creating an article about a little barbecue restaurant he likes.  The article was deleted as "not notable".  Wild discussion ensued, with many arguing that the article would have been deleted without question if it hadn’t been created by Wales.

The interesting angle to me is the question of why something should be deleted from Wikipedia if it is accurate and interesting to some people. There’s a good information-theoretic argument that if people mostly browse to find information, it’s important to avoid having too much: even if the effort of navigating grows only logarithmically with the number of articles, it eventually becomes too much. However, it seems to me that most information on Wikipedia is found through search. In that case, most of the growth of the indexes is Google’s problem: the rest of us never notice most of the stuff on Wikipedia.

A weakness in this argument is that as the index space becomes polluted with references to the irrelevant, successful searches will require more keywords to be sufficiently selective. In effect, the change from browse to search may make little information-theoretic difference to usability: in browse I click more, while in search I type more.
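
Here is a rough sketch of that comparison, under assumptions that are entirely mine and not measured from Wikipedia: browsing is modeled as a balanced category tree where every page offers the same number of links, and searching is modeled as keywords that each independently keep a fixed fraction of the remaining articles. The function names and the default numbers (10 links per page, each keyword keeping 1 article in 10, a results page of 10 hits) are hypothetical.

```python
def browse_clicks(num_articles, links_per_page=10):
    """Clicks needed to reach any one article in an idealized balanced
    category tree (assumption: every page offers `links_per_page` links)."""
    clicks = 0
    reachable = 1
    while reachable < num_articles:
        reachable *= links_per_page
        clicks += 1
    return clicks


def search_keywords(num_articles, keep_one_in=10, target_hits=10):
    """Keywords needed to narrow the corpus down to `target_hits` results
    (assumption: each keyword independently keeps 1 article in `keep_one_in`)."""
    keywords = 0
    capacity = target_hits
    while capacity < num_articles:
        capacity *= keep_one_in
        keywords += 1
    return keywords


if __name__ == "__main__":
    # Compare the two costs as the corpus grows by factors of ten.
    for exponent in range(3, 10):
        n = 10 ** exponent
        print(n, browse_clicks(n), search_keywords(n))
```

In this toy model both costs grow by one step for every tenfold growth in the corpus, which is the sense in which clicking more and typing more look alike. It says nothing about how real keywords or real category trees behave.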

I wonder whether these ideas can be formalized and tested. What would be a good test bed?


