GroupLens

Is your mind open?

By terveen on October 8, 2007

A recent article in the NY Times, "Revisiting the Canon Wars" (http://www.nytimes.com/2007/09/16/books/review/Donadio-t.html), took a 20-year retrospective look at the controversy over Allan Bloom’s book "The Closing of the American Mind". Bloom argued that American universities had been "dumbed down" by abandoning the classical Western canon.

Lots to argue about here, and the article gives a taste of the argument. However, I was most struck by the way the article ended:

Bloom believed education should be transformative… that it should provide a student with “four years of freedom”
— “a space between the intellectual wasteland he has left behind and
the inevitable dreary professional training that awaits him after the
baccalaureate.” Whether students today see college as a time of freedom
or a compulsory phase of credentialing is an open question. From
Bloom’s perspective, “the importance of these years for an American
cannot be overestimated. They are civilization’s only chance to get to
him.”

Wow, that was bracing and brutal! I’m curious what others think about it.

Switching to GMail

By riedl on October 8, 2007

After much debate I finally decided to switch to Gmail from Thunderbird. The three features that decided me on the switch were:

1) Labels. I’ve been using folders for two decades now. Folders are clearly a browse feature. Labels are (can be?) a search feature. I wanted to experiment with search-based email.

2) A unified address book. Since Gmail is web-based, I no longer have the problem of having my address book be different on every client.

3) Fast startup. I keep a big inbox (just over 6000 messages right now), and the IMAP startup overhead was getting me down.

So far the switch is okay. I’m using the *essential* Gmail macros package so I can enter labels from the keyboard. (I use the version by Brent Nef, which has the wonderful feature of being able to create a new label with the "N" key.) The keybindings Google has chosen are weird, and it’s killing me not to be able to rebind them. The two biggest pains, though, are:

1) Having to hit an extra "x" character to select the message header the caret is on before I can operate on that message. If no other message is selected, isn’t it obvious I intend to delete the message the caret is on?!

2) Importing all my old mbox files. I’ve found a couple of programs (Gmail Loader and gExodus) that offer to help with the import. Their solution is a brutal hack: they email the messages one at a time to Gmail. Years ago this was not just a hack, it was a disaster: the messages would update their timestamps when Gmail received them. This problem is improved now, since a diligent user can email them to a secondary Gmail account, and then slurp them into the primary account using Gmail-to-Gmail import. Rumor has it that the import will reset the timestamps.

The problem for me is that the messages will import without labels. Ugh! gExodus has a hack solution: the program will prepend arbitrary text to the subject line. In principle, a user might then write a filter for each incoming mbox file that would assign the appropriate Gmail label to it according to the prepended text. (E.g., [Riedl:Folder:Foo] might be labelled "Foo".) Unfortunately, I have 390 folders (two decades, remember), and as far as I can tell the filters would have to be written separately for each folder (no wildcarding in Gmail filters).
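To make the tagging idea concrete, here is a minimal Python sketch of the bookkeeping involved. It only builds and parses subject tags shaped like the [Riedl:Folder:Foo] example; it does not talk to Gmail or to gExodus at all. The function names and the exact tag layout are my own assumptions for illustration.

```python
import re

# Hypothetical tag layout, modeled on the "[Riedl:Folder:Foo]" example.
# Folder names containing "]" are not handled by this sketch.
TAG_PATTERN = re.compile(r"^\[(?P<owner>[^:\]]+):Folder:(?P<folder>[^\]]+)\]\s*")


def tag_subject(subject, folder, owner="Riedl"):
    """Prepend a folder tag to a subject line before the message is mailed."""
    return f"[{owner}:Folder:{folder}] {subject}"


def split_tagged_subject(subject):
    """Return (label, original_subject); label is None if no tag is present."""
    match = TAG_PATTERN.match(subject)
    if match is None:
        return None, subject
    return match.group("folder"), subject[match.end():]


def filter_rules(folders, owner="Riedl"):
    """One (subject text to match, label to apply) pair per folder.

    Without wildcards, each folder needs its own rule, so 390 folders
    means 390 entries in this list.
    """
    return [(f"[{owner}:Folder:{folder}]", folder) for folder in folders]
```

For example, filter_rules(["Foo", "Bar"]) yields the two pairs a user would have to turn into two separate filters by hand, which is exactly the tedium described above.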

Another possible solution would be to use one of the Gmail APIs. For instance, this Java API is sufficient to write a nearly full-featured desktop email client that uses Gmail as its backend. There are a number of similar Python APIs. Sadly, none of these are supported by Gmail, so using them might be risky. Even more sadly, none of the ones I’ve found support assigning labels through the API. Otherwise, I could use the mail trick in gExodus, and have a separate program use the API to assign the labels.

Any ideas?

John

Wikipedia implementing trust ratings for authors

By terveen on October 1, 2007

Wikipedia is going to implement de Alfaro et al.’s algorithm to assign trust levels to individual chunks of text within articles based on the reputation of the author of the chunk. The interface will use color coding to visualize trust levels.

http://technology.newscientist.com/channel/tech/mg19526226.200-wikipedia-20–now-with-added-trust.html

Wikipedia: Quality over Quantity?

By riedl on September 29, 2007

Interesting article in the LA Times about Wikipedia and the battle over quality versus quantity. The heart of the story is the frenzy over Jimmy Wales creating an article about a little barbecue restaurant he likes. The article was deleted as "not notable". Wild discussion ensued, with many arguing that the article would have been deleted without question if it hadn’t been created by Wales.

The interesting angle to me is the question of why something should be deleted from Wikipedia if it is accurate and interesting to some people. There’s a good information-theoretic argument that if people are mostly browsing to find information, it’s important to avoid having too much: even logarithmic growth eventually becomes too much. However, it seems to me that most of the finding of information on Wikipedia is through search. In this case, most of the growth of the indexes is Google’s problem: the rest of us never notice most of the stuff on Wikipedia.

A weakness in this argument is that as the index space becomes polluted with references to the irrelevant, successful searches will require more keywords to be sufficiently selective. In effect, the change from browse to search may have little information-theoretic difference on usability: in browse I click more, while in search I type more.
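One rough way to see the symmetry is a toy model, sketched below in Python. Everything in it is an assumption made for illustration: browsing is modeled as a balanced category tree with a fixed branching factor, and searching as a process in which each added keyword cuts the candidate set by a fixed factor until the results fit on one page. Real browsing and searching are far messier.

```python
def browse_clicks(n_articles, branching):
    """Clicks needed to reach one article in a balanced category tree."""
    if branching < 2:
        raise ValueError("branching must be at least 2")
    clicks, reachable = 0, 1
    while reachable < n_articles:
        reachable *= branching  # each click fans out to `branching` children
        clicks += 1
    return clicks


def search_keywords(n_articles, selectivity, page_size):
    """Keywords needed until the matching articles fit on one results page.

    Assumes (unrealistically) that every keyword divides the candidate
    set by the same factor, `selectivity`.
    """
    if selectivity < 2:
        raise ValueError("selectivity must be at least 2")
    keywords, candidates = 0, n_articles
    while candidates > page_size:
        candidates = -(-candidates // selectivity)  # ceiling division
        keywords += 1
    return keywords
```

Under these assumptions both costs grow with the logarithm of the number of articles; only the base differs. That is the sense in which clicking more and typing more might be the same penalty in different clothing.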

I wonder if these ideas can be formalized and tested? What would be a good test-bed?

John

Slide.com vs. Flickr

By riedl on September 28, 2007

ReadWriteWeb has an interesting entry that says that photo sharing site Slide.com has moved into second place behind only Flickr — in New Zealand, anyway.

One interesting nugget is that Slide.com gets 59% of its traffic from a collection of small Facebook applications it has created. These applications are being widely adopted on Facebook as a great way to share pictures. Clearly Facebook’s application strategy is paying dividends. Less clear is how this strategy is going to play out for Slide.com, which reports that its average time per visit is much lower than that of other photo sharing sites. On the one hand, this may be a great thing for the visitors: they are getting more value for less of their precious attention. On the other hand, is Slide.com going to get the benefits of all those users, or is Facebook?

Another interesting nugget is that Slide.com is making so much progress in New Zealand — but apparently not everywhere. Will photo sharing be another domain, like social networking, in which geography determines use? There are network effects in photo sharing, since it’s more convenient to be in the same network as the people I like to share photos with, though the network effects should be less powerful than in social networking, since I can still share photos with you even if you’re in a different network.

An interesting research project would be to track the evolution of these geography effects across time. One hypothesis is that people are always going to have strong ties to people they’re physically adjacent to, in which case these geography effects will be enduring. Another hypothesis is that heavy net users will tend to have relationships that are more independent of geography, in which case these geography effects will decay in importance with time.

Any predictions?

John
