Crowdsourcing the Lizard People

As a die-hard political junkie, I’ve been looking for something to
fill the void left by an abnormally tidy presidential race.  Luckily I
live in Minnesota, and get minute-by-minute updates about the Great
Minnesota Recount.
Projecting the winner for the recount is difficult:
Were challenged ballots type I errors or type II errors?  Exactly how
many challenged ballots were withdrawn by each candidate?  And would Al
Franken or Norm Coleman be a better representative of the political
views of The Lizard People?

The Star Tribune has enabled readers to vote on the outcomes for challenged ballots.  I
assumed that the reader votes were for entertainment purposes, but the
Star Tribune has cleverly analyzed two million reader votes to project the final outcomes for over 6,000
challenged ballots.
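The Strib's actual method isn't described here, so the following is only a guess at the simplest possible version: give each challenged ballot to whichever call got the most reader votes, then tally the projected calls. The function names (`project_ballots`, `net_change`), the call labels, and the sample votes are all made up for illustration.

```python
from collections import Counter, defaultdict


def project_ballots(reader_votes):
    """Map each ballot id to the call that received the most reader votes.

    Ties go to whichever call was seen first for that ballot.
    """
    tallies = defaultdict(Counter)
    for ballot_id, call in reader_votes:
        tallies[ballot_id][call] += 1
    return {
        ballot_id: counts.most_common(1)[0][0]
        for ballot_id, counts in tallies.items()
    }


def net_change(projection):
    """Franken's projected net gain over Coleman among the projected ballots."""
    totals = Counter(projection.values())
    return totals["Franken"] - totals["Coleman"]


# Made-up sample data: (ballot id, reader's call). Not real recount votes.
votes = [
    (1, "Franken"), (1, "Franken"), (1, "Rejected"),
    (2, "Coleman"), (2, "Coleman"),
    (3, "Franken"), (3, "Coleman"), (3, "Franken"),
]
projection = project_ballots(votes)
print(projection)
print(net_change(projection))
```

A plain plurality like this does nothing about the bias worry; correcting for it would take some kind of reweighting of voters, which this sketch leaves out.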

It is easy to imagine that these online votes are biased.  Online
users trend Democratic, and Democrats may award more ballots to
Franken.  However, some anecdotal evidence hints that the Star
Tribune’s projections beat those from political experts.  During the
past two days, the Strib’s projection has hovered around a 75-vote lead
for Franken.  Meanwhile, projections from Nate Silver of FiveThirtyEight, a
highly respected voting analyst, have slowly been converging to the
Strib’s.

Maybe if we had crowdsourced the original counting of the ballots we wouldn’t be in this mess!

Python’s GIL is EVIL

Lately I’ve been doing some Python multi-threading to make the best use of some of our amazing server resources. As I was pondering the reasons why one of our 8-core servers reported 83% idle despite 8 threads banging away, I re-discovered the Global Interpreter Lock.

BLECH!

The GIL enforces Python’s requirement that only a single bytecode operation is executed at a time. My nicely coded multi-threaded app was only being executed serially!! Sadly, this seems unlikely to change, even in Python 3000. Last year Guido said:

“Just Say No to the combined evils of locking, deadlocks, lock granularity, livelocks, nondeterminism and race conditions.”

I was brought up to believe that threading was dirty and independent communicating processes were the way to go. But even I realize that this just isn’t practical in these days of GUIs, multi-core processors, and application servers.

Why does the Python community accept the GIL? Is it because most people only use Python as a scripting language? Are there simple workarounds (e.g. not forking, shared memory, or the like) that I’m missing?
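For what it's worth, here is a minimal sketch of one process-based workaround, using the standard library's `concurrent.futures` module. It runs the same CPU-bound toy task first on a thread pool and then on a process pool. The task (`count_primes`), the helper (`run_all`), and the workload sizes are hypothetical stand-ins, not the app described above. On a standard CPython build with the GIL, I'd expect the threaded run to take about as long as running serially, and the process run to spread across cores, but actual timings will depend on the machine.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def count_primes(limit):
    """CPU-bound toy task: count primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_all(executor_cls, limits, workers=8):
    """Run count_primes over `limits` on the given executor.

    Returns (results, elapsed_seconds).
    """
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pool:
        results = list(pool.map(count_primes, limits))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    limits = [50_000] * 8  # hypothetical workload: one task per core
    for cls in (ThreadPoolExecutor, ProcessPoolExecutor):
        results, seconds = run_all(cls, limits)
        print(f"{cls.__name__}: {seconds:.2f}s")
```

The catch is that processes don't share memory the way threads do: arguments and results are pickled and copied between processes, so this pattern fits best when each task takes a small input and returns a small result.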

How do software patterns evolve?

I was talking yesterday afternoon with several other lab members about Martin Fowler’s "Patterns of Enterprise Application Architecture." In his book, Fowler admits that most patterns aren’t anything new. "Creating" a software pattern is just naming and describing a software practice already used by some developers. Of course, Fowler presents patterns with amazing clarity and skill, so his contributions are valuable to the development community.

Since our research group is interested in social communities, we were curious about how new software practices evolve into well-known design patterns. Do most patterns start with a bang from a few highly influential, outspoken developers? Is there gradually increased adoption until the pattern reaches a critical mass of people? Another possibility is that as the global software environment changes, many groups of people stumble onto the same pattern at the same time.

What’s your opinion?

The Forgetting of Research Papers

John Langford recently blogged about researchers' preference for citing recent research. He calls this tendency "the forgetting" of prior work. John suggests a number of reasons why recent work is more likely to be remembered (including "Dead men don't reject your papers for not citing them").

John also points out the obvious: forgetting is a bad thing. It may lead people to overestimate the value of a paper, and it reduces the efficiency of contribution. He wonders whether our line of research may be able to help: "Wouldn’t it be great if all the content at a conference was organized in a wikipedia-like easy-for-outsiders-to-understand style?"