The MovieLens datasets are full of data describing how people rate movies. As it turns out, these datasets have been useful to lots of folks, from recommender systems researchers to the readers of popular-press programming books. Though it is difficult to measure the full extent of the datasets’ impact, we see that they were downloaded more than 140,000 times in 2014, and that the keyword “movielens” currently returns over 8,900 results in Google Scholar.
It is tempting to view these collections of ratings as a cohesive whole. However, the truth of the matter is that the datasets are the product of 17 years of member activity on a website that has seen its fair share of changes and experimental features. Given how much attention, research and otherwise, these datasets have received, it seems worth exploring the relationship between the system and the resulting data.