GroupLens Research has collected and made available rating datasets from the MovieLens website (http://movielens.org). The datasets were collected over various periods of time, depending on the size of the set. Before using these datasets, please review their README files for the usage licenses and other details.
recommended for new research
MovieLens 20M Dataset
Stable benchmark dataset. 20 million ratings and 465,000 tag applications applied to 27,000 movies by 138,000 users. Includes tag genome data with 12 million relevance scores across 1,100 tags. Released 4/2015; updated 10/2016 to refresh links.csv and add tag genome data.
Also see the MovieLens 20M YouTube Trailers Dataset for links between MovieLens movies and movie trailers hosted on YouTube.
recommended for education and development
MovieLens Latest Datasets
These datasets will change over time, and are not appropriate for reporting research results. We will keep the download links stable for automated downloads. We will not archive or make available previously released versions.
Small: 100,000 ratings and 1,300 tag applications applied to 9,000 movies by 700 users. Last updated 10/2016.
Full: 26,000,000 ratings and 750,000 tag applications applied to 45,000 movies by 270,000 users. Includes tag genome data with 12 million relevance scores across 1,100 tags. Last updated 8/2017.
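As a rough illustration of working with the ratings in these datasets, the sketch below computes the mean rating per movie. It assumes a ratings.csv file with the header `userId,movieId,rating,timestamp`; that layout is an assumption here, so check the README of the dataset you download. The sample rows are made up for the example and are not real MovieLens data.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample in the assumed ratings.csv layout; not real MovieLens data.
SAMPLE = """userId,movieId,rating,timestamp
1,10,4.0,964982703
1,20,3.5,964981247
2,10,5.0,964982224
"""


def mean_rating_per_movie(csv_file):
    """Return {movieId: mean rating} from a file-like object with a header row."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(csv_file):
        movie_id = int(row["movieId"])
        totals[movie_id] += float(row["rating"])
        counts[movie_id] += 1
    return {movie_id: totals[movie_id] / counts[movie_id] for movie_id in totals}


# Parse the in-memory sample; a real file would be opened with open(path, newline="").
means = mean_rating_per_movie(io.StringIO(SAMPLE))
print(means)
```

To run this against a downloaded copy, pass an open file handle instead of the in-memory sample. The exact path depends on where the archive is extracted.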
MovieLens 100K Dataset
Stable benchmark dataset. 100,000 ratings from 1,000 users on 1,700 movies. Released 4/1998.
MovieLens 1M Dataset
Stable benchmark dataset. 1 million ratings from 6,000 users on 4,000 movies. Released 2/2003.
MovieLens 10M Dataset
Stable benchmark dataset. 10 million ratings and 100,000 tag applications applied to 10,000 movies by 72,000 users. Released 1/2009.
MovieLens Tag Genome Dataset
11 million computed tag-movie relevance scores from a pool of 1,100 tags applied to 10,000 movies. Released 3/2014.
Also consider using the MovieLens 20M or Latest datasets, which contain more recent tag genome data.
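When choosing among the stable benchmark datasets, it can help to compare how sparse each one is. The sketch below uses the rounded counts quoted in the descriptions above to estimate the density of the user-by-movie rating matrix and the average number of ratings per user. Because the inputs are rounded, the results are only approximate; the dictionary and function names are illustrative.

```python
# Approximate counts copied from the dataset descriptions: (ratings, users, movies).
DATASETS = {
    "100K": (100_000, 1_000, 1_700),
    "1M": (1_000_000, 6_000, 4_000),
    "10M": (10_000_000, 72_000, 10_000),
    "20M": (20_000_000, 138_000, 27_000),
}


def density(ratings, users, movies):
    """Fraction of the users x movies matrix that holds a rating."""
    return ratings / (users * movies)


def ratings_per_user(ratings, users):
    """Average number of ratings contributed by each user."""
    return ratings / users


for name, (ratings, users, movies) in DATASETS.items():
    print(
        f"{name}: density ~{density(ratings, users, movies):.2%}, "
        f"~{ratings_per_user(ratings, users):.0f} ratings per user"
    )
```

By these rounded figures, density falls as the datasets grow, so the larger sets are considerably sparser than the 100K set.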