Tag Genome Dataset Released


Want to know how quirky a particular movie is? Or how to find the most visually appealing movies of all time? Or how to find a movie that is similar to one you’ve seen, but less big-budget and more cerebral?

The tag genome is a data structure that enables you to answer queries such as these. As described in this article, the tag genome encodes how strongly movies exhibit particular properties represented by tags (atmospheric, thought-provoking, realistic, etc.). The tag genome was computed using a machine learning algorithm on user-contributed content including tags, ratings, and textual reviews.
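To make the idea concrete, here is a minimal sketch of how such queries could be answered. It assumes the genome is held in memory as a mapping from each movie to its per-tag relevance scores, with scores between 0 and 1. The movie titles, the numbers, and the helper names (`relevance`, `top_movies`, `similar_but`) are all invented for illustration, and cosine similarity is just one plausible stand-in for "similar", not necessarily the measure used by the authors.

```python
from math import sqrt

# Hypothetical toy genome: movie -> {tag: relevance}, relevance assumed in [0, 1].
# Titles and scores are made up for illustration; they are not from the dataset.
GENOME = {
    "Movie A": {"quirky": 0.9, "big budget": 0.2, "cerebral": 0.6, "visually appealing": 0.5},
    "Movie B": {"quirky": 0.1, "big budget": 0.9, "cerebral": 0.3, "visually appealing": 0.8},
    "Movie C": {"quirky": 0.2, "big budget": 0.4, "cerebral": 0.8, "visually appealing": 0.7},
    "Movie D": {"quirky": 0.3, "big budget": 0.8, "cerebral": 0.2, "visually appealing": 0.9},
}


def relevance(genome, movie, tag):
    """How strongly `movie` exhibits `tag` (0.0 if the tag is missing)."""
    return genome[movie].get(tag, 0.0)


def top_movies(genome, tag, n=3):
    """The n movies with the highest relevance for `tag`, best first."""
    return sorted(genome, key=lambda m: relevance(genome, m, tag), reverse=True)[:n]


def cosine(u, v):
    """Cosine similarity between two sparse tag vectors (dicts)."""
    dot = sum(u.get(t, 0.0) * v.get(t, 0.0) for t in set(u) | set(v))
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def similar_but(genome, seed, less, more):
    """Movies with lower `less` and higher `more` relevance than `seed`,
    ordered by overall similarity to `seed` (most similar first)."""
    candidates = [
        m for m in genome
        if m != seed
        and relevance(genome, m, less) < relevance(genome, seed, less)
        and relevance(genome, m, more) > relevance(genome, seed, more)
    ]
    return sorted(candidates, key=lambda m: cosine(genome[seed], genome[m]), reverse=True)


print(relevance(GENOME, "Movie A", "quirky"))
print(top_movies(GENOME, "visually appealing", n=2))
print(similar_but(GENOME, "Movie B", less="big budget", more="cerebral"))
```

On this toy data, the last call returns Movie C ahead of Movie A: both have lower "big budget" and higher "cerebral" relevance than Movie B, and Movie C is closer to it overall.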

We’re announcing the release of a tag genome dataset, containing the relevance values for 1,128 tags and 9,734 movies. We hope you will explore this dataset and come up with new and creative ways to use it! You can find more details here.