Evaluating an External Recommender written in Python using LensKit

In this article, we show how to use LensKit to evaluate a recommender written in Python. It is aimed at people who want to use LensKit’s built-in evaluation capabilities and comparison algorithms but don’t want to implement their own algorithms in Java. Evaluating an external recommender, whether in R, Python, or MATLAB, involves three primary steps:

Writing the recommender. We need a simple recommender written in a language other than Java (Python in this case) that takes training data, builds a simple model, and generates recommendations for a given list of test users.
Setting up a shim class. We need to write a small class that teaches LensKit how to use our external algorithm.
Setting up the LensKit evaluation. Finally, we show how to set up an experiment that uses the shim class in a LensKit eval script to evaluate the external recommender.

Note that the data we use to test this recommender is a MovieLens rating dataset. The data consists of movie ratings, with each row of the form <userId,itemId,rating>. You can read more about the dataset here.
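As a rough illustration of the row format, the sketch below parses rating rows into tuples. It assumes the tab-delimited layout of the ml-100k u.data file used later in the evaluation script (user, item, rating, plus a trailing timestamp that is ignored here). The function name parse_ratings is hypothetical and is not part of LensKit.

```python
def parse_ratings(lines, delimiter="\t"):
    """Yield (user_id, item_id, rating) tuples from delimited rating rows."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split(delimiter)
        # Only the first three fields are used; any extra columns
        # (such as a timestamp) are ignored.
        user, item, rating = fields[:3]
        yield int(user), int(item), float(rating)


# Sample rows in the assumed layout: user, item, rating, timestamp.
sample = ["196\t242\t3\t881250949", "186\t302\t3\t891717742"]
ratings = list(parse_ratings(sample))
print(ratings)
```

Passing delimiter="," would handle comma-separated rows of the form <userId,itemId,rating> in the same way.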

Step 1: Install LensKit

You will need a copy of LensKit. This tutorial requires features that are included in LensKit 2.1; you can download the 2.1-M4 milestone release from the LensKit downloads page. Download the binary distribution and unpack it somewhere on your hard drive.

Step 2: Create a groovy file (`eval.groovy`)

Before we get into the details of external algorithms, let’s start with a Groovy script that sets up a basic recommender evaluation experiment. The simple Groovy file shown below a) sets up a five-fold evaluation using the MovieLens 100k ratings dataset; b) specifies the recommendation algorithm(s), in this case a basic item-user mean algorithm named PersMean; and c) specifies the metrics used to evaluate the recommendations, in this case topNnDCG and RMSEPredictMetric.

	import org.grouplens.lenskit.knn.item.*
	import org.grouplens.lenskit.transform.normalize.*
	import org.grouplens.lenskit.eval.metrics.topn.*

	trainTest {
	    dataset crossfold("ml-100k") {
	        // Relative (or absolute) path to the dataset. As written, this assumes
	        // a folder named "ml-100k" containing the "u.data" file.
	        source csvfile("ml-100k/u.data") {
	            delimiter "\t"
	            domain {
	                minimum 1.0
	                maximum 5.0
	                precision 1.0
	            }
	        }
	    }

	    algorithm("PersMean") {
	        bind ItemScorer to UserMeanItemScorer
	        bind (UserMeanBaseline, ItemScorer) to ItemMeanRatingItemScorer
	    }

	    metric RMSEPredictMetric

	    metric topNnDCG {
	        listSize 10
	        candidates ItemSelectors.allItems()
	        exclude ItemSelectors.trainingItems()
	    }

	    output "eval-results.csv"
	}


Note: For more details on the LensKit evaluator and the configurations you can put in eval.groovy, see the evaluator manual page.
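To make the two metrics concrete, here is a minimal sketch of how RMSE and a top-N nDCG can be computed. It uses textbook formulations (a log2(rank + 1) discount, with held-out ratings as gains); LensKit's own implementations may differ in details such as discounting and averaging, and the function names here are hypothetical.

```python
import math


def rmse(pairs):
    """Root mean squared error over (predicted, actual) rating pairs."""
    squared = [(pred - actual) ** 2 for pred, actual in pairs]
    return math.sqrt(sum(squared) / len(squared))


def dcg(gains):
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(g / math.log2(rank + 1)
               for rank, g in enumerate(gains, start=1))


def ndcg_at_n(recommended, true_ratings, n=10):
    """nDCG of a ranked item list, using held-out ratings as gains."""
    # Items without a held-out rating contribute zero gain.
    gains = [true_ratings.get(item, 0.0) for item in recommended[:n]]
    # The ideal list ranks the user's held-out items by rating.
    ideal = sorted(true_ratings.values(), reverse=True)[:n]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


held_out = {1: 5.0, 2: 3.0}  # item -> rating in the test set (toy data)
print(rmse([(3.0, 4.0), (5.0, 4.0)]))
print(ndcg_at_n([1, 2], held_out))
print(ndcg_at_n([2, 1], held_out))
```

A list that ranks the higher-rated item first scores 1.0, while the reversed list scores lower, which is the behavior a ranking metric is meant to capture.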

To run the script, save the file and, in a terminal (or command prompt), navigate to the directory that contains it. From that directory, run the command lenskit eval and wait for the LensKit log output; “Build Successful” logged at the end of the run indicates that the evaluation succeeded. The results of the experiment are written to eval-results.csv. Running this basic script first helps you understand the basics and makes it easy to see the changes required for an external algorithm.
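Once the results file exists, you will probably want to average each metric across the folds. The sketch below shows one way to do that with the standard library. The column names ("Algorithm", "Partition", "RMSE") and the numbers in the demo are assumptions made for illustration, so check the header row of your own eval-results.csv and adjust accordingly.

```python
import csv
import io
from collections import defaultdict


def mean_by_algorithm(csv_file, metric, key="Algorithm"):
    """Average one metric column per algorithm across all rows (folds)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(csv_file):
        totals[row[key]] += float(row[metric])
        counts[row[key]] += 1
    return {name: totals[name] / counts[name] for name in totals}


# Made-up numbers in a hypothetical layout, standing in for eval-results.csv.
demo = io.StringIO(
    "Algorithm,Partition,RMSE\n"
    "PersMean,0,1.0\n"
    "PersMean,1,0.5\n"
)
print(mean_by_algorithm(demo, "RMSE"))
```

For a real run, open the file with open("eval-results.csv", newline="") and pass the file object in place of demo.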

Step 3: Create a simple Recommender (`item_mean.py`):

Let’s write a simple Python script that generates item recommendations for a list of users. The code below shows a simple recommender that computes each item’s mean rating as an offset from the global mean rating, then predicts the global mean plus that offset. Note that every user therefore gets the same list of items with the same predicted ratings, similar to the item-mean recommender in LensKit.

	from __future__ import print_function

	import sys


	class ItemMeanData(object):
	    """Accumulates rating sums and counts to compute item mean ratings."""

	    def __init__(self):
	        self.global_sum = 0
	        self.global_count = 0
	        self.item_sums = {}
	        self.item_counts = {}

	    def train(self, trainfile):
	        # Each line is expected to start with user,item,rating
	        with open(trainfile) as f:
	            for line in f:
	                user, item, rating = line.strip().split(',')[:3]
	                item = int(item)
	                rating = float(rating)
	                self.global_sum += rating
	                self.global_count += 1
	                if item not in self.item_sums:
	                    self.item_sums[item] = rating
	                    self.item_counts[item] = 1
	                else:
	                    self.item_sums[item] += rating
	                    self.item_counts[item] += 1

	    def global_mean(self):
	        return self.global_sum / self.global_count

	    def item_set(self):
	        return set(self.item_counts)

	    def item_mean_offsets(self):
	        # Offset of each item's mean rating from the global mean
	        means = {}
	        gmean = self.global_mean()
	        for item, n in self.item_counts.items():
	            means[item] = self.item_sums[item] / n - gmean
	        return gmean, means

	    def score_items(self, to_score, output):
	        # Write one "user,item,score" line per (user, item) pair
	        global_mean, item_means = self.item_mean_offsets()
	        for user, items in to_score.items():
	            for item in items:
	                pred = global_mean
	                if item in item_means:
	                    pred += item_means[item]
	                print("%s,%s,%.3f" % (user, item, pred), file=output)


	def load_query_users(userfile, items):
	    to_score = {}
	    with open(userfile) as userf:
	        for line in userf:
	            user = int(line.strip())
	            to_score[user] = items
	    return to_score


	# Read the command line arguments
	if len(sys.argv) >= 4 and sys.argv[1] == '--for-users':
	    trainfile, userfile = sys.argv[2:4]
	else:
	    print("Invalid Arguments.", file=sys.stderr)
	    sys.exit(1)

	# Train the model using the training file
	model = ItemMeanData()
	model.train(trainfile)

	# Score every known item for each query user and write to standard output
	to_score = load_query_users(userfile, model.item_set())
	model.score_items(to_score, sys.stdout)


An important point to note: the script expects the --for-users flag followed by these two arguments:

Training file (rows of <userId,itemId,rating>): the training set from which a simple recommendation model is built.
Users file (rows of <userId>): the list of user IDs for which recommendations are to be generated.
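To see what the script produces from these two inputs, here is a small self-contained sketch of the same item-mean computation on in-memory toy data. The ratings, user IDs, and the function name item_mean_scores are made up for illustration; the output format (one user,item,score line per pair, with the score printed to three decimals) follows the script above.

```python
from collections import defaultdict


def item_mean_scores(ratings, users):
    """Return "user,item,score" lines using per-item mean ratings."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for _user, item, rating in ratings:
        sums[item] += rating
        counts[item] += 1
    global_mean = sum(sums.values()) / sum(counts.values())
    lines = []
    for user in users:
        for item in sorted(counts):
            # Offset of the item's mean from the global mean, added back
            # to the global mean, as in the script above.
            offset = sums[item] / counts[item] - global_mean
            lines.append("%s,%s,%.3f" % (user, item, global_mean + offset))
    return lines


# Toy training data: (user, item, rating)
toy_ratings = [(1, 10, 4.0), (2, 10, 2.0), (1, 20, 5.0)]
for line in item_mean_scores(toy_ratings, users=[1, 2]):
    print(line)
```

Both users receive identical scores (3.000 for item 10 and 5.000 for item 20), which reflects the non-personalized nature of this recommender.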

Step 4: Update Groovy file : Create and configure the shim class

Now that we have an evaluation script and an external recommender, what we need is an agent to bind them together. The core of a typical LensKit recommender is the ItemScorer, computing individual item scores (typically rating predictions) that are then used for prediction and recommendation. We will use our Python script to pre-compute item scores (rating predictions) that will then be consumed by LensKit for the rest of the process.

To enable this, LensKit provides a PrecomputedItemScorer, an item scorer that simply holds an in-memory copy of fixed item scores. The ExternalProcessItemScorerBuilder utility class constructs a precomputed item scorer by running an external program (in this case, the Python script) to compute the scores, reading them from the program’s standard output, and storing them in the precomputed item scorer.

The evaluator needs to know how to run this program, so we need a simple class that implements Provider<ItemScorer> by using the builder to build a precomputed item scorer from our Python code. The class sets up the command line arguments needed by the program and instructs the item scorer builder to collect its output. For convenience, we put this class in eval.groovy; here is the full script with our class and a new algorithm block that hooks it into the evaluator:

	import org.grouplens.lenskit.knn.item.*
	import org.grouplens.lenskit.baseline.*
	import org.grouplens.lenskit.transform.normalize.*
	import org.grouplens.lenskit.eval.metrics.topn.*
	import org.grouplens.lenskit.ItemScorer
	import org.grouplens.lenskit.baseline.ItemMeanRatingItemScorer
	import org.grouplens.lenskit.core.Transient
	import org.grouplens.lenskit.data.dao.EventDAO
	import org.grouplens.lenskit.data.dao.UserDAO
	import org.grouplens.lenskit.eval.data.traintest.QueryData
	import org.grouplens.lenskit.eval.metrics.predict.*
	import org.grouplens.lenskit.external.ExternalProcessItemScorerBuilder

	import javax.inject.Inject
	import javax.inject.Provider

	/**
	 * Shim class to run item_mean.py to build an ItemScorer.
	 */
	class ExternalItemMeanScorerBuilder implements Provider<ItemScorer> {
	    EventDAO eventDAO
	    UserDAO userDAO

	    @Inject
	    public ExternalItemMeanScorerBuilder(@Transient EventDAO events,
	                                         @Transient @QueryData UserDAO users) {
	        eventDAO = events
	        userDAO = users
	    }

	    @Override
	    ItemScorer get() {
	        def wrk = new File("external-scratch")
	        wrk.mkdirs()
	        def builder = new ExternalProcessItemScorerBuilder()
	        // Note: don't use file names because it will interact badly with crossfolding
	        return builder.setWorkingDir(wrk)
	                      .setExecutable("python") // can be "R", "matlab", "ruby" etc.
	                      .addArgument("../item_mean.py") // relative (or absolute) location of sample recommender
	                      .addArgument("--for-users")
	                      .addRatingFileArgument(eventDAO)
	                      .addUserFileArgument(userDAO)
	                      .build()
	    }
	}

	trainTest {
	    dataset crossfold("ml-100k") {
	        // Relative (or absolute) path to the dataset
	        source csvfile("ml-100k/u.data") {
	            delimiter "\t"
	            domain {
	                minimum 1.0
	                maximum 5.0
	                precision 1.0
	            }
	        }
	    }

	    algorithm("PersMean") {
	        bind ItemScorer to UserMeanItemScorer
	        bind (UserMeanBaseline, ItemScorer) to ItemMeanRatingItemScorer
	    }

	    algorithm("ExternalAlgorithm") {
	        bind ItemScorer toProvider ExternalItemMeanScorerBuilder
	    }

	    metric RMSEPredictMetric

	    metric topNnDCG {
	        listSize 10
	        candidates ItemSelectors.allItems()
	        exclude ItemSelectors.trainingItems()
	    }

	    output "eval-results.csv"
	}


Let’s have a look at some important sections of the shim class:

```
builder.setExecutable("python")
```
specifies the executable for the language your code is written in; here it is Python. It can be Ruby, R, MATLAB, or any other language.
```
builder.addRatingFileArgument(eventDAO)
```
passes the training file generated by LensKit for each crossfold partition.
```
builder.addUserFileArgument(userDAO)
```
passes the users file generated by LensKit, listing the users for whom recommendations are evaluated.
```
builder.addArgument("___") // extra arguments, not shown above
```
lets you pass any further arguments that your code may require.

Notice the algorithm("ExternalAlgorithm") block added to the Groovy file. In Step 2, we included only one algorithm to evaluate, PersMean. In this step, we add the external algorithm and bind the ItemScorer to the new shim class described above.
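Conceptually, the shim runs a command, reads user,item,score lines from its standard output, and keeps them in memory for later lookups. The sketch below mimics that flow in Python. It is only an illustration of the idea, not LensKit's implementation; the function names are hypothetical, and the demo command is a stand-in for the real invocation of item_mean.py.

```python
import subprocess
import sys


def precompute_scores(command):
    """Run a command and collect its "user,item,score" output lines."""
    output = subprocess.check_output(command, universal_newlines=True)
    scores = {}
    for line in output.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        user, item, score = line.strip().split(",")
        scores.setdefault(int(user), {})[int(item)] = float(score)
    return scores


def score(scores, user, item):
    """Look up a precomputed score; None means the pair was not scored."""
    return scores.get(user, {}).get(item)


# Stand-in for a command such as:
#   python ../item_mean.py --for-users <ratings file> <users file>
demo_command = [sys.executable, "-c",
                "print('1,10,3.000'); print('1,20,5.000')"]
table = precompute_scores(demo_command)
print(score(table, 1, 10))
```

Because all scores are computed once up front, later lookups are plain dictionary accesses, and any (user, item) pair the external program did not emit has no score.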

Step 5: Finally, run LensKit

To run the Groovy file, follow the same procedure as in Step 2: in a terminal (or command prompt), navigate to the directory containing the Groovy file (eval.groovy) and run lenskit eval again.
