In this article, we show how to use LensKit to evaluate a recommender written in Python. We wrote this article to help people who want to use LensKit’s built-in evaluation capabilities and comparison algorithms, but don’t want to implement their own algorithms in Java. Evaluating an external recommender (whether written in R, Python, or MATLAB) involves three primary steps:

  • Writing the recommender. We will need a simple recommender written in a language other than Java (Python in this case) that can build a simple model from training data and generate recommendations for a given list of test users.
  • Setting up a shim class. We will need to write a small class that teaches LensKit how to use our external algorithm.
  • Setting up the LensKit evaluation. Finally, we show how to set up an experiment using the shim class in a LensKit eval script to evaluate the external recommender.

Note that the data we will use to test this recommender is a MovieLens rating dataset. The data consists of movie ratings, with each row being <userId,itemId,rating>. You can read more about the dataset here.
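For reference, each row of the MovieLens 100K u.data file is tab-separated with four fields (user, item, rating, timestamp). Here is a minimal sketch of parsing one such row; the sample values are purely illustrative:

```python
# Each row of the MovieLens 100K u.data file is tab-separated:
# userId <tab> itemId <tab> rating <tab> timestamp
def parse_rating(line):
    user, item, rating, timestamp = line.rstrip("\n").split("\t")
    return int(user), int(item), float(rating)

# Illustrative sample row
print(parse_rating("196\t242\t3\t881250949"))  # (196, 242, 3.0)
```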

Step 1: Install LensKit

You will need a copy of LensKit. This tutorial requires features that are included in LensKit 2.1; you can download the 2.1-M4 milestone release from the LensKit downloads page. Download the binary distribution and unpack it somewhere on your hard drive.

Step 2: Create a groovy file (eval.groovy)

Before we move into the details of external algorithms, let’s start with a Groovy script that shows a basic recommender evaluation experiment setup. Shown below is a very simple Groovy file that a) sets up a five-fold evaluation using the MovieLens 100K ratings dataset; b) specifies the recommendation algorithm(s), in this case a basic item-user mean algorithm, PersMean; and c) specifies the metrics used to evaluate the recommendations, in this case topNnDCG and RMSEPredictMetric.


import org.grouplens.lenskit.ItemScorer
import org.grouplens.lenskit.baseline.*
import org.grouplens.lenskit.knn.item.*
import org.grouplens.lenskit.transform.normalize.*
import org.grouplens.lenskit.eval.metrics.predict.*
import org.grouplens.lenskit.eval.metrics.topn.*

trainTest {
    dataset crossfold("ml-100k") {
        // relative (or absolute) path to the dataset; this assumes a
        // folder named "ml-100k" containing the "u.data" file
        source csvfile("ml-100k/u.data") {
            delimiter "\t"
            domain {
                minimum 1.0
                maximum 5.0
                precision 1.0
            }
        }
    }
    algorithm("PersMean") {
        bind ItemScorer to UserMeanItemScorer
        bind (UserMeanBaseline, ItemScorer) to ItemMeanRatingItemScorer
    }
    metric RMSEPredictMetric
    metric topNnDCG {
        listSize 10
        candidates ItemSelectors.allItems()
        exclude ItemSelectors.trainingItems()
    }
    output "eval-results.csv"
}


Note: For more details on the LensKit evaluator and the configurations you can put in eval.groovy, see the evaluator manual page.

To execute the script, save the file and navigate to the directory containing it in a terminal (or command prompt). From that directory, run the command lenskit eval and wait for the LensKit logs; “BUILD SUCCESSFUL” at the end of execution signifies a successful evaluation. The results of the experiment are written to eval-results.csv. This step covers the basics and, at the same time, lets us easily demonstrate the changes required for an external algorithm.
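Once the run finishes, eval-results.csv can be inspected with any CSV tool. As a sketch, here is one way to average a metric across the crossfold partitions in Python; note that the column names used below (Algorithm, RMSE.ByUser) and the sample values are assumptions for illustration, and the actual columns depend on your LensKit version and configured metrics:

```python
import csv
import io
from collections import defaultdict

# Hypothetical excerpt of eval-results.csv; actual column names depend
# on your LensKit version and the metrics you configured.
sample = """Algorithm,Partition,RMSE.ByUser,TopN.nDCG
PersMean,1,0.95,0.12
PersMean,2,0.97,0.11
"""

def mean_rmse_by_algorithm(csv_text):
    # Accumulate (sum, count) of RMSE.ByUser per algorithm
    totals = defaultdict(lambda: [0.0, 0])
    for row in csv.DictReader(io.StringIO(csv_text)):
        entry = totals[row["Algorithm"]]
        entry[0] += float(row["RMSE.ByUser"])
        entry[1] += 1
    return {algo: s / n for algo, (s, n) in totals.items()}

# Averages each algorithm's RMSE across the crossfold partitions
print(mean_rmse_by_algorithm(sample))
```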

Step 3: Create a simple recommender (item_mean.py)

Let’s write a simple Python script that can generate item recommendations for a list of users. The code below shows a simple recommender that scores each item by its mean rating, expressed as an offset from the global mean rating. Note that in this case every user gets the same list of items with the same predicted ratings, similar to the item-mean recommender in LensKit.


import sys

class ItemMeanData(object):
    def __init__(self):
        self.global_sum = 0
        self.global_count = 0
        self.item_sums = {}
        self.item_counts = {}

    def train(self, trainfile):
        with open(trainfile) as f:
            for line in f:
                user, item, rating = line.strip().split(',')[:3]
                item = int(item)
                rating = float(rating)
                self.global_sum += rating
                self.global_count += 1
                if item not in self.item_sums:
                    self.item_sums[item] = rating
                    self.item_counts[item] = 1
                else:
                    self.item_sums[item] += rating
                    self.item_counts[item] += 1

    def global_mean(self):
        return self.global_sum / self.global_count

    def item_set(self):
        return set(self.item_counts.iterkeys())

    def item_mean_offsets(self):
        means = {}
        gmean = self.global_mean()
        for item, n in self.item_counts.iteritems():
            means[item] = self.item_sums[item] / n - gmean
        return gmean, means

    def score_items(self, to_score, output):
        global_mean, item_means = self.item_mean_offsets()
        for user, items in to_score.iteritems():
            for item in items:
                pred = global_mean
                if item in item_means:
                    pred += item_means[item]
                print >> output, "%s,%s,%.3f" % (user, item, pred)

def load_query_users(userfile, items):
    to_score = {}
    with open(userfile) as userf:
        for line in userf:
            user = int(line.strip())
            to_score[user] = items
    return to_score

# Read the command-line arguments
if sys.argv[1] == '--for-users':
    trainfile, userfile = sys.argv[2:4]
else:
    print >> sys.stderr, "Invalid arguments."
    sys.exit(1)

# Train the model using the training file
model = ItemMeanData()
model.train(trainfile)
if userfile is not None:
    to_score = load_query_users(userfile, model.item_set())
    model.score_items(to_score, sys.stdout)
else:
    print >> sys.stderr, "no user file specified"
    sys.exit(1)


An important point to observe: the script requires the following two arguments:

  1. training file (<userId,itemId,rating>): the training set from which the simple recommendation model is built.
  2. users file (<userId>): the list of user IDs for which recommendations are to be generated.
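The scoring rule that item_mean.py implements can be checked by hand: each prediction is the global mean plus the item’s mean offset, which collapses to the item’s own mean for items seen in training, and falls back to the global mean otherwise. A small self-contained sketch (with made-up ratings):

```python
# Tiny worked example of the scoring rule in item_mean.py:
# pred(item) = global_mean + (item_mean - global_mean)
# The ratings below are made up for illustration.
ratings = [(1, 10, 4.0), (2, 10, 5.0), (1, 20, 2.0)]  # (user, item, rating)

global_mean = sum(r for _, _, r in ratings) / len(ratings)

item_ratings = {}
for _, item, r in ratings:
    item_ratings.setdefault(item, []).append(r)

def predict(item):
    # Unseen items fall back to the global mean, as in the script
    if item not in item_ratings:
        return global_mean
    item_mean = sum(item_ratings[item]) / len(item_ratings[item])
    return global_mean + (item_mean - global_mean)

print(round(predict(10), 3))  # item 10's mean rating: 4.5
print(round(predict(99), 3))  # unseen item -> global mean: 3.667
```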

Step 4: Update the Groovy file: Create and configure the shim class

Now that we have an evaluation script and an external recommender, we need an agent to bind them together. The core of a typical LensKit recommender is the ItemScorer, which computes individual item scores (typically rating predictions) that are then used for prediction and recommendation. We will use our Python script to pre-compute item scores (rating predictions) that LensKit will then consume for the rest of the process.

To enable this, LensKit provides PrecomputedItemScorer, an item scorer that simply holds an in-memory copy of fixed item scores. The ExternalProcessItemScorerBuilder utility class constructs a precomputed item scorer by running an external program (in this case, the Python script) to compute the scores, reading them from the program’s standard output and storing them in the precomputed item scorer. The evaluator needs to know how to run this program, so we need a simple class that implements Provider<ItemScorer> by using the builder to build a precomputed item scorer from our Python code. The class will set up the command-line arguments needed by the program and instruct the item scorer builder to collect its output. For convenience, we will put this class in eval.groovy; here is the full script with our class and a new algorithm block that hooks it into the evaluator:


import org.grouplens.lenskit.knn.item.*
import org.grouplens.lenskit.baseline.*
import org.grouplens.lenskit.transform.normalize.*
import org.grouplens.lenskit.eval.metrics.topn.*
import org.grouplens.lenskit.ItemScorer
import org.grouplens.lenskit.baseline.ItemMeanRatingItemScorer
import org.grouplens.lenskit.core.Transient
import org.grouplens.lenskit.data.dao.EventDAO
import org.grouplens.lenskit.data.dao.UserDAO
import org.grouplens.lenskit.eval.data.traintest.QueryData
import org.grouplens.lenskit.eval.metrics.predict.*
import org.grouplens.lenskit.external.ExternalProcessItemScorerBuilder

import javax.inject.Inject
import javax.inject.Provider

/**
 * Shim class to run item_mean.py to build an ItemScorer.
 */
class ExternalItemMeanScorerBuilder implements Provider<ItemScorer> {
    EventDAO eventDAO
    UserDAO userDAO

    @Inject
    public ExternalItemMeanScorerBuilder(@Transient EventDAO events,
                                         @Transient @QueryData UserDAO users) {
        eventDAO = events
        userDAO = users
    }

    @Override
    ItemScorer get() {
        def wrk = new File("external-scratch")
        wrk.mkdirs()
        def builder = new ExternalProcessItemScorerBuilder()
        // Note: use the DAO-based arguments rather than file names, because
        // file names interact badly with crossfolding
        return builder.setWorkingDir(wrk)
                      .setExecutable("python") // can be "R", "matlab", "ruby", etc.
                      .addArgument("../item_mean.py") // relative (or absolute) location of the sample recommender
                      .addArgument("--for-users")
                      .addRatingFileArgument(eventDAO)
                      .addUserFileArgument(userDAO)
                      .build()
    }
}

trainTest {
    dataset crossfold("ml-100k") {
        source csvfile("ml-100k/u.data") { // relative (or absolute) path to the dataset
            delimiter "\t"
            domain {
                minimum 1.0
                maximum 5.0
                precision 1.0
            }
        }
    }
    algorithm("PersMean") {
        bind ItemScorer to UserMeanItemScorer
        bind (UserMeanBaseline, ItemScorer) to ItemMeanRatingItemScorer
    }
    algorithm("ExternalAlgorithm") {
        bind ItemScorer toProvider ExternalItemMeanScorerBuilder
    }
    metric RMSEPredictMetric
    metric topNnDCG {
        listSize 10
        candidates ItemSelectors.allItems()
        exclude ItemSelectors.trainingItems()
    }
    output "eval-results.csv"
}

Let’s have a look at some important code sections of the shim class:

  1. builder.setExecutable("python")

    specifies the executable for the language your code is written in; here it is Python, but it could be Ruby, R, MATLAB, or any other language.

  2. builder.addRatingFileArgument(eventDAO)

    specifies the training (ratings) file generated by LensKit from the crossfolds

  3. builder.addUserFileArgument(userDAO)

    specifies the users file generated by LensKit, listing the test users to score

  4. builder.addArgument("___") (not shown above)

    lets you pass any additional arguments your code may require

 

Notice the algorithm("ExternalAlgorithm") block added to the Groovy file. Earlier, in Step 2, we included only one algorithm to evaluate, PersMean. In this step we include the external algorithm and bind ItemScorer to the new shim class described above.
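Conceptually, what the builder does with the script’s standard output can be sketched as follows. This is an illustrative approximation, not LensKit’s actual implementation: each user,item,score line becomes an entry in an in-memory lookup table, which is essentially what the precomputed item scorer holds.

```python
# Illustrative sketch (not LensKit's actual code): turn the external
# program's "user,item,score" output lines into a score lookup table,
# which is roughly what a precomputed item scorer holds in memory.
def parse_scores(output_lines):
    scores = {}  # (user, item) -> score
    for line in output_lines:
        user, item, score = line.strip().split(",")
        scores[(int(user), int(item))] = float(score)
    return scores

# Sample output lines as item_mean.py would print them
table = parse_scores(["196,242,3.667", "196,302,3.500"])
print(table[(196, 242)])  # 3.667
```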

Step 5: Finally, run LensKit

To execute the Groovy file, follow the same procedure as in Step 2: navigate to the directory containing eval.groovy in a terminal (or command prompt) and run lenskit eval again.
