I am having the students in my introduction to programming class work with the Netflix recommendations data for their final project this term, so it was timely that the New York Times recently did an article reporting on the progress that has been made on the Netflix Prize over the past two years. Nobody has made the 10% jump yet, and while teams have managed over a 9% improvement the improvements are getting incrementally smaller.
The fun part of the article, though, is the details about what it is making it hard to get that last 1%. One’s rating for “Napolean Dynamite” is apparently very hard to predict based on one’s ratings of other movies. In general, there are a very small pool of movies that make up a large portion of the remaining error rate – based on the analysis of one of the people working on the competition at least. There is a lot of good math being used here, but the article does a nice job of talking about how insights about the psychology of preferences informs the statistics used. For example, given the fact that a viewer may rate a movie and then if asked to re-rate it a month later change their rating by on average by 0.4 (out of 5) stars, some people set their algorithms to discount older ratings as compared to recent ones. I particularly liked the fact that they are trying to figure out when to stop recommending a television series – something I wish that the Amazon recommendation system could figure out.
The people who are working on the competition are the other interesting part – this really is capturing the basement-hacker spirit. People from all around the world, with a wide range of background are working on this problem. It is cool that a number of teams reported having their junior high or high school aged kids helping them with the problems – whether brainstorming ideas or helping with the math. I’m rooting for one of these amateur enthusiasts to make that final breakthrough on the problem. From the little I have played with the data I can confirm that while getting good results would require a great deal of effort, if you are casually interested a little programming background and an evening of free time is enough to at least get a glimpse of the intricacies of the problem. Even the incredibly unsophisticated approach I have my students working with returns results that are more plausible than randomness.