Metrotwin Recommends

We’ve been using our new Acts As Recommendable plugin on metrotwin.com and it’s been interesting to see how it’s performing in a real-world situation.

Bookmarks (places) are integral to Metrotwin, and a user can associate themselves with a bookmark by ‘Loving it’, saving it to their profile, or by stating they’ve been there.

So there was potentially a lot of information that could be collected about users’ preferences from their associations with bookmarks. That information could then be used to improve the overall experience, for example by recommending bookmarks to people and showing similar bookmarks - a great example of a practical application of Collective Intelligence.

The screenshot shows Acts As Recommendable in action - displaying a list of tailored recommendations on Metrotwin. What you’re seeing is basically ‘people who are associated with some of your bookmarks (AKA similar users) are also associated with the following bookmarks’.
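To make the ‘similar users’ idea concrete, here is a minimal Ruby sketch. It is not the plugin’s code: the method name `recommend_for`, the sample data, and the scoring (a plain count of shared bookmarks rather than a Pearson-weighted score) are all assumptions made for illustration.

```ruby
require 'set'

# Hypothetical helper, not part of Acts As Recommendable.
# associations maps a user id to the Set of bookmark ids that user is
# associated with. Each unseen bookmark is scored by how many bookmarks
# its owner shares with the target user (a simple overlap count).
def recommend_for(user, associations, limit = 5)
  mine = associations.fetch(user, Set.new)
  scores = Hash.new(0)

  associations.each do |other, theirs|
    next if other == user

    overlap = (mine & theirs).size
    next if overlap.zero? # not a similar user

    (theirs - mine).each { |bookmark| scores[bookmark] += overlap }
  end

  # Highest score first; ties broken by bookmark id for stable output.
  scores.sort_by { |bookmark, score| [-score, bookmark] }
        .first(limit)
        .map(&:first)
end

# Made-up sample data: user id => bookmark ids.
SAMPLE_ASSOCIATIONS = {
  1 => Set[10, 11, 12],
  2 => Set[10, 11, 13],
  3 => Set[12, 14],
  4 => Set[20]
}.freeze

p recommend_for(1, SAMPLE_ASSOCIATIONS) # => [13, 14]
```

User 2 shares two bookmarks with user 1, so bookmark 13 ranks above bookmark 14, which comes from user 3 (one shared bookmark). User 4 shares nothing and contributes no recommendations.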

Metrotwin Recommends is tailored to the specific individual, and where we don’t have any recommendation data for that user we show a generic list of the top 5 bookmarks.
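The fallback could look something like the sketch below. The method name `recommendations_or_top` is hypothetical, and ranking the generic list by number of associations is an assumption, since how the top 5 is chosen is not described here.

```ruby
# Hypothetical helper, not part of Acts As Recommendable.
# personal:           bookmark ids recommended for this user (may be empty)
# association_counts: bookmark id => number of users associated with it
#                     (assumed ranking criterion for the generic list)
def recommendations_or_top(personal, association_counts, limit = 5)
  return personal.first(limit) unless personal.empty?

  association_counts.sort_by { |bookmark, count| [-count, bookmark] }
                    .first(limit)
                    .map(&:first)
end

counts = { 1 => 3, 2 => 9, 3 => 5, 4 => 1, 5 => 7, 6 => 2 }

p recommendations_or_top([13, 14], counts) # => [13, 14]
p recommendations_or_top([], counts)       # => [2, 5, 3, 1, 6]
```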

We had to do a lot of tuning to Acts As Recommendable to make sure it would scale to the amount of data required. Most of it revolved around two things: the amount of memory consumed and the speed at which the dataset was generated. We found that ActiveRecord was too memory intensive to build the initial user/bookmark matrix (it would crash Ruby!) so we used raw SQL to build an array of integers. We then found Ruby too slow to perform the Pearson calculation needed - so I rewrote that in C, calling it from Ruby, which sped things up considerably.
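For reference, the Pearson calculation itself is small. The version below is plain Ruby, written only to show what is being computed; it is not the C code used in the plugin, and the method name `pearson` and the 0/1 input vectors (one entry per user, 1 meaning ‘associated with this bookmark’) are assumptions.

```ruby
# Illustrative Pearson correlation between two equal-length numeric arrays.
# Returns 0.0 when the inputs are empty, differ in length, or when either
# array has no variance (the coefficient is undefined in that case).
def pearson(xs, ys)
  n = xs.size
  return 0.0 if n.zero? || n != ys.size

  sum_x  = xs.sum.to_f
  sum_y  = ys.sum.to_f
  sum_xx = xs.sum { |x| x * x }
  sum_yy = ys.sum { |y| y * y }
  sum_xy = xs.zip(ys).sum { |x, y| x * y }

  numerator = sum_xy - (sum_x * sum_y / n)
  variance_product = (sum_xx - sum_x**2 / n) * (sum_yy - sum_y**2 / n)
  return 0.0 unless variance_product.positive?

  numerator / Math.sqrt(variance_product)
end

# Two bookmarks seen through four users (made-up data).
bookmark_a = [1, 1, 0, 0]
bookmark_b = [1, 1, 0, 1]
puts pearson(bookmark_a, bookmark_b)
```

Running this for every pair of bookmarks is quadratic in the number of bookmarks, which is why an inner loop like this one is worth moving out of interpreted Ruby.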

We can’t generate the recommendations on the fly - so we generate a very large similarity dataset of all the bookmarks offline, once a day.

This similarity matrix greatly reduces the number of calculations and SQL queries we have to make at run time. Without it, the whole process wouldn’t be viable. Each bookmark has a row in the dataset where it is compared to every other bookmark, and this row is stored in memcached (so your web servers can share the memory, without having to generate the dataset for every mongrel).
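The row-per-bookmark layout can be sketched as follows. A plain Hash stands in for memcached so the example runs on its own; the key format `similar_bookmarks/<id>`, the method names, and the use of Marshal for serialisation are all assumptions, not details of the plugin.

```ruby
# Offline step: build one row per bookmark, holding its similarity score
# against every other bookmark. The similarity function is passed in as a
# block so any measure (e.g. Pearson) can be plugged in.
def build_similarity_rows(bookmark_ids, &similarity)
  bookmark_ids.each_with_object({}) do |id, rows|
    rows[id] = bookmark_ids.reject { |other| other == id }
                           .map { |other| [other, similarity.call(id, other)] }
                           .to_h
  end
end

# Store each row under its own key so a lookup fetches only one row.
def store_rows(cache, rows)
  rows.each { |id, row| cache["similar_bookmarks/#{id}"] = Marshal.dump(row) }
end

# Run-time step: one cache read, then a sort of the cached row.
def similar_bookmarks(cache, id, limit = 5)
  raw = cache["similar_bookmarks/#{id}"]
  return [] if raw.nil?

  Marshal.load(raw).sort_by { |other, score| [-score, other] }
         .first(limit)
         .map(&:first)
end

# Made-up pairwise scores for three bookmarks.
scores = { [1, 2] => 0.9, [1, 3] => 0.2, [2, 3] => 0.5 }
cache = {} # stand-in for a memcached client

rows = build_similarity_rows([1, 2, 3]) { |a, b| scores[[a, b].sort] }
store_rows(cache, rows)

p similar_bookmarks(cache, 1) # => [2, 3]
p similar_bookmarks(cache, 3) # => [2, 1]
```

The expensive pairwise comparison happens once in `build_similarity_rows`; at request time a page only needs a single key lookup per bookmark.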

I’ve also been testing the plugin on a real set of users/movies, where the recommendations are perhaps clearer.
The list below shows the movies most similar to the film ‘Terminator (1984)’, as calculated by our algorithm.

  • Terminator 2: Judgment Day (1991)
  • Raiders of the Lost Ark (1981)
  • Empire Strikes Back, The (1980)
  • Alien (1979)
  • Aliens (1986)
  • True Lies (1994)
  • Jurassic Park (1993)
  • Indiana Jones and the Last Crusade (1989)
  • Die Hard (1988)
  • Star Trek: The Wrath of Khan (1982)

I think those results are pretty accurate. I’m also trying to get my hands on the Netflix Prize dataset, to see how the plugin responds to a much larger amount of data (and also to get some newer movies).

So this shows you don’t need the massive resources of a company like Amazon or Google to deliver accurate and tailored recommendations to people - and this plugin provides a production-tested solution that you can easily drop into an existing Rails application.