To prove that you don’t need “Netflix (or Amazon) scale” to achieve the benefits of using recommender systems, we built our own recommender at Cogent. Here’s a run-down of how we did it. Warning: it gets pretty technical.
The data set
We used a MovieLens dataset, made available by GroupLens, containing 20 million ratings of 27k movies by 138k users, plus a set of 11 million movie tag entries covering more than a thousand descriptive tags. After experimenting with the data and a range of techniques, we built three individual recommendation models and a hybrid ensemble model fed by their outputs, to demonstrate the different recommendation approaches.
The first model: matrix factorisation
Our first model uses matrix factorisation to develop predictions. Matrix factorisation is a collaborative filtering technique that takes the interactions between users and movies (in our case, the ratings) and develops a condensed representation by learning implicit associations between them. These associations can be thought of as connections: for example, a torch and batteries would be more strongly connected by user behaviour than a torch and dog food. The condensed representation contains a set of weightings that mathematically quantify the implicit associations, and these weightings can be used to predict how a user would rate a movie. The predictions can be compared to actual ratings (from a reserved test data set) to measure the accuracy of the model. They can also be used to recommend movies that a user hasn’t yet rated.
We constructed our matrix factorisation model in Python using Keras, a high-level neural networks API, wrapping TensorFlow, Google’s open source machine learning framework. We trained the model on the training data split using a GPU-equipped AWS EC2 instance for ~30 hours.
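To make the idea concrete, here is a minimal pure-Python sketch of matrix factorisation trained with stochastic gradient descent. It is not the Keras model described above. The toy ratings, the function names `train_mf` and `predict`, and hyperparameters such as `n_factors`, `lr` and `reg` are all assumptions invented for illustration.

```python
import random


def train_mf(ratings, n_users, n_movies, n_factors=2, lr=0.02, reg=0.01,
             epochs=500, seed=0):
    """Learn user and movie factor vectors from (user, movie, rating) triples."""
    rng = random.Random(seed)
    # Small random starting weights for every user and every movie.
    P = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_movies)]
    # The global mean rating acts as a baseline prediction.
    mu = sum(r for _, _, r in ratings) / len(ratings)
    for _ in range(epochs):
        for u, m, r in ratings:
            pred = mu + sum(pu * qm for pu, qm in zip(P[u], Q[m]))
            err = r - pred
            for k in range(n_factors):
                pu, qm = P[u][k], Q[m][k]
                # Gradient step with L2 regularisation on both factor vectors.
                P[u][k] += lr * (err * qm - reg * pu)
                Q[m][k] += lr * (err * pu - reg * qm)
    return mu, P, Q


def predict(model, user, movie):
    """Predicted rating: baseline plus the dot product of the two factor vectors."""
    mu, P, Q = model
    return mu + sum(pu * qm for pu, qm in zip(P[user], Q[movie]))


# Made-up ratings: users 0 and 1 like movies 0 and 1, users 2 and 3 like movies 2 and 3.
ratings = [
    (0, 0, 5.0), (0, 1, 4.5), (0, 2, 1.0),
    (1, 0, 4.5), (1, 1, 5.0), (1, 3, 1.5),
    (2, 2, 5.0), (2, 3, 4.5), (2, 0, 1.0),
    (3, 2, 4.5), (3, 3, 5.0), (3, 1, 1.0),
]
model = train_mf(ratings, n_users=4, n_movies=4)
# Predict a rating for a (user, movie) pair that is absent from the training data.
print(round(predict(model, 0, 3), 2))
```

The learned vectors `P` and `Q` play the role of the condensed representation: a prediction for any user and movie pair, rated or not, is the dot product of the two vectors plus a baseline.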
The second model: vector space
The second model uses the movie tags (things like “animation” or “aliens”) to build a vector space model. This content filtering approach generates a mathematical representation of each film that can be used to calculate the distance, or similarity, between films. If someone likes a bunch of horror films, this model can find other thematically similar films to recommend. Since there were over a thousand tags in the dataset, we also used principal component analysis (PCA) to reduce the dimensionality of the problem. This step occurs first and involves squashing the 1000+ tags into a reduced mathematical space (we chose 30 components) that retains the vast bulk of the information but is much easier to perform computations on.
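The similarity lookup can be sketched in a few lines of plain Python. This example assumes cosine similarity as the measure (the text above does not say which distance was used) and leaves out the PCA step for brevity, so the vectors here are raw tag scores. The movie names, tag scores and the helper `most_similar` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(title, movie_vectors, top_n=2):
    """Return the top_n other movies whose tag vectors are closest to the given title."""
    target = movie_vectors[title]
    scored = [
        (cosine_similarity(target, vec), other)
        for other, vec in movie_vectors.items()
        if other != title
    ]
    scored.sort(reverse=True)
    return [other for _, other in scored[:top_n]]


# Hypothetical tag relevance scores in the order [animation, aliens, horror].
movie_vectors = {
    "Movie A": [0.9, 0.1, 0.0],
    "Movie B": [0.8, 0.3, 0.1],
    "Movie C": [0.0, 0.7, 0.9],
    "Movie D": [0.1, 0.9, 0.6],
}
print(most_similar("Movie C", movie_vectors, top_n=1))
```

With a real dataset, the same lookup would run on the PCA-reduced vectors rather than the raw tag scores.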
The third model: nearest user
The third model takes a set of movies and uses the existing dataset to find the user whose preferences best match that set. Then the model simply recommends other movies that user rated highly.
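A minimal sketch of this approach follows. How "best match" was measured is not stated above, so this example assumes a simple count of overlapping highly rated movies. The ratings, the `like_threshold` parameter and both function names are invented for illustration.

```python
def nearest_user(liked_movies, user_ratings, like_threshold=4.0):
    """Return the user whose highly rated movies overlap most with liked_movies."""
    liked = set(liked_movies)

    def overlap(user):
        high = {m for m, r in user_ratings[user].items() if r >= like_threshold}
        return len(liked & high)

    return max(user_ratings, key=overlap)


def recommend_from_nearest(liked_movies, user_ratings, like_threshold=4.0):
    """Recommend the nearest user's other highly rated movies, best first."""
    user = nearest_user(liked_movies, user_ratings, like_threshold)
    liked = set(liked_movies)
    picks = [
        (rating, movie)
        for movie, rating in user_ratings[user].items()
        if rating >= like_threshold and movie not in liked
    ]
    picks.sort(reverse=True)
    return [movie for _, movie in picks]


# Made-up ratings for three hypothetical users.
user_ratings = {
    "u1": {"Alien": 5.0, "The Thing": 4.5, "Toy Story": 2.0, "Predator": 4.0},
    "u2": {"Toy Story": 5.0, "Shrek": 4.5, "Alien": 2.5},
    "u3": {"Alien": 4.0, "Shrek": 3.0},
}
print(recommend_from_nearest(["Alien", "The Thing"], user_ratings))
```

Because every recommendation comes from a single user's ratings, the output is only as good as that one user's taste.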
These three models have different strengths. We found the matrix factorisation model to be generally the most accurate at recommending films that a user was likely to rate highly, given some set of inputs. However, the model was prone to bias: some obscure movies that had only a small number of top ratings would appear too frequently in the recommendations. Also, films that were highly rated in general were often recommended even if they bore no resemblance to the provided inputs.
The vector space model successfully finds films that are similar to the inputs, but doesn’t discriminate well on movie quality. The nearest user model tended to suffer from similar problems, since it effectively reduces the sample size to one.
The best results were achieved by our ensemble model. This model pools the films recommended by the vector space model and the nearest user model, then selects from that pool the films that the matrix factorisation model predicts will be rated most highly. This largely mitigated the problems of bias while still producing higher-quality recommendations.
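The ensemble step amounts to a filter followed by a ranking, which can be sketched as below. The function name `ensemble_recommend`, the `top_n` parameter and the predicted ratings are assumptions for illustration. In a real pipeline `predicted_rating` would call the matrix factorisation model rather than look up a dictionary.

```python
def ensemble_recommend(vector_space_picks, nearest_user_picks, predicted_rating, top_n=3):
    """Pool candidates from two models, then rank them by predicted rating."""
    # Merge both candidate lists, dropping duplicates while keeping first-seen order.
    candidates = list(dict.fromkeys(list(vector_space_picks) + list(nearest_user_picks)))
    # Highest predicted rating first.
    ranked = sorted(candidates, key=predicted_rating, reverse=True)
    return ranked[:top_n]


# Hypothetical predicted ratings standing in for the matrix factorisation model.
predicted = {"Movie A": 3.1, "Movie B": 4.6, "Movie C": 4.2, "Movie D": 2.0}
picks = ensemble_recommend(
    ["Movie A", "Movie B"],
    ["Movie B", "Movie C", "Movie D"],
    predicted.get,
    top_n=2,
)
print(picks)
```

Restricting the ranking to the pooled candidates is what keeps generally popular but unrelated films out of the final list.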
You might be keen to try out our prototype: Cogent AI Movie Match. If you’re interested in anything you’ve seen here and have questions about how to apply AI or machine learning to features or products effectively, then get in touch.
Find out more about AI at Cogent here.
To measure accuracy, we used mean absolute error (MAE): specifically, the distance between the predicted and actual rating on a 5-star scale. Our model achieved an MAE of 0.69, meaning its guess at how a user would rate a movie was off by 0.69 stars on average.
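For reference, MAE is simply the average absolute gap between predicted and actual ratings. The sketch below uses made-up ratings, not figures from our test set.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between paired actual and predicted ratings."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Invented example ratings on a 5-star scale.
actual = [4.0, 3.0, 5.0, 2.0]
predicted = [3.5, 3.5, 4.0, 2.0]
mae = mean_absolute_error(actual, predicted)
print(mae)  # errors of 0.5, 0.5, 1.0 and 0.0 average to 0.5
```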