
Recommender System Basics, Part 1

Martin Andrew Habich

Recommender Systems, also known as Recommendation Engines, are becoming more common with every passing day. As you are no doubt aware, there has been a relatively recent revival of data analysis techniques, spurred by the coming-of-age of "big data" and the associated machine learning techniques. As a result, the tool sets are improving at an exciting rate, and the number of able and willing developers with data science skills is increasing.

In the post below, I've used bold to indicate terms-of-art which are commonly found in the literature about Recommender Systems.

What is a Recommender System?

Recommender Systems seek to answer one simple question - given a user's past behavior, and the behaviors of other users, which items is that user likely to be interested in? In this context, users are often customers and items are often products or information. Some of the most well-known examples of Recommender Systems include Netflix's suggestions on what to watch next, Amazon's product recommendations, and automatic music suggestion services such as Pandora.

The primary purposes of these systems are to increase user engagement, improve user experience, and drive additional sales. While the figures vary dramatically, there has been consistent evidence that good recommendations are correlated with higher sales. The fact that Recommender Systems provide clear business value is one contributor to their popularity. Additionally, they are relatively easy to understand for both users and decision makers ("I get it...it's like Amazon!").

The Basic Function of a Recommender System

These systems require a user's feedback in order to generate recommendations - it's impossible to guess which items a brand new user might be interested in! To this end, any application making use of a Recommender System must gather feedback from its users. This feedback is often classified as either explicit or implicit.

Explicit Feedback is information which the user provides consciously. An example would be asking a user to fill out a profile, checking boxes to indicate the kinds of topics that interest them. Another common tactic is to ask users to rate individual items, either on a scale or with a simple "thumbs up" or "thumbs down".

Implicit Feedback is information about the user's interactions with the application. Most commonly, this will include the user's purchase history (or interaction history in the case of something like Pandora). Some more sophisticated systems may also include additional implicit metrics such as page views, clicks, and other forms of navigation through a site.
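
To make the distinction concrete, here is a minimal sketch of how an application might record both kinds of feedback. The FeedbackStore class, its method names, and the in-memory dictionaries are all hypothetical and chosen only for illustration; a real system would more likely persist this data in a database or event log.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class FeedbackStore:
    """Hypothetical in-memory store for explicit and implicit feedback."""

    # Explicit feedback: user -> item -> rating the user chose to give.
    ratings: dict = field(default_factory=lambda: defaultdict(dict))
    # Implicit feedback: user -> item -> number of observed interactions
    # (purchases, plays, page views, clicks, ...).
    interactions: dict = field(
        default_factory=lambda: defaultdict(lambda: defaultdict(int))
    )

    def rate(self, user, item, rating):
        """Record a rating the user consciously provided."""
        self.ratings[user][item] = rating

    def record_event(self, user, item, weight=1):
        """Record an interaction observed by the application."""
        self.interactions[user][item] += weight

    def has_history(self, user):
        """Return True if any feedback at all exists for this user."""
        return bool(self.ratings.get(user)) or bool(self.interactions.get(user))


store = FeedbackStore()
store.rate("alice", "Gladiator", 5)          # explicit: alice rated an item
store.record_event("bob", "Gladiator")       # implicit: bob viewed it twice
store.record_event("bob", "Gladiator")
```

The has_history check is where a brand new user shows up: with no explicit or implicit feedback recorded, there is nothing yet for a recommender to work from.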

Types of Recommendations

In the simplest view, there are two broad categories of algorithms for providing recommendations to users - Content-Based Filtering and Collaborative Filtering.

Content-Based Filtering algorithms attempt to recommend unpurchased items based on those items' similarity to items already purchased or rated by the user. In the example of Netflix, this would be like recommending "Violent Roman Political Dramas" because you watched Gladiator and rated it highly. Content-Based Filtering algorithms require a method for comparing the similarity of items to each other. This can be accomplished in any number of ways, from comparing metadata about the items (genre, publisher, price, etc.) to using sophisticated clustering techniques to produce machine-generated similarity weightings. Significantly, Content-Based Filtering does not necessarily require any information about other application users.
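
As a rough illustration of the metadata-comparison approach, the sketch below scores items by the overlap of their tags using Jaccard similarity. This is only one of many possible similarity measures, and the catalog, its tags, and the function names are made up for the example rather than taken from any real service.

```python
def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_similar(liked, catalog, top_n=3):
    """Rank items the user has not seen by their best tag overlap with a liked item."""
    liked_set = set(liked)
    scores = {}
    for candidate, tags in catalog.items():
        if candidate in liked_set:
            continue  # never recommend something the user already has
        scores[candidate] = max(
            (jaccard(tags, catalog[item]) for item in liked_set), default=0.0
        )
    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, score in ranked[:top_n] if score > 0]


# Hypothetical catalog: item -> set of descriptive metadata tags.
CATALOG = {
    "Gladiator": {"roman", "drama", "violent", "political"},
    "Spartacus": {"roman", "drama", "violent"},
    "Rome": {"roman", "drama", "political"},
    "Toy Story": {"animated", "family", "comedy"},
}

suggestions = recommend_similar(["Gladiator"], CATALOG)
```

Note that the function only looks at one user's liked items and the item metadata; no other users' data is involved.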

Collaborative Filtering algorithms, in contrast, attempt to find users with similar profiles and recommend items based on those similar users' histories. This is akin to Amazon's "Customers Who Bought This Item Also Bought..." suggestions. Collaborative Filtering is sometimes described as having other users "vote" on which items should be recommended to your current user. As expected, there are a variety of algorithms for determining what it means for users to be "similar", many of which are provided out of the box by state-of-the-art machine learning and recommendation libraries.
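
The "voting" idea can be sketched in a few lines. The example below is a simple user-based approach, assuming explicit ratings and cosine similarity as the measure of how alike two users are; the ratings data and function names are hypothetical, and production libraries typically use more refined techniques.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two users' rating dictionaries."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0  # no items in common, so no evidence of similarity
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(user, ratings, top_n=3):
    """Let similar users vote for items the target user has not rated."""
    mine = ratings[user]
    votes = {}
    for other, theirs in ratings.items():
        if other == user:
            continue
        sim = cosine(mine, theirs)
        if sim <= 0:
            continue  # dissimilar users get no vote
        for item, rating in theirs.items():
            if item not in mine:
                # Each vote is weighted by how similar the voter is.
                votes[item] = votes.get(item, 0.0) + sim * rating
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical ratings: user -> item -> rating on a 1-5 scale.
RATINGS = {
    "alice": {"A": 5, "B": 4},
    "bob": {"A": 5, "B": 5, "C": 4},
    "carol": {"D": 5, "E": 4},
}

for_alice = recommend("alice", RATINGS)
for_carol = recommend("carol", RATINGS)
```

Here alice and bob rated the same items similarly, so bob's rating of item C counts as a vote for recommending C to alice. carol shares no rated items with anyone, so no one is similar enough to vote on her behalf and her list comes back empty.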

Sounds Easy! What's the Catch?

There are many problems which must be addressed when implementing a Recommender System, including technology choices, architectural considerations, testing, security, and most importantly the quality of the results. Check out part 2 of this post for the basics of these topics!

About the Author

Martin Andrew Habich, Delivery Development Manager