Building Your Own Recommendation Engine
This tutorial explains how to build a recommendation engine yourself using Ruby on Rails.
The main idea is to collect data about everything that happens on the site.
For example, for a video site, the data would be:
- Who uploaded a video?
- Who commented on a video?
- Which tags were created?
- Who visited the video? (also tracking anonymous visitors)
- Who favorited a video?
- Who rated a video?
- Which channels was the video assigned to?
- Text streams of title, description, tags, channels and comments are collected by a fulltext indexer, which assigns a weight to each of these data sources.
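To make the idea of weighting each text source concrete, here is a minimal sketch in plain Ruby. It is a toy stand-in for a real fulltext indexer, not a description of any particular one. The `SOURCE_WEIGHTS` values and the `tokenize` and `weighted_terms` helpers are assumptions made for illustration.

```ruby
# Hypothetical per-source weights; the numbers are illustrative only.
SOURCE_WEIGHTS = {
  title: 3.0,
  tags: 2.0,
  description: 1.0,
  channels: 1.0,
  comments: 0.5
}.freeze

# Split text into lowercase alphanumeric tokens.
def tokenize(text)
  text.to_s.downcase.scan(/[a-z0-9]+/)
end

# Build a term => weight hash for one video from its text sources.
# Each occurrence of a term adds the weight of the source it came from.
def weighted_terms(sources)
  terms = Hash.new(0.0)
  sources.each do |source, text|
    weight = SOURCE_WEIGHTS.fetch(source, 0.0)
    tokenize(text).each { |term| terms[term] += weight }
  end
  terms
end

video = {
  title: "Ruby on Rails tutorial",
  description: "Learn Rails step by step",
  tags: "ruby rails"
}
terms = weighted_terms(video)
# "rails" appears in the title, description and tags, so it scores highest.
```

Two videos could then be compared by how many heavily weighted terms they share. A production setup would more likely delegate this to a dedicated search engine.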
Typically, recommendations can be derived in the following ways:
- Find similarity by fulltext search on title
- Find similarity by fulltext search on description
- Find similarity by fulltext search on comments
- Find similarity by fulltext search on tags
- Other pages on which the same user has been active (for example, rating or commenting)
- Other pages with the same tags (weighted by "expressiveness" of tags).
- Other pages favorited by the users who favorited this page.
- Other pages rated by the users who rated this page (weighted)
- Other pages browsed by people who browsed this page.
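As an illustration of the last item, the following sketch finds other pages browsed by the visitors of a given page and weights each page by the number of shared visitors. It uses an in-memory array in place of a database table. The `VISITS` data and the `also_browsed` function are hypothetical names chosen for this example.

```ruby
# Hypothetical visit log as [visitor_id, page_id] pairs.
# Anonymous visitors are tracked under a generated id.
VISITS = [
  ["alice", 1], ["alice", 2], ["alice", 3],
  ["bob", 1], ["bob", 2],
  ["anon-42", 1], ["anon-42", 4],
  ["carol", 5]
].freeze

# Returns [page_id, weight] pairs for pages that visitors of `page_id`
# also browsed. The weight is the number of shared visitors.
def also_browsed(page_id, visits)
  visitors = visits.select { |_, pid| pid == page_id }.map(&:first).uniq
  weights = Hash.new(0.0)
  # `uniq` ensures repeat visits by the same visitor count only once.
  visits.uniq.each do |visitor, pid|
    next if pid == page_id || !visitors.include?(visitor)
    weights[pid] += 1.0
  end
  # Highest weight first; ties broken by page id for a stable order.
  weights.sort_by { |pid, weight| [-weight, pid] }
end

also_browsed(1, VISITS)
# => [[2, 2.0], [3, 1.0], [4, 1.0]]
```

In a Rails application the same result would more likely come from a grouped database query than from scanning an array.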
We would therefore write one function for each of the items above, each returning a list of (id, weight) tuples. Some functions only consider a limited number of pages (e.g. the last 50); others adjust the weight, e.g. by rating or by tag count (the more often a tag is used, the less expressive it is).
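Here is a sketch of one such function, for pages sharing tags. It assumes that a tag's expressiveness can be approximated as 1 divided by the number of pages carrying that tag, which is only one possible choice. The `PAGE_TAGS` data, the `similar_by_tags` name and the `limit` parameter are hypothetical.

```ruby
# Hypothetical tagging data: page_id => tags, ordered oldest to newest.
PAGE_TAGS = {
  1 => %w[ruby screencast],
  2 => %w[ruby video],
  3 => %w[screencast video],
  4 => %w[ruby],
  5 => %w[ruby screencast]
}.freeze

# Returns [page_id, weight] pairs for pages sharing tags with `page_id`.
# Each shared tag contributes 1 / (number of pages using that tag),
# so frequently used tags count for less.
def similar_by_tags(page_id, page_tags, limit: 50)
  tag_counts = Hash.new(0)
  page_tags.each_value { |tags| tags.uniq.each { |tag| tag_counts[tag] += 1 } }

  own_tags = page_tags.fetch(page_id, [])
  # Only look at the most recent `limit` pages (relies on insertion order).
  candidates = page_tags.reject { |pid, _| pid == page_id }.to_a.last(limit)

  candidates.map do |pid, tags|
    shared = tags & own_tags
    next if shared.empty?
    [pid, shared.sum { |tag| 1.0 / tag_counts[tag] }]
  end.compact
end

similar_by_tags(1, PAGE_TAGS)
# Page 5 shares both tags with page 1 and gets the highest weight;
# page 3 shares only the rarer "screencast" tag and still outranks
# pages 2 and 4, which share only the common "ruby" tag.
```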
All of these lists are then combined into a single list by summing the weights per page id and sorting by total weight. The whole process runs as a cron job and has to be re-run frequently to keep the recommendations up to date.
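The combining step can be sketched as follows. The `combine` function and the three sample lists are hypothetical. In practice each list would come from one of the similarity functions described above.

```ruby
# Merge any number of [page_id, weight] lists by summing the weights
# per page id, then sort by total weight, highest first.
def combine(*lists)
  totals = Hash.new(0.0)
  lists.each do |list|
    list.each { |page_id, weight| totals[page_id] += weight }
  end
  # Ties are broken by page id so the order is deterministic.
  totals.sort_by { |page_id, weight| [-weight, page_id] }
end

# Illustrative sample outputs of three similarity functions.
by_tags      = [[2, 0.25], [3, 0.5]]
by_favorites = [[3, 1.0], [4, 2.0]]
by_browsing  = [[2, 1.0], [4, 0.5]]

ranked = combine(by_tags, by_favorites, by_browsing)
# => [[4, 2.5], [3, 1.5], [2, 1.25]]
```

Plain summation treats every source as equally important. If one signal should count for more, its weights could be scaled before being passed to `combine`.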