Recommendation System (Content Based)

In this Python project, I build a movie recommendation system based on user interactions, for example:

previously watched movies
user search queries

I use the movie data CSV from https://query.data.world/s/uikepcpffyo2nhig52xxeevdialfl7

I followed these steps to implement the project.

Loaded the data from the URL above
Selected the relevant columns from the dataset
Prepared the text content so that filtering algorithms can be applied to it
Extracted keywords for each record
Created a bag of words
Dropped all other irrelevant columns
Built a count matrix with CountVectorizer()
Generated the cosine similarity matrix
Implemented the recommender function
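
The text preparation steps can be sketched roughly as below. This is a minimal illustration, not the project's actual code: the column names (Title, Genre, Director, Plot), the sample rows, and the helpers plot_keywords and bag_of_words are all hypothetical, and a tiny hand-written stop-word filter stands in for the Rake keyword extraction so the example runs without extra downloads.

```python
import csv
import io

# Hypothetical sample standing in for the real CSV; column names are assumptions.
SAMPLE_CSV = """Title,Genre,Director,Plot
Space Quest,"Sci-Fi, Adventure",Jane Doe,A captain leads the crew to a distant planet
Love in Paris,"Romance, Drama",John Roe,Two strangers meet in the city of Paris
"""

# Tiny stand-in stop-word list; the project itself uses rake_nltk for keywords.
STOP_WORDS = {"a", "the", "to", "in", "of", "two"}


def plot_keywords(plot):
    """Lowercase the plot and drop stop words (a crude stand-in for Rake)."""
    return [word for word in plot.lower().split() if word not in STOP_WORDS]


def bag_of_words(row):
    """Merge the selected columns of one record into a single string."""
    genres = [genre.strip().lower() for genre in row["Genre"].split(",")]
    # Join the director's name into one token so first names do not match alone.
    director = row["Director"].lower().replace(" ", "")
    return " ".join(genres + [director] + plot_keywords(row["Plot"]))


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
# Keep only the title and its bag of words; every other column is dropped.
bags = {row["Title"]: bag_of_words(row) for row in rows}

for title, bag in bags.items():
    print(title, "->", bag)
```

Each record ends up as one string of lowercase tokens, which is the shape CountVectorizer() expects as input.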

Finally, I return the top 10 movie suggestions for a user search query (i.e. a movie title).
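
The vectorizing, similarity, and recommender steps might look something like the sketch below. It assumes scikit-learn is installed and uses a hypothetical toy dictionary of titles and bags of words instead of the real dataset; the function name recommend and its top_n parameter are illustrative, not taken from the project.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical toy data: title -> bag of words (not from the real dataset).
movies = {
    "Space Quest": "space alien adventure scifi captain",
    "Star Voyage": "space scifi adventure crew captain",
    "Love in Paris": "romance paris love drama",
    "Paris Nights": "romance paris drama night",
}

titles = list(movies)

# Turn each bag of words into a row of token counts.
count_matrix = CountVectorizer().fit_transform(list(movies.values()))

# similarity[i, j] is the cosine similarity between movie i and movie j.
similarity = cosine_similarity(count_matrix)


def recommend(title, top_n=10):
    """Return up to top_n titles most similar to `title`, best match first."""
    idx = titles.index(title)
    # Rank every other movie by descending similarity to the query movie.
    ranked = sorted(
        (i for i in range(len(titles)) if i != idx),
        key=lambda i: similarity[idx, i],
        reverse=True,
    )
    return [titles[i] for i in ranked[:top_n]]


print(recommend("Space Quest", top_n=2))
```

With only four toy movies the list is shorter than 10, but the same call with the default top_n would return the top 10 on a full dataset. A title that is not in the data raises ValueError here; a real implementation may want to handle that case.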

Installation

I used rake_nltk for this project. Install it from a notebook cell:

  !pip install rake_nltk

🏆 Lessons Learned

Cosine Similarity
Document Term Frequency
Bag of Words
Basic text pre-processing for NLP
Basic usage of rake_nltk and Rake() class
CountVectorizer() to convert Bag of words to numeric data
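
To make the cosine similarity idea concrete, here is a small from-scratch sketch using only the standard library. It is an illustration of the formula on bag-of-words counts, not the code used in the project; the function name cosine_similarity_bow is hypothetical.

```python
import math
from collections import Counter


def cosine_similarity_bow(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts."""
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    # Dot product only needs the words that appear in both texts.
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty text has no direction, so treat it as dissimilar.
    return dot / (norm_a * norm_b)


print(cosine_similarity_bow("space alien adventure", "space adventure crew"))
print(cosine_similarity_bow("space alien adventure", "romance paris drama"))
```

The score depends on the overlap of word counts relative to the length of each vector, so two texts with no words in common score 0 and two identical texts score 1.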

Demo

Try it on my profile