Exploratory Data Analysis (EDA) will be uploaded to this repository. Libraries such as Pandas, Matplotlib, Seaborn, and Plotly will be used for data analysis.
Updated Apr 2, 2024 - Jupyter Notebook
Data analytics queries: filter movies in the drama genre, find the most-rated movies, compute the number of users and the average rating for each age range, etc. Visualizes the count and age of moviegoers with the Matplotlib library and shows movies, age ranges, and average ratings.
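The queries listed above could be sketched in pandas roughly as follows. This is only an illustration, not the repository's code: the three tiny tables and their column names (`movie_id`, `genres`, `age`, `rating`, and so on) are assumptions that loosely mirror the MovieLens layout.

```python
import pandas as pd

# Hypothetical miniature tables; names and values are illustrative assumptions.
movies = pd.DataFrame({
    "movie_id": [1, 2, 3],
    "title": ["A", "B", "C"],
    "genres": ["Drama|Romance", "Comedy", "Drama"],
})
users = pd.DataFrame({"user_id": [10, 11, 12], "age": [18, 25, 25]})
ratings = pd.DataFrame({
    "user_id": [10, 11, 12, 10, 11],
    "movie_id": [1, 1, 2, 3, 3],
    "rating": [4, 5, 3, 2, 4],
})

# Filter movies whose pipe-separated genre string mentions "Drama".
drama = movies[movies["genres"].str.contains("Drama")]

# Count ratings per movie, most-rated first.
most_rated = ratings.groupby("movie_id").size().sort_values(ascending=False)

# Number of distinct users and average rating for each age value.
by_age = (
    ratings.merge(users, on="user_id")
    .groupby("age")
    .agg(num_users=("user_id", "nunique"), avg_rating=("rating", "mean"))
)
```

From here, `by_age.plot(kind="bar")` would be one way to get the Matplotlib chart, assuming Matplotlib is installed.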
This is one of my final projects for the HarvardX Data Science Professional Certificate Program. As the title suggests, it is on the GroupLens database colloquially known as MovieLens. The goal of the project is to predict ratings with an RMSE below 0.86490. I was able to surpass the goal with 3 different models. Happy reading!
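For readers unfamiliar with the metric, RMSE (root mean squared error) is the square root of the mean squared difference between predicted and actual ratings. A minimal sketch follows; the function name `rmse` and the sample values are illustrative assumptions and are not taken from the project.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating sequences."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Illustrative values only: one exact prediction and one that is off by 2.
example = rmse([3.0, 4.0], [3.0, 2.0])
```

A lower value means the predictions sit closer to the true ratings, which is why the project target is stated as an upper bound.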
This is a project made as part of my data science master's program to analyze and draw inferences from MovieLens data.
Data cleaning, pre-processing, and analytics on a million movies using Spark and Scala.
MovieLens dataset analysis using Hadoop and PySpark.
Data analysis and movie recommendation on the OpenMovie dataset using the shell, Python, the cosine similarity algorithm, Apache PySpark, and Apache Hadoop.
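Cosine similarity, mentioned above, scores two rating vectors by the cosine of the angle between them. The sketch below is a plain-Python illustration and not that repository's implementation; the function name and the zero-vector convention are assumptions.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Assumed convention: a zero vector is treated as similar to nothing.
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

Two users whose ratings are proportional (for example `[1, 2]` and `[2, 4]`) score close to 1, while users with no overlapping rated items score 0.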
A recommendation algorithm capable of accurately predicting how a user will rate a movie they have not yet viewed, based on their historical preferences. The models and EDA are based on the MovieLens 1M dataset.
This repository contains an analysis of IMDb data from multiple sources, covering movies, cast, box office revenues, movie brands, and franchises.
Implementation of Spotify's Generalist-Specialist score on the MovieLens dataset.
Data analysis on Big Data. Used various databases from 1M to 100M, including the MovieLens dataset, to perform analysis. Covers basic and advanced MapReduce using MongoDB.
Data analysis on Big Data. Used various databases from 1M to 100M, including the MovieLens dataset, to perform analysis. Covers basic and advanced MapReduce using Hadoop.
Project to determine the ratings for a movie using each of the Spark and Hadoop ecosystems.
Contains my custom implementation of various machine learning models and analysis.
Building a movie recommender system with factorization machines on Amazon SageMaker.
Analysis of MovieLens Dataset in Python
Created visualizations of the MovieLens data set using matrix factorization http://www.yisongyue.com/courses/cs155/2018_winter/assignments/project2.pdf
Movie recommendation system based on Collaborative filtering using Apache Spark