The Yelp.me project provides personalized venue and service recommendations to help people break out of their routines and explore new options. People tend to revisit the same venues and miss the unique experiences available around them. Yelp.me addresses this with tailored suggestions, encouraging users to "Discover More 💟".
Dataset
The project uses the Yelp dataset, sourced directly from Yelp's official website (Yelp Dataset). It includes six JSON files with the following data:
- Business Table: 150,000 business records, 14 columns
- Review Table: 6,500,000 records, 9 columns
- Tips Table: 1,200,000 records, 4 columns
- Check-in Table: 150,000 records, 3 columns
- User Table: 2,000,000 records, 22 columns
- Photo Table: 200,000 records, 5 columns
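The Yelp dataset files are newline-delimited JSON (one object per line), so tables of this size can be streamed record by record instead of being parsed in one pass. The following is a minimal sketch of that idea using only the standard library. The helper names `iter_records` and `preview` are hypothetical and are not taken from the project code.

```python
import json
from itertools import islice
from pathlib import Path
from typing import Dict, Iterator, List


def iter_records(path: str) -> Iterator[Dict]:
    """Yield one dict per non-empty line of a newline-delimited JSON file."""
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines instead of failing on them
                yield json.loads(line)


def preview(path: str, n: int = 5) -> List[Dict]:
    """Return the first n records without reading the rest of the file."""
    return list(islice(iter_records(path), n))
```

Calling `preview` on the business file, for example, is a quick way to inspect its columns before loading the full table into a DataFrame in Colab.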
Project Purpose and Functionality
The goal of Yelp.me is to create a recommendation system that provides users with venue suggestions tailored to their preferences. In addition to offering personalized recommendations, the project employs machine learning techniques to predict the likelihood of a user enjoying a specific restaurant.
Key functionalities include:
- User and Venue Segmentation: Users and venues are each segmented into 10 distinct groups.
- NLP-Based Analysis: Natural Language Processing (NLP) is applied to reviews and tips to categorize venues, sort reviews by relevance, and gauge user sentiment.
- Recommendation Environment: Yelp.me fosters an environment where users can explore new venues and services.
This system ultimately aims to help users discover new experiences aligned with their tastes.
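To make the idea of preference-based suggestions concrete, here is a minimal sketch of user-based collaborative filtering: find users with similar star ratings, then rank venues the target user has not yet rated by similarity-weighted stars. This illustrates the general technique only. The data layout, the function names, and the choice of cosine similarity are assumptions, not the project's actual implementation.

```python
from math import sqrt
from typing import Dict, List

# Assumed layout: user_id -> {business_id: stars}
Ratings = Dict[str, Dict[str, float]]


def cosine_similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity of two rating vectors; unrated venues count as 0."""
    dot = sum(a[v] * b[v] for v in set(a) & set(b))
    norm_a = sqrt(sum(x * x for x in a.values()))
    norm_b = sqrt(sum(x * x for x in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(ratings: Ratings, user: str, top_n: int = 3) -> List[str]:
    """Rank venues the user has not rated by similarity-weighted stars."""
    seen = ratings[user]
    totals: Dict[str, float] = {}
    weights: Dict[str, float] = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(seen, other_ratings)
        if sim <= 0:
            continue  # ignore users with no overlap in rated venues
        for venue, stars in other_ratings.items():
            if venue not in seen:
                totals[venue] = totals.get(venue, 0.0) + sim * stars
                weights[venue] = weights.get(venue, 0.0) + sim
    predicted = {v: totals[v] / weights[v] for v in totals}
    return sorted(predicted, key=predicted.get, reverse=True)[:top_n]
```

A pure-Python version like this is only practical on a small sample. At the scale of the full review table, a sparse-matrix approach would be needed.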
Tools
- Project Management: Project tracking and management were handled in Notion.
- Data Storage and Sharing: Data was stored and shared via Google Drive.
- Python Development Environment: All Python code was developed on Google Colab.
- Web Application: The web application was built using Streamlit.
- Presentation Creation: The project presentation was created using Canva with AI support.
- Problem Solving and NLP: ChatGPT was used for assistance in problem-solving and Natural Language Processing (NLP) tasks.
Methods
- Business and User Segmentation: Businesses and users are each segmented into 10 distinct groups.
- Venue Categorization with NLP: NLP-based categorization of venues using ChatGPT assistance.
- Hybrid Sorting Score for Businesses: Businesses are sorted using a hybrid score that combines Bayesian and weighted sorting methods.
- Business Review Sorting: Organizing business reviews based on relevance and importance.
- User-Based Recommendation System: A recommendation system tailored to individual user preferences.
- Machine Learning (XGBoost Model): Utilization of the XGBoost model for predictive analytics.
- Sentiment Analysis on Tips: Sentiment analysis of comments in the tips dataset to gauge user sentiment.
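The hybrid sorting score listed above can be illustrated with a short sketch. The source does not give the formulas, so everything here is an assumption: the Bayesian part is a Bayesian average that shrinks a venue's rating toward the global mean when it has few reviews, the weighted part blends the scaled rating with review volume, and all weights (`prior_weight`, `w_bayes`, `w_weighted`, `w_rating`, `w_volume`) are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Business:
    name: str
    stars: float        # average star rating, 1 to 5
    review_count: int


def bayesian_average(stars: float, review_count: int,
                     prior_mean: float, prior_weight: float) -> float:
    """Shrink a venue's rating toward the global mean when reviews are few."""
    return (review_count * stars + prior_weight * prior_mean) / (review_count + prior_weight)


def weighted_score(stars: float, review_count: int, max_reviews: int,
                   w_rating: float = 0.7, w_volume: float = 0.3) -> float:
    """Blend rating and review volume, both scaled to the 0..1 range."""
    rating_scaled = (stars - 1) / 4
    volume_scaled = review_count / max_reviews if max_reviews else 0.0
    return w_rating * rating_scaled + w_volume * volume_scaled


def hybrid_rank(businesses: List[Business], prior_weight: float = 50,
                w_bayes: float = 0.6, w_weighted: float = 0.4) -> List[Business]:
    """Sort businesses by a weighted mix of the two scores, best first."""
    prior_mean = sum(b.stars for b in businesses) / len(businesses)
    max_reviews = max(b.review_count for b in businesses)

    def score(b: Business) -> float:
        bayes = bayesian_average(b.stars, b.review_count, prior_mean, prior_weight)
        bayes_scaled = (bayes - 1) / 4  # map the 1..5 rating to 0..1
        return (w_bayes * bayes_scaled
                + w_weighted * weighted_score(b.stars, b.review_count, max_reviews))

    return sorted(businesses, key=score, reverse=True)
```

The point of combining the two is that a venue with a perfect rating from two reviews should not outrank a well-reviewed venue with hundreds of ratings, which a plain sort by stars would allow.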
Team
Aybüke Çilingir
Furkan Karakuz
Güldeniz Güzelay