YouTube Popular Videos Analysis 📊

This project tries to extract insights and patterns of YouTube's current most popular videos of a specific region (Country; here INDIA). Over 20 important attributes of each video is analyzed using Pandas, NumPy, etc. and insights are presented in vizualizations using Matplotlib and Seaborn.

This project starts with understanding the resources, methods, request parameters, structure of requested data, etc. for YouTube Data API v3. Then, for robust analysis, there is a need of Database to store data, collected at different timestamps over a long period (like 1 month or more). Here, I have explored the opportunity of using a Cloud Database, levaraging the benefits of Google Cloud Platform ( using its free-tier/always-free products only!! ). I utilized their Compute Engine as virtual machine to install and set-up the database (NOTE: The Cloud SQL of GCP is not included in the free tier).

This analysis may help anyone strategize their YouTube journey by understanding user preferences, current trends, improvement scopes etc.

Outline

Some Vizualizations
Required Pyhton libraries and modules
Setting-up Compute Engine on Google Cloud Platform
Setting-up PostgreSQL on Compute Engine
Creating Firewall rule for VM
Enabling YouTube Data API v3
Understanding YouTube Data API v3
Storing all credentials as Environment Variables
Data Transformation for efficient memory usage
Method used to efficiently load data into Database
Interacting with Database
Closing Database Connections
Transformed/Generated Columns
Inference, Hypothesis, Validation from Analysis
Abbreviations Used
Acknowledgement

Some Vizualizations

Name		Name	Last commit message	Last commit date
Latest commit History 40 Commits
.env_example		.env_example
.gitignore		.gitignore
README.md		README.md
YouTube_Popular_Videos_Analysis_1.ipynb		YouTube_Popular_Videos_Analysis_1.ipynb
YouTube_Popular_Videos_Analysis_2.ipynb		YouTube_Popular_Videos_Analysis_2.ipynb
benchmark-1.png		benchmark-1.png
correlation.png		correlation.png
data_collection.py		data_collection.py
duration_tag_counts.png		duration_tag_counts.png
peak_hour.png		peak_hour.png
popular_vid_type.png		popular_vid_type.png
youtube.png		youtube.png

Short Form	Meaning
doc	Documentation
enum	Enumerate (Categorical)
GCP	Google Cloud Platform
Shorts	YouTube Shorts
Stats	Statistics
VM	Virtual Machine
UTC	Coordinated Universal Time
`zxx`	No linguistic content, Not applicable

AniketMondal/DA_YouTube_Popular_Videos

Folders and files

Latest commit

History

Repository files navigation

YouTube Popular Videos Analysis 📊

Outline

Some Vizualizations

Required Pyhton libraries and modules

Setting-up Compute Engine on Google Cloud Platform

Setting-up PostgreSQL on Compute Engine

Creating Firewall rule for VM

Enabling YouTube Data API v3

Understanding YouTube Data API v3

Storing all credentials as Environment Variables

Data Transformation for efficient memory usage

Method used to efficiently load data into Database

Interacting with Database

Closing Database Connections

Transformed/Generated Columns

1. Rank

2. (Title/Audio)_Language_Name

3. Topics

4. Entry_Timestamp

5. video_type

6. duration_tag

7. local_publish_time

Inference, Hypothesis, Validation from Analysis

Abbreviations Used

Acknowledgement

About

Topics

Resources

Stars

Watchers

Forks

Languages