🎓 ML Learning Hub - Complete Notebooks Portfolio

A comprehensive, hands-on machine learning journey demonstrating fundamental concepts through practical implementations. Built as a structured portfolio showcasing progression from basic ML concepts to independent problem-solving.


📖 Project Overview

This repository contains three progressively advanced machine learning notebook series, each building on previous concepts while demonstrating independent implementation skills. Written in first-person with reflective learning notes to show genuine understanding, not just code copying.

Created by: [Mani Chelluri]
Location: Secunderabad, India
Tech Stack: Python, Scikit-Learn, TensorFlow, Pandas, NumPy, Matplotlib
Environment: Google Colab


📚 Notebooks Structure

Notebook 1: Classification Fundamentals

Location: 01_Classification/

What it covers:

  • Introduction to classification problems
  • Basic ML workflow: data → model → predictions
  • Binary and multi-class classification
  • Model evaluation metrics (accuracy, precision, recall)

Key Learning: Understanding what classification means and how to structure a basic ML problem.
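The metrics listed above can all be computed with scikit-learn's metrics module. A minimal sketch on made-up labels (the arrays here are illustrative, not from the notebook):

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Hypothetical true labels and model predictions for a binary problem
y_true = [0, 1, 1, 0, 1, 1, 0, 0]
y_pred = [0, 1, 0, 0, 1, 1, 0, 1]

acc = accuracy_score(y_true, y_pred)    # fraction of predictions that are correct
prec = precision_score(y_true, y_pred)  # of predicted positives, how many are right
rec = recall_score(y_true, y_pred)      # of actual positives, how many were found

print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f}")
```

With these toy arrays, all three scores happen to be 0.75 (6/8 correct overall, 3/4 on each positive-class view), which is a handy reminder that the three metrics answer different questions even when their values coincide.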


Notebook 2: TensorFlow Regression Deep Dive

Location: 02_TF_Regression/

What it covers:

  • Regression vs classification
  • Building neural networks with TensorFlow/Keras
  • Loss functions and optimizers for continuous predictions
  • Model training, validation, and testing

Key Learning: Transitioning from traditional ML to deep learning frameworks for regression tasks.
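A minimal Keras regression model in the spirit of this notebook. The synthetic data, layer sizes, and optimizer here are illustrative choices, not the notebook's actual code:

```python
import numpy as np
import tensorflow as tf

# Synthetic regression data: y = 3x + 2 plus a little noise
rng = np.random.default_rng(42)
X = rng.uniform(-1, 1, size=(200, 1)).astype("float32")
y = (3 * X + 2 + rng.normal(0, 0.05, size=(200, 1))).astype("float32")

# A small dense network with a linear output unit for a continuous target
model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1),  # no activation: regression output
])
model.compile(optimizer="adam", loss="mse", metrics=["mae"])
model.fit(X, y, epochs=200, verbose=0)

loss, mae = model.evaluate(X, y, verbose=0)
print(f"MAE after training: {mae:.3f}")
```

The key contrast with classification is in the last layer (one linear unit instead of softmax) and the loss (MSE instead of cross-entropy); the fit/evaluate loop is otherwise the same.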


Notebook 3: Scikit-Learn Complete ML Workflow ⭐

Location: 03_ScikitLearn_Workflow/

This is the most comprehensive section, demonstrating both theoretical understanding and practical implementation.

Part A: Theory - What Is Machine Learning?

📄 File: 03-Scikit-Learn-ML-Workflow.ipynb

What I learned:

  • Machine Learning Definition: Building models with tunable parameters that learn patterns from data and generalize to unseen examples (not hard-coding rules)
  • Supervised vs Unsupervised Learning: Do we have labeled training data?
  • Classification vs Regression: Discrete labels (classification) vs continuous values (regression)
  • Scikit-Learn API Consistency: The fit() → predict() pattern that works across all algorithms
  • Complete ML Pipeline: Load data → Split → Train → Evaluate → Iterate
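That fit() → predict() consistency can be seen by training two very different estimators with identical code. The two estimators chosen here are illustrative; any scikit-learn classifier would slot in the same way:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# The same three lines work for any scikit-learn estimator
results = {}
for model in (LogisticRegression(max_iter=200), KNeighborsClassifier()):
    model.fit(X_train, y_train)                              # learn from training data
    results[type(model).__name__] = model.score(X_test, y_test)  # held-out accuracy

print(results)
```

Swapping algorithms changes only the constructor call; everything downstream (training, scoring, prediction) stays identical.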

Source: Adapted from Jake VanderPlas' Python Data Science Handbook with my own learning annotations and reflections.

Why this matters: This notebook bridges my earlier classification work and TensorFlow experiments by explaining the underlying theory and standardized workflow patterns.


Part B: Implementation - Iris Classification Workflow (Sister Notebook)

📄 File: 04-Iris-Classification-Workflow-Sister.ipynb

Goal: Apply the exact same ML workflow independently on a fresh dataset to prove I can generalize beyond single examples.

Dataset: Iris Flowers Dataset

  • 150 samples
  • 4 numeric features (sepal length/width, petal length/width)
  • 3 species classes (setosa, versicolor, virginica)
  • Balanced dataset (50 samples per class)

Complete Workflow Implemented:

  1. 📊 Load & Explore

    • Imported Iris data from sklearn.datasets
    • Created pandas DataFrame for exploration
    • Examined features, target distribution, and data quality
    • Verified balanced classes
  2. ✂️ Prepare Data

    • Train/test split: 80/20 (120 train, 30 test)
    • Used stratify=y to maintain class balance
    • Applied StandardScaler for feature normalization
    • Verified scaling: mean ≈ 0, std ≈ 1
  3. 🤖 Baseline Model

    • Algorithm: Logistic Regression
    • Hyperparameters: random_state=42, max_iter=200
    • Trained on scaled features
  4. 📈 Evaluate Performance

    • Training Accuracy: 95.83%
    • Test Accuracy: 93.33%
    • Precision/Recall/F1: All 0.90+ across classes
    • Confusion Matrix: Only 2 misclassifications (likely versicolor ↔ virginica confusion)
  5. 💡 Reflect & Learn

    • Documented 5 key takeaways in first-person
    • Identified next steps: try Random Forest, SVM, or messier datasets
    • Reflected on workflow importance over algorithm choice
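The five steps above condense into a short script. This is a sketch of the same pipeline, not the notebook verbatim; exact accuracy figures depend on the split seed, so the 93.33% reported below comes from the notebook's own run:

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# 1. Load & explore
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df["species"] = iris.target

# 2. Prepare: stratified 80/20 split, then scale
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42, stratify=iris.target
)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)  # fit scaler on training data only
X_test = scaler.transform(X_test)        # reuse training statistics on test data

# 3. Baseline model
model = LogisticRegression(random_state=42, max_iter=200)
model.fit(X_train, y_train)

# 4. Evaluate
y_pred = model.predict(X_test)
print(classification_report(y_test, y_pred, target_names=iris.target_names))
print(confusion_matrix(y_test, y_pred))
```

Fitting the scaler on the training set alone, then only transforming the test set, is the detail that keeps the test accuracy an honest estimate.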

Results Summary:

Training Accuracy: 95.83% (115/120 correct)
Test Accuracy:     93.33% (28/30 correct)

Classification Report:
              precision    recall  f1-score   support
    setosa       1.00      1.00      1.00        10
versicolor       0.90      0.90      0.90        10
 virginica       0.90      0.90      0.90        10

Confusion Matrix:
[[10  0  0]
 [ 0  9  1]
 [ 0  1  9]]

Code Quality:

  • ✅ All cells execute without errors
  • ✅ Clear comments explaining each step
  • ✅ Proper variable naming conventions
  • ✅ Markdown sections for readability
  • ✅ First-person learning reflections

🎯 What This Portfolio Demonstrates

To Recruiters & Hiring Managers:

  1. Theory + Practice Integration: I don't just run code; I understand WHY the workflow works and can explain it clearly.

  2. Generalization Ability: By implementing the same workflow on a different dataset (Iris), I prove I can apply ML patterns beyond tutorials.

  3. Professional Documentation: Clear markdown, structured sections, and reflective learning notes show communication skills.

  4. Problem-Solving Approach: Following Load → Split → Scale → Train → Evaluate → Iterate shows systematic thinking.

  5. Code Quality: Error-free execution, proper comments, and reproducible results demonstrate attention to detail.

  6. Growth Mindset: "What I Learned" sections show metacognition and continuous improvement mentality.


🔧 Tech Stack & Tools

Languages:

  • Python 3

ML Frameworks:

  • Scikit-Learn (classical ML)
  • TensorFlow/Keras (deep learning)

Data Processing:

  • Pandas (DataFrames, data manipulation)
  • NumPy (numerical operations)

Visualization:

  • Matplotlib (plotting)

Environment:

  • Google Colab (cloud notebooks)
  • Jupyter Notebook (local development)

Version Control:

  • Git/GitHub

📂 Repository Structure

ml-learning-hub-notebooks/
│
├── README.md                          # This file
│
├── 01_Classification/
│   └── (classification intro notebook)
│
├── 02_TF_Regression/
│   └── (TensorFlow regression notebook)
│
└── 03_ScikitLearn_Workflow/          # ⭐ Main Portfolio Piece
    ├── 03-Scikit-Learn-ML-Workflow.ipynb      # Theory + Framework
    └── 04-Iris-Classification-Workflow-Sister.ipynb  # Implementation

🚀 How to Use These Notebooks

Option 1: View on GitHub

  • Click on any .ipynb file above
  • GitHub renders notebooks directly in browser

Option 2: Run in Google Colab

  1. Click the notebook file
  2. Click "Open in Colab" button at the top
  3. Run cells sequentially from top to bottom

Option 3: Run Locally

# Clone repository
git clone https://github.com/[your-username]/ml-learning-hub-notebooks.git

# Navigate to directory
cd ml-learning-hub-notebooks

# Install dependencies
pip install numpy pandas scikit-learn matplotlib tensorflow

# Open Jupyter
jupyter notebook

# Navigate to desired notebook and run

📊 Key Metrics & Achievements

Notebook     Task            Algorithm            Accuracy   Key Insight
Notebook 1   Classification  (Various)            -          Basic ML workflow
Notebook 2   Regression      Neural Network       -          Deep learning for continuous values
Notebook 3A  Theory          -                    -          Understanding ML fundamentals
Notebook 3B  Classification  Logistic Regression  93.33%     Independent implementation

🎓 Key Learnings

1. The Workflow Pattern is More Important Than Algorithms

Once I understood Load → Split → Train → Evaluate → Iterate, I could apply it to any dataset. The algorithm choice (Logistic Regression vs Random Forest vs SVM) matters less than following a systematic process.

2. Scikit-Learn's Consistent API is Powerful

Every model follows fit() and predict(). This consistency means I can swap algorithms with a one-line change and focus on data quality and evaluation.

3. Feature Scaling Matters

StandardScaler improved my model performance by normalizing features. Algorithms like Logistic Regression are sensitive to feature magnitude.
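The "mean ≈ 0, std ≈ 1" check from the notebook can be verified directly. The tiny array here is illustrative, chosen so the two columns have very different magnitudes:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Two features on wildly different scales
X = np.array([[1.0, 200.0],
              [2.0, 300.0],
              [3.0, 400.0]])

X_scaled = StandardScaler().fit_transform(X)

print(X_scaled.mean(axis=0))  # each column centered near 0
print(X_scaled.std(axis=0))   # each column with unit standard deviation
```

After scaling, both features contribute on comparable terms, which is what gradient-based models like Logistic Regression benefit from.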

4. Train/Test Split Prevents Overfitting

By using stratify=y, I ensured balanced classes in both sets. Testing on held-out data gives honest performance estimates.
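The effect of stratify=y is easy to check on the Iris labels: with 50 samples per class and an 80/20 split, each class keeps its share exactly in both sets.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
_, _, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Per-class counts: 40 of each species in train, 10 of each in test
print(np.bincount(y_train))
print(np.bincount(y_test))
```

Without stratify, a random split could leave one species under-represented in the 30-sample test set, making the reported accuracy noisier.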

5. High Accuracy ≠ Perfect Model

Even with 93.33% accuracy, I had 2 misclassifications. The confusion matrix showed versicolor/virginica confusion: they're genuinely more similar species.

6. Documentation Shows Understanding

Writing "What I Learned" sections forced me to articulate concepts clearly. If I can explain it, I truly understand it.


🔜 Next Steps & Future Work

Short-term (Next 2 weeks):

  • Try Random Forest and SVM on Iris dataset
  • Compare model performance systematically
  • Implement cross-validation for more robust evaluation
  • Add hyperparameter tuning examples
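One way to start on the cross-validation item, assuming the same Iris setup as Notebook 3B (the 5-fold choice is a common default, not prescribed by the notebook):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Wrap scaling and the model in a pipeline so the scaler is
# re-fit inside each fold (no leakage from validation data)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=200))
scores = cross_val_score(model, X, y, cv=5)  # stratified 5-fold CV for classifiers

print(scores.round(3), "mean:", scores.mean().round(3))
```

Five accuracy estimates instead of one single train/test split gives a sense of the variance, which is the "more robust evaluation" the list above is after.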

Medium-term (Next month):

  • Apply workflow to messier, real-world datasets (missing data, imbalanced classes)
  • Build classification project on Kaggle dataset
  • Add ensemble methods (voting, stacking)
  • Document feature engineering techniques

Long-term (3-6 months):

  • Complete end-to-end ML projects (data collection → deployment)
  • Build web app for model deployment
  • Contribute to open-source ML projects
  • Write technical blog posts explaining concepts

💼 About Me

Name: [Mani Chelluri]
Location: Secunderabad, Telangana, India
Interests: AI/ML, Cloud Computing, Open Source, Startups
Currently: Building ML skills through structured learning and practical projects

Connect with me:


📜 License

This project is open source and available under the MIT License.


🙏 Acknowledgments

  • Jake VanderPlas - Python Data Science Handbook for ML fundamentals
  • Scikit-Learn Documentation - Comprehensive guides and examples
  • Google Colab - Free cloud computing for ML experiments
  • UCI ML Repository - Iris dataset and countless others

⭐ If This Helped You

If you found this learning journey helpful:

  • ⭐ Star this repository
  • 🍴 Fork it and create your own learning hub
  • 📢 Share it with others learning ML
  • 💬 Open an issue with feedback or questions

Built with 💻 and ☕ in Secunderabad, India


Last Updated: February 7, 2026
