A comprehensive, hands-on machine learning journey demonstrating fundamental concepts through practical implementations. Built as a structured portfolio showcasing progression from basic ML concepts to independent problem-solving.
This repository contains three progressively advanced machine learning notebook series, each building on previous concepts while demonstrating independent implementation skills. Written in first-person with reflective learning notes to show genuine understanding, not just code copying.
Created by: [Mani Chelluri]
Location: Secunderabad, India
Tech Stack: Python, Scikit-Learn, TensorFlow, Pandas, NumPy, Matplotlib
Environment: Google Colab
Location: 01_Classification/
What it covers:
- Introduction to classification problems
- Basic ML workflow: data → model → predictions
- Binary and multi-class classification
- Model evaluation metrics (accuracy, precision, recall)
Key Learning: Understanding what classification means and how to structure a basic ML problem.
Location: 02_TF_Regression/
What it covers:
- Regression vs classification
- Building neural networks with TensorFlow/Keras
- Loss functions and optimizers for continuous predictions
- Model training, validation, and testing
Key Learning: Transitioning from traditional ML to deep learning frameworks for regression tasks.
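As a flavor of what this section covers, here is a minimal TensorFlow/Keras regression sketch (illustrative only, not the notebook's actual code): a one-neuron network fitting the line y = 2x + 1 with mean squared error, the standard loss for continuous predictions. The learning rate and epoch count are assumptions chosen for quick convergence on this toy problem.

```python
import numpy as np
import tensorflow as tf

# Toy regression task: learn y = 2x + 1 from noise-free samples
X = np.arange(-10.0, 10.0, 0.5).reshape(-1, 1).astype("float32")
y = 2.0 * X + 1.0

# One Dense unit = linear regression; MSE is the standard regression loss
model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.1), loss="mse")
model.fit(X, y, epochs=200, verbose=0)

prediction = model.predict(np.array([[5.0]], dtype="float32"), verbose=0)
print(prediction)  # should land close to 2*5 + 1 = 11
```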
Location: 03_ScikitLearn_Workflow/
This is the most comprehensive section, demonstrating both theoretical understanding and practical implementation.
File: 03-Scikit-Learn-ML-Workflow.ipynb
What I learned:
- Machine Learning Definition: Building models with tunable parameters that learn patterns from data and generalize to unseen examples (not hard-coding rules)
- Supervised vs Unsupervised Learning: Do we have labeled training data?
- Classification vs Regression: Discrete labels (classification) vs continuous values (regression)
- Scikit-Learn API Consistency: the `fit()` → `predict()` pattern that works across all algorithms
- Complete ML Pipeline: Load data → Split → Train → Evaluate → Iterate
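The consistency of that API can be shown with a minimal sketch (the estimator and dataset here are illustrative; any scikit-learn classifier follows the same pattern):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load data -> Split -> Train -> Evaluate: the same pattern for every estimator
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)             # every estimator exposes fit()
accuracy = model.score(X_test, y_test)  # and predict()/score() on unseen data
print(f"Test accuracy: {accuracy:.2%}")
```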
Source: Adapted from Jake VanderPlas' Python Data Science Handbook with my own learning annotations and reflections.
Why this matters: This notebook bridges my earlier classification work and TensorFlow experiments by explaining the underlying theory and standardized workflow patterns.
File: 04-Iris-Classification-Workflow-Sister.ipynb
Goal: Apply the exact same ML workflow independently on a fresh dataset to prove I can generalize beyond single examples.
Dataset: Iris Flowers Dataset
- 150 samples
- 4 numeric features (sepal length/width, petal length/width)
- 3 species classes (setosa, versicolor, virginica)
- Balanced dataset (50 samples per class)
Complete Workflow Implemented:
1. Load & Explore
   - Imported Iris data from `sklearn.datasets`
   - Created a pandas DataFrame for exploration
   - Examined features, target distribution, and data quality
   - Verified balanced classes
2. Prepare Data
   - Train/test split: 80/20 (120 train, 30 test)
   - Used `stratify=y` to maintain class balance
   - Applied StandardScaler for feature normalization
   - Verified scaling: mean ≈ 0, std ≈ 1
3. Baseline Model
   - Algorithm: Logistic Regression
   - Hyperparameters: `random_state=42`, `max_iter=200`
   - Trained on scaled features
4. Evaluate Performance
   - Training Accuracy: 95.83%
   - Test Accuracy: 93.33%
   - Precision/Recall/F1: all 0.90+ across classes
   - Confusion Matrix: only 2 misclassifications (versicolor ↔ virginica confusion)
5. Reflect & Learn
   - Documented 5 key takeaways in first person
   - Identified next steps: try Random Forest, SVM, or messier datasets
   - Reflected on the importance of workflow over algorithm choice
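The five steps above can be sketched end to end as follows (a minimal reconstruction, not the notebook's exact code; exact scores depend on the split and seed):

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# 1. Load & explore
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
print(df.describe())  # 150 samples, 4 numeric features

# 2. Prepare: 80/20 stratified split, then standardize (fit scaler on train only)
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, stratify=iris.target, random_state=42
)
scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)
X_test_s = scaler.transform(X_test)

# 3. Baseline model
model = LogisticRegression(random_state=42, max_iter=200)
model.fit(X_train_s, y_train)

# 4. Evaluate on held-out data
y_pred = model.predict(X_test_s)
print(f"Train accuracy: {model.score(X_train_s, y_train):.2%}")
print(f"Test accuracy:  {model.score(X_test_s, y_test):.2%}")
print(classification_report(y_test, y_pred, target_names=iris.target_names))
print(confusion_matrix(y_test, y_pred))
```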
Results Summary:
Training Accuracy: 95.83% (115/120 correct)
Test Accuracy: 93.33% (28/30 correct)
Classification Report:

| class | precision | recall | f1-score | support |
|---|---|---|---|---|
| setosa | 1.00 | 1.00 | 1.00 | 10 |
| versicolor | 0.90 | 0.90 | 0.90 | 10 |
| virginica | 0.90 | 0.90 | 0.90 | 10 |
Confusion Matrix:
[[10 0 0]
[ 0 9 1]
[ 0 1 9]]
Code Quality:
- ✅ All cells execute without errors
- ✅ Clear comments explaining each step
- ✅ Proper variable naming conventions
- ✅ Markdown sections for readability
- ✅ First-person learning reflections
1. Theory + Practice Integration: I don't just run code; I understand WHY the workflow works and can explain it clearly.
2. Generalization Ability: By implementing the same workflow on a different dataset (Iris), I prove I can apply ML patterns beyond tutorials.
3. Professional Documentation: Clear markdown, structured sections, and reflective learning notes show communication skills.
4. Problem-Solving Approach: Following Load → Split → Scale → Train → Evaluate → Iterate shows systematic thinking.
5. Code Quality: Error-free execution, proper comments, and reproducible results demonstrate attention to detail.
6. Growth Mindset: "What I Learned" sections show metacognition and a continuous-improvement mentality.
Languages:
- Python 3
ML Frameworks:
- Scikit-Learn (classical ML)
- TensorFlow/Keras (deep learning)
Data Processing:
- Pandas (DataFrames, data manipulation)
- NumPy (numerical operations)
Visualization:
- Matplotlib (plotting)
Environment:
- Google Colab (cloud notebooks)
- Jupyter Notebook (local development)
Version Control:
- Git/GitHub
ml-learning-hub-notebooks/
│
├── README.md                                         # This file
│
├── 01_Classification/
│   └── (classification intro notebook)
│
├── 02_TF_Regression/
│   └── (TensorFlow regression notebook)
│
└── 03_ScikitLearn_Workflow/                          # ← Main Portfolio Piece
    ├── 03-Scikit-Learn-ML-Workflow.ipynb             # Theory + Framework
    └── 04-Iris-Classification-Workflow-Sister.ipynb  # Implementation
View on GitHub:
- Click any `.ipynb` file above; GitHub renders notebooks directly in the browser

Run in Google Colab:
- Click the notebook file
- Click the "Open in Colab" button at the top
- Run cells sequentially from top to bottom
# Clone repository
git clone https://github.com/[your-username]/ml-learning-hub-notebooks.git
# Navigate to directory
cd ml-learning-hub-notebooks
# Install dependencies
pip install numpy pandas scikit-learn matplotlib tensorflow
# Open Jupyter
jupyter notebook
# Navigate to desired notebook and run

| Notebook | Task | Algorithm | Accuracy | Key Insight |
|---|---|---|---|---|
| Notebook 1 | Classification | (Various) | - | Basic ML workflow |
| Notebook 2 | Regression | Neural Network | - | Deep learning for continuous values |
| Notebook 3A | Theory | - | - | Understanding ML fundamentals |
| Notebook 3B | Classification | Logistic Regression | 93.33% | Independent implementation |
Once I understood Load → Split → Train → Evaluate → Iterate, I could apply it to any dataset. The algorithm choice (Logistic Regression vs Random Forest vs SVM) matters less than following a systematic process.
Every model follows fit() and predict(). This consistency means I can swap algorithms in 1 line of code and focus on data quality and evaluation.
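For example (a sketch; RandomForestClassifier and SVC stand in for any estimator, and the split seed is illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=42
)

# Swapping algorithms is a one-line change; fit()/score() stay identical
results = {}
for model in (LogisticRegression(max_iter=200),
              RandomForestClassifier(random_state=42),
              SVC()):
    model.fit(X_train, y_train)
    results[type(model).__name__] = model.score(X_test, y_test)

print(results)
```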
StandardScaler improved my model's performance by standardizing features to zero mean and unit variance. Algorithms like Logistic Regression are sensitive to feature magnitude.
By using stratify=y, I ensured balanced classes in both sets. Testing on held-out data gives honest performance estimates.
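The effect of `stratify=y` can be checked directly (a small sketch; the seed is illustrative):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
_, _, _, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# The 50/50/50 class ratio is preserved: exactly 10 test samples per class
print(np.bincount(y_test))
```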
Even with 93.33% accuracy, I had 2 misclassifications. The confusion matrix showed versicolor/virginica confusion; they're genuinely more similar species.
Writing "What I Learned" sections forced me to articulate concepts clearly. If I can explain it, I truly understand it.
- Try Random Forest and SVM on Iris dataset
- Compare model performance systematically
- Implement cross-validation for more robust evaluation
- Add hyperparameter tuning examples
- Apply workflow to messier, real-world datasets (missing data, imbalanced classes)
- Build classification project on Kaggle dataset
- Add ensemble methods (voting, stacking)
- Document feature engineering techniques
- Complete end-to-end ML projects (data collection → deployment)
- Build web app for model deployment
- Contribute to open-source ML projects
- Write technical blog posts explaining concepts
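The cross-validation item in the short-term list above could look like this (a sketch of the planned next step, not existing notebook code):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# 5-fold cross-validation gives a more robust estimate than a single split
X, y = load_iris(return_X_y=True)
scores = cross_val_score(LogisticRegression(max_iter=200), X, y, cv=5)
print(f"Mean CV accuracy: {scores.mean():.2%} (+/- {scores.std():.2%})")
```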
Name: [Mani Chelluri]
Location: Secunderabad, Telangana, India
Interests: AI/ML, Cloud Computing, Open Source, Startups
Currently: Building ML skills through structured learning and practical projects
Connect with me:
- LinkedIn: [https://www.linkedin.com/in/mani-chelluri/]
- GitHub: [MANICHELLURII]
- Email: [manichelluri9@gmail.com]
- Portfolio: [manichellurii.github.io]
This project is open source and available under the MIT License.
- Jake VanderPlas - Python Data Science Handbook for ML fundamentals
- Scikit-Learn Documentation - Comprehensive guides and examples
- Google Colab - Free cloud computing for ML experiments
- UCI ML Repository - Iris dataset and countless others
If you found this learning journey helpful:
- ⭐ Star this repository
- Fork it and create your own learning hub
- Share it with others learning ML
- Open an issue with feedback or questions
Built with 💻 and ❤️ in Secunderabad, India
Last Updated: February 7, 2026