Student Grade Prediction

This project predicts students' final grades (G3) from features such as parental education, study time, and number of past failures. Three machine learning models are trained and compared: Linear Regression, K-Nearest Neighbors (KNN), and Random Forest.

Project Structure

The project is split into separate Python files, each responsible for one task:

Files

preprocess.py: Contains functions for loading, cleaning, and scaling the dataset.
log_regr.py: Implements the Linear Regression model, trains it, and evaluates its performance.
knn_model.py: Implements the K-Nearest Neighbors (KNN) model, trains it, and evaluates its performance.
random_forest.py: Implements the Random Forest model, trains it, and evaluates its performance.
evaluation.py: Compares the performance of all three models and prints evaluation metrics (MSE, R²).
visualise.py: Provides functions for visualizing the results, including a correlation heatmap and a model performance comparison.

Project Workflow

Preprocessing:
- The preprocess.py file loads and preprocesses the dataset. It keeps only the relevant columns and scales the features using StandardScaler.
- The columns used in the model are:
  - Medu, Fedu, goout, Walc, failures, studytime, absences, G1, G2, and G3 (with G3 as the target).
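The preprocessing step described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names `select_and_scale` and `load_student_data` are hypothetical, dropping rows with missing values is an assumed reading of "cleans", and the `;` separator is an assumption based on the original UCI distribution of student-mat.csv.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Columns named in this README; G3 is the prediction target.
COLUMNS = ["Medu", "Fedu", "goout", "Walc", "failures",
           "studytime", "absences", "G1", "G2", "G3"]
TARGET = "G3"


def select_and_scale(df: pd.DataFrame):
    """Keep the listed columns, then standardize the features (not the target)."""
    # Assumption: "cleaning" means dropping rows with missing values.
    data = df[COLUMNS].dropna()
    X = data.drop(columns=[TARGET])
    y = data[TARGET]
    scaler = StandardScaler()
    X_scaled = pd.DataFrame(
        scaler.fit_transform(X), columns=X.columns, index=X.index
    )
    return X_scaled, y, scaler


def load_student_data(path: str = "student-mat.csv", sep: str = ";"):
    """Hypothetical loader; adjust `sep` if the CSV is comma-separated."""
    return select_and_scale(pd.read_csv(path, sep=sep))
```

Note that fitting the scaler on the full dataset, as this sketch does, lets test-set statistics influence the scaling. Fitting it on the training split only is the more cautious choice.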
Model Training:
- Models are defined in separate files (log_regr.py, knn_model.py, random_forest.py). Each file:
  - Loads the preprocessed data from preprocess.py.
  - Trains a corresponding machine learning model.
  - Evaluates the model's performance using Mean Squared Error (MSE) and R².
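A per-model training script along these lines might look like the sketch below. The helper name `train_and_evaluate`, the 80/20 split, and the random seed are assumptions for illustration, and synthetic data stands in for the preprocessed features so the example runs without the CSV.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split


def train_and_evaluate(model, X, y, test_size=0.2, random_state=42):
    """Fit `model` on a train split and return (mse, r2) on the held-out split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state
    )
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    return mean_squared_error(y_test, predictions), r2_score(y_test, predictions)


# Synthetic stand-in for the nine scaled features and the G3 target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 9))
y = X @ rng.normal(size=9) + rng.normal(scale=0.1, size=200)

mse, r2 = train_and_evaluate(LinearRegression(), X, y)
print(f"Linear Regression  MSE={mse:.3f}  R2={r2:.3f}")
```

Any scikit-learn regressor can be passed as `model`, so the same helper would serve the KNN and Random Forest scripts as well.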
Model Evaluation:
- The evaluation.py file compares the performance of all three models (Linear Regression, KNN, Random Forest) and prints out their MSE and R².
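One way to structure such a comparison is to loop over a dictionary of models and collect both metrics for each. The sketch below is an assumption about the shape of the code, not a copy of it: `compare_models` is a hypothetical name, the hyperparameters are illustrative, and the regressor variants of KNN and Random Forest are assumed because G3 is numeric. Synthetic data again replaces the real dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor


def compare_models(models, X_train, X_test, y_train, y_test):
    """Fit every model on the same split and return {name: {"mse", "r2"}}."""
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        results[name] = {
            "mse": mean_squared_error(y_test, pred),
            "r2": r2_score(y_test, pred),
        }
    return results


# Synthetic stand-in for the preprocessed data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 9))
y = X @ rng.normal(size=9) + rng.normal(scale=0.1, size=200)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Hyperparameters here are illustrative assumptions.
models = {
    "Linear Regression": LinearRegression(),
    "KNN": KNeighborsRegressor(n_neighbors=5),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=42),
}

results = compare_models(models, X_train, X_test, y_train, y_test)
for name, metrics in results.items():
    print(f"{name:<18} MSE={metrics['mse']:.3f}  R2={metrics['r2']:.3f}")
```

Evaluating all models on the same train/test split keeps the MSE and R² values directly comparable.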
Visualization:
- The visualise.py file provides functions to visualize model performance and feature correlations:
  - A heatmap of feature correlations is plotted to show relationships between features and the target.
  - A bar chart is used to compare the performance of the models (MSE and R²).
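The two plots described above could be produced roughly as follows with seaborn and matplotlib. The function names are hypothetical, the demo DataFrame is random, and the metric values passed to the bar chart are placeholders for illustration only, not results from this project.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns


def plot_correlation_heatmap(df, ax=None):
    """Draw an annotated heatmap of pairwise correlations between columns."""
    if ax is None:
        _, ax = plt.subplots(figsize=(8, 6))
    sns.heatmap(df.corr(), annot=True, fmt=".2f", cmap="coolwarm", ax=ax)
    ax.set_title("Feature correlations")
    return ax


def plot_model_comparison(results):
    """Draw side-by-side bar charts of MSE and R² for each model."""
    names = list(results)
    fig, (ax_mse, ax_r2) = plt.subplots(1, 2, figsize=(10, 4))
    ax_mse.bar(names, [results[n]["mse"] for n in names])
    ax_mse.set_title("MSE (lower is better)")
    ax_r2.bar(names, [results[n]["r2"] for n in names])
    ax_r2.set_title("R² (higher is better)")
    fig.tight_layout()
    return fig


# Random demo data; in the project this would be the selected columns.
rng = np.random.default_rng(0)
demo_df = pd.DataFrame(rng.normal(size=(50, 3)), columns=["G1", "G2", "G3"])
heatmap_ax = plot_correlation_heatmap(demo_df)

# Placeholder values only, NOT measured results.
placeholder_results = {
    "Linear Regression": {"mse": 1.0, "r2": 0.5},
    "KNN": {"mse": 2.0, "r2": 0.4},
    "Random Forest": {"mse": 3.0, "r2": 0.3},
}
comparison_fig = plot_model_comparison(placeholder_results)
```

Calling `comparison_fig.savefig(...)` or switching to an interactive backend and calling `plt.show()` would display the figures. The `results` dictionary has the same shape as a per-model mapping of `mse` and `r2` values.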

Setup and Installation

Prerequisites

To run this project, you'll need to have Python installed along with the following libraries:

pandas
numpy
scikit-learn
matplotlib
seaborn

Install the dependencies with pip, preferably inside a virtual environment:

pip install pandas numpy scikit-learn matplotlib seaborn

sneha30404/Student-Performance-Prediction

Folders and files

Latest commit

History

Repository files navigation

Student Grade Prediction

Project Structure

Files

Project Workflow

Setup and Installation

Prerequisites

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages