A Flask web service that predicts salaries in the data science field from information about a job, such as work year, experience level, and job title. A Random Forest Regressor produces the predictions. This repo was developed for the ML Zoomcamp Capstone 1 project.
- Python 3.9
- Flask for API service
- Docker for containerization
- AWS Elastic Beanstalk for cloud deployment
.
├── data # Information on dataset
├── notebook.ipynb # All the steps to analyze data and train the model
├── train.py # Training script converted from notebook.ipynb
├── predict.py # Flask application
├── test.py # Send request to application
├── model.bin # Pickled model and dict vectorizer
├── Dockerfile # Dockerfile for containerization
├── Pipfile # Pipfile for dependency management
├── Pipfile.lock
├── aws-deployment-link.png
└── README.md
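To illustrate how predict.py might use model.bin, here is a minimal sketch. It assumes the file holds a pickled `(dict_vectorizer, model)` tuple, which is a common layout but is not confirmed by this README. The function names `load_artifacts` and `predict_salary` are hypothetical.

```python
import pickle


def load_artifacts(path="model.bin"):
    """Load the vectorizer and model.

    Assumption: the file contains a pickled (dict_vectorizer, model) tuple.
    """
    with open(path, "rb") as f_in:
        dv, model = pickle.load(f_in)
    return dv, model


def predict_salary(dv, model, record):
    """Vectorize one job record (a dict) and build a JSON-serializable response."""
    X = dv.transform([record])    # one-row feature matrix
    y_pred = model.predict(X)     # one prediction per row
    # Cast to a plain float so the value can be serialized to JSON.
    return {"predicted_salary": float(y_pred[0])}
```

A Flask route could call `predict_salary` on the parsed request body and return the resulting dict as JSON. Only unpickle files you trust, since `pickle.load` can execute arbitrary code.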
- Clone the repository:
git clone https://github.com/azad96/Data-Science-Salary-Regression.git
cd Data-Science-Salary-Regression
- Install dependencies using Pipenv:
pip install pipenv
pipenv install
- Activate the environment:
pipenv shell
- Start the Flask server locally:
python predict.py
- The service will be available at http://localhost:9696
- Make sure the host is set to localhost:9696 in test.py. Then, send POST requests to the /predict endpoint with test.py:
python test.py
Response format: {'predicted_salary': 75000.0}
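For readers who want to call the endpoint without test.py, the request can be sketched with the Python standard library alone. This is not necessarily how test.py is written. The function names are hypothetical, and the record fields used in the usage note are illustrative guesses based on the features named above.

```python
import json
import urllib.request


def build_predict_request(host, record):
    """Build a JSON POST request for the /predict endpoint on the given host."""
    url = f"http://{host}/predict"
    body = json.dumps(record).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_prediction_request(host, record, timeout=10):
    """Send the request and decode the JSON response into a dict."""
    request = build_predict_request(host, record)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `send_prediction_request("localhost:9696", {"work_year": 2023, "experience_level": "SE", "job_title": "Data Scientist"})` should return a dict in the response format shown above, provided the field names match what the model expects.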
- Build the image:
docker build -t salary-prediction .
- Run the container:
docker run -it --rm -p 9696:9696 salary-prediction
This will start the Flask server automatically, so you can send a request with test.py as before:
python test.py
The project is configured for AWS Elastic Beanstalk deployment. Use the AWS EB CLI for deployment:
pipenv install --dev
eb init -p docker -r eu-west-1 salary-prediction
eb create salary-prediction-env --enable-spot
When the environment has launched successfully, find the line "INFO Application available at URL" in the logs of the eb create command. Copy the URL and set it as the host variable in test.py. Then, you can send a request by running:
python test.py
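Switching between the local server and the deployed environment means editing the host each time. One possible alternative, shown here only as a sketch, is to read the host from an environment variable. The variable name `PREDICT_HOST` and the function `resolve_predict_url` are hypothetical and are not part of this repo.

```python
import os


def resolve_predict_url(default_host="localhost:9696"):
    """Return the /predict URL, preferring the PREDICT_HOST environment variable.

    PREDICT_HOST is a hypothetical name; it should hold a host without a scheme,
    for example the URL reported by eb create.
    """
    host = os.environ.get("PREDICT_HOST", default_host)
    return f"http://{host}/predict"
```

With this in place, exporting `PREDICT_HOST` before running the client would target the Elastic Beanstalk environment, and leaving it unset would fall back to the local server.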
When you are done, you can terminate the environment by running:
eb terminate salary-prediction-env