Heliverse-AI-internship-Assessment-Predicting-Employee-Attrition

  • Predicting employee attrition starts with collecting historical employee data (demographics, job roles, performance metrics, tenure, and satisfaction surveys), followed by careful preprocessing and exploratory analysis to identify relevant features and surface insights.

  • Next, several machine learning models, such as logistic regression, decision trees, or neural networks, are trained and compared using metrics like accuracy and F1-score, and the best-performing model is selected for deployment, as illustrated in the sketch after this list.

  • Once validated, the chosen model is deployed for real-time predictions, supporting proactive retention strategies and organizational stability. Ongoing monitoring and retraining keep the model effective as the workforce changes.
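
As a rough illustration of the model-selection step above, the snippet below compares a logistic regression and a decision tree on a held-out test set using accuracy and F1-score. This is a minimal sketch, not the exact code in Ibm_Employee_Attrition.py; it assumes a preprocessed feature matrix X and a binary attrition target y are already available.

    from sklearn.linear_model import LogisticRegression
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import accuracy_score, f1_score
    from sklearn.model_selection import train_test_split

    # Assumption: X is a preprocessed feature matrix, y holds 0/1 attrition labels
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42)

    candidates = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "decision_tree": DecisionTreeClassifier(max_depth=5, random_state=42),
    }

    for name, model in candidates.items():
        # Fit each candidate on the training split and score it on the test split
        model.fit(X_train, y_train)
        preds = model.predict(X_test)
        print(name,
              "accuracy:", round(accuracy_score(y_test, preds), 3),
              "F1:", round(f1_score(y_test, preds), 3))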

Methodology

Here, I use five simple steps to analyze employee attrition in Python (a minimal sketch follows the list):

  • Data collection
  • Data preprocessing
  • Splitting the data into training and testing sets
  • Building the model on the training set
  • Testing accuracy on the testing set
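
The sketch below walks through these five steps with pandas, scikit-learn, and XGBoost. It is an illustrative outline, not the exact code in Ibm_Employee_Attrition.py; in particular, the CSV file name WA_Fn-UseC_-HR-Employee-Attrition.csv and the choice of an XGBoost classifier are assumptions, so adjust them to match the data and model used in this repository.

    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score
    from xgboost import XGBClassifier

    # Step 1: data collection -- load the CSV (file name assumed, adjust as needed)
    df = pd.read_csv("WA_Fn-UseC_-HR-Employee-Attrition.csv")

    # Step 2: preprocessing -- map the target to 0/1 and one-hot encode categoricals
    y = df["Attrition"].map({"Yes": 1, "No": 0})
    X = pd.get_dummies(df.drop(columns=["Attrition"]), drop_first=True)

    # Step 3: split into training and testing sets
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42, stratify=y)

    # Step 4: build the model on the training set
    model = XGBClassifier(n_estimators=200, max_depth=4, eval_metric="logloss")
    model.fit(X_train, y_train)

    # Step 5: accuracy test on the testing set
    print("Test accuracy:", round(accuracy_score(y_test, model.predict(X_test)), 3))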

Setup Instructions

1. Clone the repository:

git clone https://github.com/Ayushverma135/Heliverse-AI-internship-Assessment-Predicting-Employee-Attrition.git

2. Install the required dependencies:

  • Python 3.x: Python is the programming language used for data analysis and model development.

  • pandas: This library is used for data manipulation and analysis, particularly for handling structured data.

  • scikit-learn: Scikit-learn is a machine learning library in Python that provides simple and efficient tools for data mining and data analysis.

  • XGBoost: XGBoost is an optimized gradient boosting library designed for efficiency, speed, and performance.

  • TensorFlow and Keras: These libraries are used for building and training neural network models.

  • matplotlib and seaborn: These libraries are used for data visualization to gain insights from the data.

  • You can install these dependencies using pip, the Python package manager, by running the following command:

      pip install pandas scikit-learn xgboost tensorflow matplotlib seaborn
    
  • Make sure to replace tensorflow with tensorflow-cpu if you want the CPU-only version of TensorFlow, which reduces the installation size and avoids potential compatibility issues with GPU drivers.

3. Navigate to the directory containing Ibm_Employee_Attrition.py and run the script:

python Ibm_Employee_Attrition.py
  • If you want to run Ibm_employee_attrition.ipynb instead, just click the Run All button in VS Code.

Contribution Guidelines

Contributions to improve data preprocessing techniques, explore additional features, or enhance model performance are welcome. Please open an issue to discuss proposed changes or submit a pull request with detailed descriptions of modifications.