HR Data Analyst

About

You work as an analyst in a company. The company's HR boss has given you three datasets. The first two contain information about employees' performance in offices A and B: how much they work, their salaries, the number of their projects, their departments, and so on. The third is an extensive dataset with information on the employees' satisfaction with their jobs, their latest evaluation metrics, and their current status in the company. Your task is to analyze the data and answer some of HR's questions.

Learning Outcomes of the Project:

Conduct data analysis and handle a case that resembles the actual tasks a data analyst may encounter at their job. Master data merging, grouping, and aggregation functions, and build pivot tables using pandas.

Learning Outcomes of Each Stage of the Project:

Stage 1: Learn how to load data from the XML format, explore it, and reindex it properly.

Stage 2: Practice merging several datasets into one large dataset.

Stage 3: Master the pandas methods used to extract insights from the data.

Stage 4: Aggregate pandas DataFrames to quickly compute metrics, such as the mean or standard deviation, for selected columns.

Stage 5: Explore how to generate pivot tables with pandas in order to summarize data.
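As a rough illustration of what Stages 4 and 5 involve, here is a minimal sketch of grouping with aggregation and of a pivot table. The small DataFrame is made up for the example (it only borrows column names from the data sets described further down), and the chosen metrics are assumptions rather than the exact requirements from code.py.

```python
import pandas as pd

# Made-up rows standing in for the merged data set (not real project data).
df = pd.DataFrame({
    "Department": ["sales", "sales", "IT", "IT"],
    "salary": ["low", "high", "low", "high"],
    "left": [0, 1, 0, 1],
    "number_project": [2, 5, 3, 4],
    "average_monthly_hours": [150, 250, 160, 200],
})

# Stage 4 idea: group by one column and compute several metrics per group.
summary = df.groupby("left").agg({
    "number_project": ["median", "mean"],
    "average_monthly_hours": ["mean", "std"],
})
print(summary)

# Stage 5 idea: summarize one value across two categorical columns.
pivot = df.pivot_table(
    index="Department",
    columns="salary",
    values="average_monthly_hours",
    aggfunc="median",
)
print(pivot)
```

The result of agg has two-level column labels, so a single metric is selected with a tuple such as ("number_project", "median").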

General Info

To learn more about this project, please visit HyperSkill Website - HR Data Analyst.

This project's difficulty is labelled Hard. HyperSkill describes its four difficulty levels as follows:

Easy Projects - if you're just starting
Medium Projects - to build upon the basics
Hard Projects - to practice all the basic concepts and learn new ones
Challenging Projects - to perfect your knowledge with challenging tasks

This repository contains one .py file and one folder:

code.py - Contains the code used to complete the data analysis requirements

Data folder - Contains the three .xml data files: A_office_data.xml, B_office_data.xml, and hr_data.xml

The project was built using Python version 3.11.3.

Description of Data Sets

For A_office_data.xml and B_office_data.xml:

number_project — number of projects an employee has worked on;
average_monthly_hours — typical workload per month in hours;
time_spend_company — how many years an employee has worked in the company;
Work_accident — whether an employee has had an injury at work;
promotion_last_5years — whether an employee has had any promotions during the last five years;
Department — employee's department;
salary — employee's salary rate;
employee_office_id — employee's ID (1, 2, 3, etc.).
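For orientation, here is a minimal sketch of loading office data with pandas.read_xml. The two-row XML sample is invented for the example and its layout (a root element containing one child element per employee) is an assumption about how the files are organized. With the real files you would pass a path such as Data/A_office_data.xml instead of the in-memory sample.

```python
from io import StringIO

import pandas as pd

# Invented sample with the columns listed above; the real layout may differ.
SAMPLE_XML = """<data>
  <row>
    <number_project>2</number_project>
    <average_monthly_hours>157</average_monthly_hours>
    <time_spend_company>3</time_spend_company>
    <Work_accident>0</Work_accident>
    <promotion_last_5years>0</promotion_last_5years>
    <Department>sales</Department>
    <salary>low</salary>
    <employee_office_id>1</employee_office_id>
  </row>
  <row>
    <number_project>5</number_project>
    <average_monthly_hours>262</average_monthly_hours>
    <time_spend_company>6</time_spend_company>
    <Work_accident>0</Work_accident>
    <promotion_last_5years>0</promotion_last_5years>
    <Department>IT</Department>
    <salary>medium</salary>
    <employee_office_id>2</employee_office_id>
  </row>
</data>"""

# parser="etree" uses the standard-library XML parser, so lxml is not needed.
a_office = pd.read_xml(StringIO(SAMPLE_XML), parser="etree")

print(a_office.shape)   # rows and columns
print(a_office.dtypes)  # inferred column types
```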

For hr_data.xml:

satisfaction_level — how satisfied an employee is with their job;
last_evaluation — the last evaluation score of an employee;
left — whether an employee has left the company;
employee_id — employee's ID in the company (A125 — from the A office; 125 in this case, is employee_office_id).
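Because employee_id combines an office letter with employee_office_id, the office tables can be given a matching index before they are combined with the HR data. The sketch below shows one possible way to do this. The tiny DataFrames are made up, and the inner join and final sort are assumptions rather than a description of code.py.

```python
import pandas as pd

# Made-up rows; only the column names come from the data set descriptions.
a_office = pd.DataFrame({"employee_office_id": [125, 126], "salary": ["low", "high"]})
b_office = pd.DataFrame({"employee_office_id": [125, 130], "salary": ["medium", "low"]})
hr = pd.DataFrame({
    "employee_id": ["A125", "B125", "B130", "A999"],
    "satisfaction_level": [0.38, 0.80, 0.11, 0.72],
    "left": [1, 0, 1, 0],
})

# Build IDs such as "A125" from the office letter and the office-level ID.
a_office.index = ["A" + str(i) for i in a_office["employee_office_id"]]
b_office.index = ["B" + str(i) for i in b_office["employee_office_id"]]
hr = hr.set_index("employee_id")

# Stack both offices, then keep only employees present in both tables.
offices = pd.concat([a_office, b_office])
merged = offices.merge(hr, left_index=True, right_index=True, how="inner")
merged = merged.sort_index()

print(merged)
```

In this sample, A126 has no HR record and A999 has no office record, so the inner join drops both.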

How to Run

Download the files to your local machine, open the project in the IDE of your choice, and run code.py. The data frames and their dictionary forms will be printed to the console according to the requirements stated in each stage's docstring. Please read each stage's docstring to learn its requirements.

Nour-Sadek/HR-Data-Analyst

Folders and files

Latest commit

History

Repository files navigation

HR Data Analyst

About

Learning Outcomes of the Project:

Learning Outcomes of Each Stage of the Project:

General Info

Description of Data Sets

For A_office_data.xml and B_office_data.xml:

For hr_data.xml:

How to Run

About

Topics

Resources

Stars

Watchers

Forks

Languages