CodeX-SIT/intro-data-processing

Introduction to Data Processing

Introduction

In this assignment, you will learn about and explore the basics of data processing. You will be working with a dataset that contains information about the passengers of the Titanic. The Titanic was a British passenger liner that sank in the North Atlantic Ocean in 1912 after hitting an iceberg. Man, I hope you know what the Titanic was... It'll be sad if you don't lol. The dataset contains information about the passengers who were on board, including whether or not they survived.

Objectives

You will be able to:

  1. Understand and explain what data processing is
  2. Use Python to process data
  3. Use Python to visualize data
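To give a feel for what "processing data" means in practice, here is a minimal sketch using only the Python standard library. The rows, the column names (`Name`, `Sex`, `Age`, `Survived`) and the helper functions `load_rows` and `survival_rate_by` are all made up for illustration; the actual dataset and the notebook's code may look different.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows for illustration only; NOT taken from the assignment dataset.
SAMPLE_CSV = """Name,Sex,Age,Survived
Passenger A,female,29,1
Passenger B,male,,0
Passenger C,male,40,0
Passenger D,female,35,1
Passenger E,male,22,1
"""


def load_rows(text):
    """Parse CSV text into dicts, converting types and marking missing ages as None."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        rows.append({
            "name": raw["Name"],
            "sex": raw["Sex"],
            # An empty Age field is a missing value, so keep it as None.
            "age": float(raw["Age"]) if raw["Age"] else None,
            "survived": raw["Survived"] == "1",
        })
    return rows


def survival_rate_by(rows, key):
    """Return {group: fraction of passengers in that group who survived}."""
    totals = defaultdict(int)
    survivors = defaultdict(int)
    for row in rows:
        totals[row[key]] += 1
        survivors[row[key]] += row["survived"]  # True counts as 1, False as 0
    return {group: survivors[group] / totals[group] for group in totals}


rows = load_rows(SAMPLE_CSV)
rates = survival_rate_by(rows, "sex")
print(rates)
```

The sketch covers three typical steps: loading raw text, cleaning it (type conversion and handling a missing value), and summarizing it per group. Libraries listed in a requirements file usually do the same steps with far less code.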

Requirements

You must have the following installed on your computer:

  1. A text editor capable of editing Jupyter Notebooks (PyCharm and VS Code can both do this, but VS Code needs some extensions to be installed.)
  2. Python 3.9 or above (Assignment was created using Python 3.11.7)
  3. All the packages mentioned in the requirements.txt file in the repository
  4. Some sanity
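If you are not sure whether your interpreter meets the version requirement, a quick check along these lines can help. This is only a sketch: the helper name `meets_minimum` is hypothetical, and the minimum of 3.9 is taken from the list above.

```python
import sys

# Minimum version taken from the requirements list (Python 3.9 or above).
MINIMUM_VERSION = (3, 9)


def meets_minimum(version_info, minimum=MINIMUM_VERSION):
    """Return True if the (major, minor) part of version_info is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


current = sys.version_info
status = "OK" if meets_minimum(current) else "too old"
print(f"Python {current.major}.{current.minor}.{current.micro}: {status}")
```

The packages themselves are typically installed with `python -m pip install -r requirements.txt`, run from the repository folder, assuming pip is available in your environment.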

The Assignment

In the ipynb given in the repository, you will find multiple cells that have been marked with an Assignment tag. Each of these cells has certain tasks for you to perform, and you will have to write code to complete them.

The tasks range from easy to medium difficulty. To know what to do, you will have to understand the context of the data and the code that has already been written.

  • The assignment will work out of the box till cell 40, after which there are errors you must fix.

THIS IS A GROUP ASSIGNMENT, meaning that you can work in a group of at most 4 people of your choice. You can create a new team or join an existing one, depending on what you want to do.

You must ensure that you and your team understand everything about this assignment, as we are tasking you with a presentation at the end of it. The presentation is informal and you do not need to create any slides, but you must be able to explain to us and the rest of the community how you went about solving the assignment, as well as what the assignment contained.

Submission

The submission deadline is on Sunday, 7th April, 2024. The tentative date for the presentation is Saturday, 6th April, 2024.

You will have to submit the ipynb file with your code written in it, along with a csv file that will be generated automatically, provided you have done everything correctly.

Evaluation

The assignment will be evaluated on the following criteria:

  1. Correctness of the code
  2. Understanding of the code

The presentation will be evaluated on the following criteria:

  1. Clarity of explanation
  2. Understanding of the assignment
  • NOTE: The presentation is informal, and you do not need to create any slides for it. We also do not expect you to be able to explain the prediction model that comes towards the end, but if you can, you will be awarded bonus points.

Do your best and good luck!
