Morning 9:30 – 12:00
Afternoon 13:00 – 17:00
- Learn how to read and write files using Python
- Learn how to view, filter, impute, modify, concatenate, and reshape data using NumPy and Pandas
- Learn how to create plots using Matplotlib
- Understand what web requests and responses are, and learn how to interact with web APIs
- Jupyter Notebook
- Datasets: UK population, Weekly COVID-19 cases, COVID-19 vaccination data
- Internet connection
9:30 – 9:45 Intro & Review
- Intro
- Looking at the data: UK population, Weekly COVID-19 cases, COVID-19 vaccination data
- Review what we learned on Day 1
9:45 – 10:15 File Handling
- Open and close a file
- Read and write a file
- JSON serialisation
- Hands-on
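The file-handling topics above can be sketched in a few lines. This is a minimal illustration, not the course's own exercise: the record and the file name `cases.json` are invented, and the file is written to a temporary folder so nothing in the working directory is touched.

```python
import json
import os
import tempfile

# Toy record, invented for illustration only.
record = {"region": "A", "weekly_cases": [10, 12, 9]}

# Hypothetical file name inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "cases.json")

# "with" closes the file automatically, even if an error occurs inside the block.
with open(path, "w", encoding="utf-8") as f:
    json.dump(record, f)      # serialise the dict to JSON text

with open(path, "r", encoding="utf-8") as f:
    loaded = json.load(f)     # parse the JSON text back into a dict

print(loaded == record)
```

Using `with` rather than calling `open()` and `close()` by hand means the file is closed even when reading or writing fails.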
10:15 – 10:45 Data Manipulation (NumPy)
- NumPy array
- Create, access, modify and copy arrays (and matrices)
- Arithmetic
- Statistics
- Reshape
- Condition-based masking and indexing
- Hands-on
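As a rough sketch of the NumPy topics listed above, the snippet below runs through arithmetic, a summary statistic, reshaping, masking and copying on one small array. The numbers are toy values made up for this example, not taken from the course datasets.

```python
import numpy as np

# Toy values, invented for illustration only.
cases = np.array([10, 12, 9, 15, 7, 11])

doubled = cases * 2            # arithmetic is applied element by element
mean_cases = cases.mean()      # a summary statistic over the whole array
grid = cases.reshape(2, 3)     # same six values arranged as 2 rows x 3 columns

mask = cases > 10              # boolean array: True where the condition holds
high = cases[mask]             # keep only the elements where the mask is True

backup = cases.copy()          # an independent copy
backup[0] = 0                  # changing the copy leaves `cases` untouched

print(high, grid.shape, cases[0])
```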
10:45 – 11:00 Break
11:00 – 11:45 Data Manipulation (Pandas)
- Series and DataFrame
- Sorting & statistics
- Missing values
- Create, access, modify and copy Series and DataFrames
- Hands-on
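The Pandas topics above might look like this in practice. It is a sketch on an invented four-row table; the column names `region` and `cases` are assumptions, and filling with the column mean is only one of several possible imputation strategies.

```python
import numpy as np
import pandas as pd

# Toy table, invented for illustration only; one value is missing on purpose.
df = pd.DataFrame({
    "region": ["A", "B", "C", "D"],
    "cases": [10.0, np.nan, 7.0, 15.0],
})

n_missing = int(df["cases"].isna().sum())   # count the missing values

# Work on a copy so the original DataFrame stays as loaded.
filled = df.copy()
filled["cases"] = filled["cases"].fillna(filled["cases"].mean())

# Sort from the highest to the lowest number of cases.
ranked = filled.sort_values("cases", ascending=False)

print(n_missing)
print(ranked)
```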
12:00 – 13:00 Lunch Break
13:00 – 14:00 Data Manipulation (Pandas continued, and Visualisation)
- Filter data and make conditional changes
- Saving the data
- Visualisation
- Scope (“canvas”) of Matplotlib
- Elements of a figure
- Types of figures: line, pie, histogram/bar, box-and-whisker
- Resize and save a figure
- Hands-on
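To make the visualisation bullets above concrete, here is a minimal sketch of a line plot that labels the main elements of a figure, then resizes and saves it. The data and the output name `cases.png` are invented. The `matplotlib.use("Agg")` line is only there so the script runs without a display and can be dropped inside Jupyter Notebook.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt

# Toy values, invented for illustration only.
weeks = [1, 2, 3, 4]
cases = [10, 12, 9, 15]

# The Figure is the whole "canvas"; the Axes is one plotting area on it.
fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(weeks, cases, marker="o", label="region A")
ax.set_title("Weekly cases (toy data)")
ax.set_xlabel("Week")
ax.set_ylabel("Cases")
ax.legend()

fig.set_size_inches(8, 4)   # resize after creation (width, height in inches)

out_path = os.path.join(tempfile.mkdtemp(), "cases.png")  # hypothetical file name
fig.savefig(out_path, dpi=100)
plt.close(fig)              # release the figure's memory
```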
14:00 – 14:15 Break
14:15 – 15:30 Interact with Web APIs
Use: OpenGWAS (semantic), EBI (RESTful), and PubMed (as homework, or in class only if there is time left, because the service is unreliable)
- Request and Response (What happens during this process)
- HTTP verbs
- Status codes and contents of responses
- Understanding API docs
- Timeout and error handling
- Hands-on
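The request, status code, timeout and error-handling bullets above could be combined roughly as follows. This sketch uses the standard library's `urllib` so it has no extra dependencies; the session itself may well have used a different client. The helper name `fetch_json` and its return-`None`-on-failure convention are assumptions made for this example.

```python
import json
import urllib.error
import urllib.request


def fetch_json(url, timeout=5.0):
    """Send a GET request and return the decoded JSON body, or None on failure.

    Hypothetical helper written for illustration only.
    """
    request = urllib.request.Request(
        url, headers={"Accept": "application/json"}, method="GET"
    )
    try:
        # `timeout` (seconds) stops the call from hanging on a slow server.
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 500.
        print(f"HTTP error {err.code}: {err.reason}")
    except urllib.error.URLError as err:
        # No usable answer at all: DNS failure, refused connection, etc.
        print(f"Could not reach the server: {err.reason}")
    except OSError as err:
        # Timeouts while reading the body and other socket-level errors.
        print(f"Network problem: {err}")
    except json.JSONDecodeError:
        # The response arrived but its body was not valid JSON.
        print("The response body was not valid JSON")
    return None
```

A caller would pass the endpoint URL taken from the API documentation and check for `None` before using the result.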
15:30 – 16:00 Miscellaneous (optional, depending on time)
NOTE: This part was not delivered because we spent more time in the morning reviewing what we learned on Day 1.
Performance and Best Practice
- Searching for an element in a Python list, dict and set
- Every little helps, but avoid "optimisations" that make the code slower or harder to read
- Lambda and df.apply(), df.assign()
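For the element-search bullet above, one way to see the difference is to time the same membership test on a list, a set and a dict. This is an illustrative sketch: the collection size and repeat count are arbitrary choices, and the timings will vary from machine to machine, so no expected numbers are shown.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)
as_dict = dict.fromkeys(as_list)   # keys are the numbers, values are None

target = n - 1   # last element: the list has to be scanned to the very end

# `in` on a list checks elements one by one; sets and dicts use hashing.
list_time = timeit.timeit(lambda: target in as_list, number=100)
set_time = timeit.timeit(lambda: target in as_set, number=100)
dict_time = timeit.timeit(lambda: target in as_dict, number=100)

print(f"list: {list_time:.5f}s  set: {set_time:.5f}s  dict: {dict_time:.5f}s")
```

Measuring before and after a change, as `timeit` does here, is also a simple guard against an optimisation that turns out to make things worse.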