This repository contains the session files and their accompanying solution files, used as learning material for the BSc Psychology "Big Data Analytics in Python" mini module.
All material is taught using the Jupyter Notebook web application and assumes access via the Anaconda Python distribution.
Files can be downloaded for local use via the "Clone or download" button at the top of the repository.
All content has been produced and taught by Jodie Lord. With thanks to Becki Green and Sagar Jilka, King's College London.
- Jupyter Notebook (Session1.ipynb) with learning material.
- Image files which feed into the session1 .ipynb document.
- Jupyter Notebook (session1_solutions.ipynb) with session 1 solutions included.
- .html file with content identical to the .ipynb, but in read-only format.
- Accessing and understanding Python
- Jupyter basics
- Variable assignment and types
- Basic arithmetic
- Lists and tuples
- Dictionaries
- Become familiar with the Jupyter environment
- Understand and use basic Python syntax
- Grasp basic data structures
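
As a rough illustration of the session 1 topics (variable assignment, types, arithmetic, lists, tuples and dictionaries), a minimal sketch follows. It is not taken from the session notebook; all variable names and values are hypothetical.

```python
# Variable assignment and types (all values are made up for illustration)
participant_id = "P01"   # str
age = 21                 # int
reaction_time = 0.5      # float, in seconds

# Basic arithmetic: convert seconds to milliseconds
reaction_time_ms = reaction_time * 1000

# Lists are mutable, so items can be added after creation
scores = [12, 15, 9]
scores.append(14)

# Tuples are immutable, which suits fixed pairs of values
condition_pair = ("control", "treatment")

# Dictionaries map keys to values
participant = {"id": participant_id, "age": age, "scores": scores}
mean_score = sum(participant["scores"]) / len(participant["scores"])

print(type(age).__name__, reaction_time_ms, mean_score)
```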
- Jupyter Notebook (Session2.ipynb) with learning material.
- Image files which feed into the session2 .ipynb document.
- 2 .csv files for pandas related tasks within the session2 learning material.
- Jupyter Notebook (Session2_Solutions.ipynb) with session 2 solutions included.
- .html file with content identical to the .ipynb, but in read-only format.
- For loops
- Python libraries
- The pandas library, including:
  - Reading in files
  - Basic data descriptives
  - Filtering
  - Merging
- Grasp the concept of loops and how these can be used to iterate through data
- Understand what Python libraries are and explore a key data science library: pandas
- Become familiar with some key techniques for working with data, e.g.:
  - Merging datasets
  - Working with missing data
  - Filtering and indexing
  - Aggregating data
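
The session 2 techniques listed above can be sketched roughly as follows. This is an illustrative example only, not the notebook's own code: the two small data frames are hypothetical stand-ins for the session's .csv files, and the column names are assumptions.

```python
import pandas as pd

# Hypothetical data standing in for the two session .csv files
participants = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "group": ["control", "treatment", "control", "treatment"],
})
scores = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "score": [10.0, None, 14.0, 18.0],
})

# A for loop iterating through the columns to count missing values
for column in scores.columns:
    print(column, scores[column].isna().sum())

# Merging datasets on a shared key
merged = participants.merge(scores, on="id", how="left")

# Working with missing data: drop rows with no score
complete = merged.dropna(subset=["score"])

# Filtering with a boolean condition
high = complete[complete["score"] > 12]

# Aggregating: mean score per group
group_means = complete.groupby("group")["score"].mean()
print(group_means)
```

With real files, the two data frames would instead be loaded with `pd.read_csv`.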
- Jupyter Notebook (Session3.ipynb) with learning material.
- Image files which feed into the session3 .ipynb document.
- 2 .csv files for pandas and seaborn related tasks.
- Jupyter Notebook (Session3_Solutions.ipynb) with session 3 solutions included.
- .html file with content identical to the .ipynb, but in read-only format.
- Python charting libraries
- Tests of normality, including:
  - Skewness
  - Kurtosis
  - Shapiro-Wilk
- Basic statistics, including:
  - T-tests
  - ANOVA
  - Pearson's correlation
- Explore the world of charting in Python via two key charting libraries:
  - Matplotlib
  - seaborn
- Learn how to carry out some basic statistics to allow for data exploration.
- Combine learning from charting libraries and statistics to produce meaningful visualisations for statistics produced.
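
For orientation, the normality checks and basic statistics named above could be run with SciPy along these lines. This is a sketch under assumptions: the source does not state which statistics library the session uses, and the three groups are simulated rather than drawn from the session's .csv files. Charting is omitted here.

```python
import numpy as np
from scipy import stats

# Simulated scores for three hypothetical groups (not the session data)
rng = np.random.default_rng(seed=0)
group_a = rng.normal(loc=50, scale=10, size=40)
group_b = rng.normal(loc=55, scale=10, size=40)
group_c = rng.normal(loc=60, scale=10, size=40)

# Tests of normality
skewness = stats.skew(group_a)
excess_kurtosis = stats.kurtosis(group_a)  # Fisher definition: 0 for a normal distribution
shapiro_stat, shapiro_p = stats.shapiro(group_a)

# Basic statistics
t_stat, t_p = stats.ttest_ind(group_a, group_b)              # independent-samples t-test
f_stat, anova_p = stats.f_oneway(group_a, group_b, group_c)  # one-way ANOVA
r, r_p = stats.pearsonr(group_a, group_b)                    # Pearson's correlation

print(f"Shapiro-Wilk p = {shapiro_p:.3f}, t-test p = {t_p:.3f}, ANOVA p = {anova_p:.3f}")
```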
- Jupyter Notebook (Session4.ipynb) with learning material.
- Image folder containing the images which feed into the session4 .ipynb document.
- 1 .csv file for charting and analytics revision tasks
- 1 .xls file for machine learning tasks.
- Jupyter Notebook (Session4_Solutions.ipynb) with session 4 solutions included.
- .html file with content identical to the .ipynb, but in read-only format.
- Introduction to machine learning with application in:
  - Random Forest
  - K nearest neighbours
- Recap of exercises and topics covered over the 4 weeks
- Grasp the basic concept of machine learning.
- Learn some (basic) machine learning techniques.
- Consolidate what we've learnt over the last 4 weeks.
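
To give a flavour of the two techniques named for session 4, here is a minimal sketch using scikit-learn. Both the library and the classification framing are assumptions, since the source does not specify them, and the data is synthetic rather than the session's .xls file.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic two-class dataset (a stand-in for the session data)
X, y = make_classification(n_samples=200, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# The two model families listed in the session topics
models = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "k_nearest_neighbours": KNeighborsClassifier(n_neighbors=5),
}

# Fit each model on the training split and score it on the held-out split
accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracy[name] = model.score(X_test, y_test)
    print(name, round(accuracy[name], 2))
```

Holding out a test split gives an accuracy estimate on data the models have not seen during fitting.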
- EXTRA_RESOURCES.pdf contains:
  - Clarification, with examples, of FAQs and common points of confusion raised in face-to-face workshops.
  - Extra resources to aid additional learning.
- Revision_Exercises_2020 contains:
  - A .ipynb file with revision exercises for topics covered across the 4 session materials.
  - Two data files and one image file which feed into tasks set within the revision exercises material.