markwk/ts4health

A Self Across Time: Time Series Data Analysis with Python

Slides and sample code for Time Series Data Analysis, Visualization, Modeling and Forecasting with Python for Health and Self

The talk provides code for time series analysis and modeling in general, then applies it to quantified-self and fitness tracking data from Fitbit, Apple Watch, or Oura.

Description

How to understand human health across time or an individual self over a lifetime?

In this presentation and code, we look at time series analysis in Python, a field that draws on statistics, machine learning, and deep learning, and at how it can be applied to tracking data like sleep and exercise from a Fitbit, Apple Watch, or Oura.

While most often applied to financial, sales, and weather data, time series analysis is also important when we think about health and self data, because before we can start modeling we need to be sure we adjust for any temporal components (like trends, seasonality, or serial correlation).
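As a rough illustration of what a temporal component looks like in practice, the sketch below builds a synthetic stand-in for daily step counts (not real tracking data), removes a linear trend, and measures serial correlation at two lags. The `autocorrelation` helper and all numbers are assumptions made for the example and are not taken from the talk's notebooks.

```python
import numpy as np

def autocorrelation(x, lag):
    """Sample autocorrelation of x at a positive integer lag."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    # Covariance between the series and its lagged copy, scaled by total variance.
    return float(np.dot(centered[:-lag], centered[lag:]) / np.dot(centered, centered))

# Synthetic daily step counts (hypothetical values): slow upward trend,
# a weekly cycle, and random noise.
rng = np.random.default_rng(0)
days = np.arange(364)
trend = 8000 + 5 * days
weekly = 1500 * np.sin(2 * np.pi * days / 7)
steps = trend + weekly + rng.normal(0, 300, size=days.size)

# Remove the linear trend first so the weekly cycle is easier to see.
slope, intercept = np.polyfit(days, steps, 1)
detrended = steps - (slope * days + intercept)

acf_7 = autocorrelation(detrended, 7)  # same weekday: expected to be strongly positive
acf_3 = autocorrelation(detrended, 3)  # mid-week offset: expected to be negative

print(f"estimated trend: {slope:.2f} steps/day")
print(f"autocorrelation at lag 7: {acf_7:.2f}, at lag 3: {acf_3:.2f}")
```

A large positive value at lag 7 is the kind of weekly seasonality that would need to be accounted for before fitting a model to such data.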

This talk and the accompanying code provide a high-level yet actionable overview of time series analysis in Python. We look at tests for temporal patterns (like autocorrelation plots and the Augmented Dickey-Fuller, or ADF, test) and at techniques for normalizing or detrending time series data in certain situations. We examine classic statistical time series modeling using Box-Jenkins (ARIMA) models, how to set their parameters, and whether they are helpful for our health and self data. Finally, we look at Facebook's Prophet for forecasting time series data, including health and fitness data.
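The idea behind differencing and a simple autoregressive model can be sketched in plain NumPy. This is a hand-rolled stand-in rather than the talk's code or a library ARIMA implementation: `fit_ar1` is a hypothetical helper that fits an AR(1) model by least squares, and the series is synthetic. An AR(1) fit on the first-differenced series corresponds to the ARIMA(1,1,0) special case.

```python
import numpy as np

def fit_ar1(x):
    """Least-squares fit of x[t] = c + phi * x[t-1] + noise; returns (c, phi)."""
    x = np.asarray(x, dtype=float)
    design = np.column_stack([np.ones(x.size - 1), x[:-1]])
    (c, phi), *_ = np.linalg.lstsq(design, x[1:], rcond=None)
    return float(c), float(phi)

# Synthetic series (hypothetical values): day-to-day changes follow an AR(1)
# process, and the observed level is their running sum, so it has no fixed mean.
rng = np.random.default_rng(1)
n = 500
true_phi = 0.6
changes = np.zeros(n)
for t in range(1, n):
    changes[t] = true_phi * changes[t - 1] + rng.normal(0, 1)
series = 60 + np.cumsum(changes)

# On the raw levels the lag-1 coefficient sits near 1, a unit-root warning sign
# (the condition the ADF test checks formally).
_, phi_raw = fit_ar1(series)

# First difference (the "I" step with d=1), then fit AR(1) on the differences.
diffed = np.diff(series)
c, phi = fit_ar1(diffed)

# One-step-ahead forecast: predict the next change, then add it to the last level.
forecast = series[-1] + c + phi * diffed[-1]

print(f"lag-1 coefficient on raw levels: {phi_raw:.3f}")
print(f"AR(1) coefficient after differencing: {phi:.3f} (simulated with {true_phi})")
print(f"last value: {series[-1]:.2f}, one-step forecast: {forecast:.2f}")
```

For real data, a library implementation with proper diagnostics would be the more sensible choice. The sketch only shows why differencing comes before fitting.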

How to understand a self across time? Time series analysis allows us to look at non-stationary data like personal data, and to transform that data into stationary data when needed. We can then look for patterns and meaning. It enables us to find relevant variables, plot recurring patterns, and even make forecasts about trends in our health or productivity. Ultimately, time series analysis becomes a powerful tool in the health analysis space for looking at how interventions (like a lifestyle change or treatment) have an impact at the individual (n=1) level.
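One common way to frame an n=1 intervention question is a segmented (interrupted time series) regression: fit a pre-existing trend plus a level shift at the intervention date. The sketch below is an assumption-laden illustration, not the author's method. The sleep data is synthetic, the variable names are hypothetical, and ordinary least squares ignores serial correlation in the residuals, which a fuller analysis would need to address.

```python
import numpy as np

# Synthetic nightly sleep hours (hypothetical values): a flat 6.5 h baseline
# with a +0.5 h jump once the intervention starts on day 100.
rng = np.random.default_rng(2)
n_days, start = 200, 100
t = np.arange(n_days)
after = (t >= start).astype(float)  # 0 before the intervention, 1 from its start
sleep = 6.5 + 0.5 * after + rng.normal(0, 0.2, n_days)

# Design matrix columns: intercept, underlying linear trend, level shift.
X = np.column_stack([np.ones(n_days), t, after])
coef, *_ = np.linalg.lstsq(X, sleep, rcond=None)
intercept, slope, level_shift = (float(v) for v in coef)

print(f"baseline: {intercept:.2f} h, trend: {slope:.4f} h/day")
print(f"estimated level shift after intervention: {level_shift:.2f} h")
```

Including the trend term keeps a gradual drift that predates the intervention from being mistaken for its effect.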

If we want to go beyond generic advice and personalize our medicine and healthy habits, we need to consider the temporal component, and time series data analysis can help.

Data

Data was collected with a combination of wearables (Apple Watch, Fitbit, and Oura). Data aggregation was done using QS Ledger, an open source Python project for collecting and visualizing self-tracking data (Fitbit, Apple Health, Oura, etc.). Each data set was then processed and aggregated into a standardized format. For additional information, refer to QS Ledger or see my previous talk, Python For Self-Trackers, for a walkthrough.

Sample data is not being provided openly at this time. Please contact the author if you are in need of reference data or are interested in further data or analysis collaboration.

References

Published References

  • Box & Jenkins. (2015). Time Series Analysis. John Wiley & Sons. (esp. Ch. 1-4)
  • Pal. (2017). Practical Time Series Analysis. Packt Publishing Ltd. (esp. Ch. 1-4)
  • Downey, A. (2015). Think Stats (2nd Edition). O’Reilly Media, Inc. (esp. Ch. 12)
  • Velicer. (2012). Time series analysis for psychological research. Handbook of Psychology, Second Edition. (Thorough introduction to time series for social scientists)
  • Aigner. (2011). Visualization of Time-Oriented Data. Springer Science & Business Media.

About Speaker

Mark Koester is a tech entrepreneur, writer, and technologist. His current work is at the intersection of data technologies and human health and optimization. As a data scientist and web and mobile app developer, he is the creator of PhotoStats.io (a photo tracking and analytics app), PodcastTracker (for logging podcast listening), Biomarker Tracker (a health analytics service to better understand blood test results), and QS Ledger (the most extensive open source personal data collection and analysis tool). He is a former Regional Lead in Greater China at Techstars, a seed-stage accelerator, and a former program coordinator at Startup Next (powered by Google for Entrepreneurs). He runs a boutique dev shop (Int3c.com) and is an active open source contributor. He regularly writes at www.markwk.com.