This notebook contains a pipeline for building noise detection features on pseudo-periodic signals.
We consider the problem of predicting regime changes under noise in time series data. The full story is in the blog post on Towards Data Science.
The goal is to recover the orange signal from the blue one.
Below is the Duffing oscillator, the circuit we used to generate the blue signal. The blue signal is obtained by varying the A parameter of the Duffing oscillator and then adding noise.
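As a rough illustration of how such a signal could be generated, here is a minimal NumPy sketch. It assumes the common form of the Duffing equation, x'' + delta*x' + alpha*x + beta*x^3 = A*cos(omega*t), and reads the "A parameter" as the forcing amplitude. The coefficient values, the two-level schedule for A, the noise scale and the function names are all hypothetical and are not taken from the notebook.

```python
import numpy as np

def duffing_rhs(t, state, amplitude, delta=0.3, alpha=-1.0, beta=1.0, omega=1.2):
    """Right-hand side of x'' + delta*x' + alpha*x + beta*x**3 = amplitude*cos(omega*t).

    The default coefficients are illustrative assumptions only.
    """
    x, v = state
    acceleration = -delta * v - alpha * x - beta * x**3 + amplitude * np.cos(omega * t)
    return np.array([v, acceleration])

def simulate_duffing(amplitudes, dt=0.01, x0=0.1, v0=0.0):
    """Integrate with a fixed-step RK4 scheme; amplitudes[i] is the value of A at step i."""
    state = np.array([x0, v0], dtype=float)
    xs = np.empty(len(amplitudes))
    for i, a in enumerate(amplitudes):
        t = i * dt
        k1 = duffing_rhs(t, state, a)
        k2 = duffing_rhs(t + dt / 2, state + dt / 2 * k1, a)
        k3 = duffing_rhs(t + dt / 2, state + dt / 2 * k2, a)
        k4 = duffing_rhs(t + dt, state + dt * k3, a)
        state = state + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        xs[i] = state[0]
    return xs

rng = np.random.default_rng(0)
n_steps = 5000

# Hypothetical regime schedule: A takes one value in the first half, another in the second.
amplitudes = np.where(np.arange(n_steps) < n_steps // 2, 0.2, 0.5)
regime = (amplitudes > 0.3).astype(int)  # binary label marking the regime

clean = simulate_duffing(amplitudes)
noisy = clean + rng.normal(scale=0.5, size=n_steps)  # noise level is an assumption
```

Here `noisy` plays the role of the observed signal and `regime` the role of the target to recover; the actual notebook may define both differently.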
To get an idea of which feature sets best predict regime changes, we build four models for a binary classification task. Each model uses a different set of features: two sets without TDA features, one with only TDA features, and one combining all the features.
In the high-noise regime, TDA features yielded a significant performance boost over standard feature strategies. TDA features not only outperform the standard strategies on their own; they also provide a clear performance boost on top of the standard strategies when the two are combined.
- Total number of holes: for every time window we calculate a persistence diagram. This allows us to build the Betti surface, which counts the number of holes present in the data as a function of epsilon and time.
- Relevant holes feature: counts the number of holes over a given threshold size (more than 70% of the maximum value).
- Amplitude of the diagram feature: we use the diagram norm as a measure of the total persistence of all the holes in the diagram.
- Mean support feature: the mean of the epsilon distances yielding non-zero Betti values in the Betti surface.
- ArgMax feature: for each time window, the value of epsilon at which the Betti number is highest.
- Average lifetime feature: for each dimension we take the average lifetime of a hole in the persistence diagram (i.e. the Betti surface at a fixed time).
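The features listed above can be sketched for a single time window and a single homology dimension as follows. This is a minimal NumPy illustration, not the notebook's implementation. It assumes the persistence diagram is given as an array of (birth, death) pairs, takes the "maximum value" in the relevant holes feature to be the longest lifetime in the diagram, and uses the Euclidean norm of the lifetimes as a stand-in for the diagram norm. The function name `diagram_features` and its parameters are hypothetical.

```python
import numpy as np

def diagram_features(diagram, epsilons, relevance_ratio=0.7):
    """Summary features of one persistence diagram (one window, one homology dimension).

    diagram:  array-like of shape (n, 2) with (birth, death) pairs.
    epsilons: 1-D grid of epsilon values on which the Betti curve is sampled.
    """
    diagram = np.asarray(diagram, dtype=float).reshape(-1, 2)
    epsilons = np.asarray(epsilons, dtype=float)
    births, deaths = diagram[:, 0], diagram[:, 1]
    lifetimes = deaths - births

    # Drop points on the diagonal: they do not correspond to holes.
    alive = lifetimes > 0
    births, deaths, lifetimes = births[alive], deaths[alive], lifetimes[alive]
    has_holes = lifetimes.size > 0

    # Betti curve: number of holes that are born at or before epsilon and die after it.
    betti = ((births[None, :] <= epsilons[:, None])
             & (epsilons[:, None] < deaths[None, :])).sum(axis=1)
    support = epsilons[betti > 0]

    return {
        "n_holes": int(lifetimes.size),
        "n_relevant_holes": int((lifetimes > relevance_ratio * lifetimes.max()).sum())
        if has_holes else 0,
        # Assumption: Euclidean norm of the lifetimes as a simple diagram norm.
        "amplitude": float(np.sqrt((lifetimes**2).sum())),
        "mean_support": float(support.mean()) if support.size else 0.0,
        "argmax": float(epsilons[betti.argmax()]) if has_holes else 0.0,
        "avg_lifetime": float(lifetimes.mean()) if has_holes else 0.0,
    }

# Toy diagram: one long-lived hole, one short-lived hole, one diagonal point.
toy_diagram = [[0.15, 0.85], [0.25, 0.45], [0.30, 0.30]]
features = diagram_features(toy_diagram, epsilons=np.linspace(0.0, 1.0, 11))
print(features)
```

Stacking the Betti curves of consecutive windows gives the Betti surface, so each of these values becomes one entry of a per-window feature series.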
To create the TDA features, we embed our time series into a higher-dimensional space using the Takens’ embedding. Each step of the rolling window is converted into a single vector in the higher-dimensional space, whose dimension is the size of the window.
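A minimal NumPy sketch of this sliding-window embedding is shown below. The function name `takens_embedding` and the `time_delay` parameter are illustrative assumptions; with `time_delay=1` each window of `dimension` consecutive samples becomes one point, as described above. The notebook itself may rely on a library implementation instead.

```python
import numpy as np

def takens_embedding(signal, dimension, time_delay=1):
    """Map a 1-D signal to points (x[t], x[t + tau], ..., x[t + (d - 1) * tau])."""
    signal = np.asarray(signal, dtype=float)
    n_points = len(signal) - (dimension - 1) * time_delay
    if n_points <= 0:
        raise ValueError("signal is too short for this dimension and time delay")
    # Row i holds the sample indices of the i-th window.
    idx = np.arange(n_points)[:, None] + time_delay * np.arange(dimension)[None, :]
    return signal[idx]

t = np.linspace(0, 4 * np.pi, 200)
cloud = takens_embedding(np.sin(t), dimension=3, time_delay=5)
print(cloud.shape)  # (190, 3)
```

Each row of `cloud` is one point of the point cloud on which a persistence diagram can then be computed.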