Preprocessing non-contiguous segments #171
Comments
I don't see any activity here, but I'm wondering if this may have been addressed since Feb?
Hi @kb1ooo! It's still in the works.
@sarahmish thanks. Is there some work on it checked into a branch?
There isn't an active branch for this case. The primary change for this feature is in the rolling_window_sequences primitive. It currently works by slicing based on indexes. To make this change, we need to introduce slicing by timestamps and using a …
@sarahmish ok right. Is there a simpler intermediate version where the data is basically pre-segmented (i.e. the segmentation logic is not delegated to Orion but is the responsibility of the caller), and you would pass the data as, say, a list of dataframes instead of one dataframe? Then just iterate through the list, applying the same pipeline to each dataframe, and concatenate the resulting rolling_window_sequences outputs.
@kb1ooo that's definitely possible. Mechanically, you can just iterate over each dataframe calling …
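As a rough illustration of the pre-segmented approach discussed above, the sketch below windows each dataframe separately and then concatenates the results, so no window spans the gap between two segments. It does not call Orion or MLPrimitives: `make_windows` and `windows_from_segments` are hypothetical helpers written for this example, and the `timestamp` / `value` column names are assumed.

```python
import numpy as np
import pandas as pd


def make_windows(values, window_size, step_size=1):
    """Hypothetical stand-in for a rolling-window step: slide a fixed-size
    window over a 1-D array and stack the windows row by row."""
    values = np.asarray(values, dtype=float)
    starts = range(0, len(values) - window_size + 1, step_size)
    windows = [values[s:s + window_size] for s in starts]
    if not windows:
        # Segment shorter than one window: contribute no rows.
        return np.empty((0, window_size))
    return np.stack(windows)


def windows_from_segments(segments, window_size, step_size=1, column="value"):
    """Window each pre-segmented dataframe on its own, then concatenate."""
    per_segment = [
        make_windows(df[column].to_numpy(), window_size, step_size)
        for df in segments
    ]
    return np.concatenate(per_segment, axis=0)


# Two contiguous segments separated by a large gap in time (made-up data).
segments = [
    pd.DataFrame({"timestamp": [0, 1, 2, 3, 4],
                  "value": [0.0, 1.0, 2.0, 3.0, 4.0]}),
    pd.DataFrame({"timestamp": [100, 101, 102],
                  "value": [10.0, 11.0, 12.0]}),
]
windows = windows_from_segments(segments, window_size=3)
```

Because each segment is windowed independently, the first segment contributes three windows and the second contributes one; none of them mixes values from both sides of the gap.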
Currently most pipelines share the same preprocessing primitives, applied in the following order:
1. mlprimitives.custom.timeseries_preprocessing.time_segments_aggregate: makes the signal equi-spaced based on the specified interval.
2. sklearn.impute.SimpleImputer: imputes missing values.
3. sklearn.preprocessing.MinMaxScaler: normalizes the data to a specified range.
4. mlprimitives.custom.timeseries_preprocessing.rolling_window_sequences: creates multiple training window examples based on the window_size and step_size.
However, we do not always want to make the signal equi-spaced; sometimes we would rather retain the gaps within the signal. For this task, there are two main considerations.
max_gap, then for each segment apply primitives 1, 2 & 4 shown above, then concatenate them together. The sequence of preprocessing primitives would be:
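For the segmentation step itself, a minimal sketch of splitting a signal wherever consecutive timestamps are more than max_gap apart might look like the following. This is an illustration with pandas only, not an existing Orion or MLPrimitives primitive: `split_on_gaps` is a hypothetical helper, and the `timestamp` / `value` column names and the sample data are assumptions.

```python
import pandas as pd


def split_on_gaps(df, max_gap, time_column="timestamp"):
    """Split a dataframe into contiguous segments wherever two consecutive
    timestamps are more than `max_gap` apart (hypothetical helper)."""
    df = df.sort_values(time_column).reset_index(drop=True)
    # The first row's diff is NaN, which compares as False, so it never
    # starts a new segment by itself.
    starts_new_segment = df[time_column].diff() > max_gap
    # Cumulative count of gap markers gives each row its segment id.
    segment_ids = starts_new_segment.cumsum()
    return [
        segment.reset_index(drop=True)
        for _, segment in df.groupby(segment_ids)
    ]


# Made-up signal with one large gap between timestamps 20 and 500.
signal = pd.DataFrame({
    "timestamp": [0, 10, 20, 500, 510, 520, 530],
    "value": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
})
segments = split_on_gaps(signal, max_gap=60)
```

Each returned segment can then be preprocessed on its own and the resulting windows concatenated afterwards.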