Improve time series filtering based on cutoff, horizon and min_context_length #18
Issue #, if available:
This PR makes it easier to work with datasets where some time series are too short for the specific `Task` configuration.

Before this PR
Previously, we had the argument `min_ts_length` (defaults to `horizon + 1`) to deal with such datasets. The filtering logic was as follows:

- Remove time series with fewer than `min_ts_length` observations.
- Split each remaining time series at `cutoff`. If the part before `cutoff` has < 1 observations OR the part after `cutoff` has < `horizon` observations, raise an exception.

For example, if some time series are really long but have no observations before the `cutoff`, we would run into errors. It's not trivial to filter them out by setting `min_ts_length`, especially if different time series in the dataset have different lengths.

This PR
We replace the `min_ts_length` argument with `min_context_length` (defaults to 1).

We change the filtering logic to remove time series if they have:

- fewer than `min_context_length` observations before `cutoff`, or
- fewer than `horizon` observations after `cutoff`.
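
For illustration, the rule can be sketched in plain Python. This is not the code from this PR: `keep_series` and `filter_dataset` are hypothetical helpers, a dataset is modelled as a dict of timestamp lists, and treating an observation exactly at `cutoff` as context is an assumption.

```python
from datetime import datetime, timedelta


def keep_series(timestamps, cutoff, horizon, min_context_length=1):
    """Return True if a series has enough observations on both sides of cutoff."""
    # Assumption: observations at or before `cutoff` count as context,
    # observations strictly after `cutoff` count toward the horizon.
    num_context = sum(1 for t in timestamps if t <= cutoff)
    num_future = len(timestamps) - num_context
    return num_context >= min_context_length and num_future >= horizon


def filter_dataset(dataset, cutoff, horizon, min_context_length=1):
    """Drop series that are too short instead of raising an exception."""
    return {
        item_id: ts
        for item_id, ts in dataset.items()
        if keep_series(ts, cutoff, horizon, min_context_length)
    }


def daily(start_day, num_obs):
    """Daily timestamps in January 2024 onwards, starting at the given day."""
    first = datetime(2024, 1, start_day)
    return [first + timedelta(days=i) for i in range(num_obs)]


dataset = {
    "A": daily(1, 15),   # 10 observations up to cutoff, 5 after
    "B": daily(11, 30),  # long series, but nothing before cutoff
    "C": daily(5, 7),    # 6 observations up to cutoff, only 1 after
}
cutoff = datetime(2024, 1, 10)

kept = filter_dataset(dataset, cutoff, horizon=3)
print(sorted(kept))
```

Series "B" is the case described above: it is long, but has no observations before `cutoff`, so a length-based filter alone would not catch it.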
These changes are 100% backwards compatible with the old behavior, but now make it much easier to work with datasets where time series have wildly different lengths / cover different time periods. Specifically:
Other changes
`0.5.0rc1` for the pre-release

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.