TFT Beginner needs some insights #1347
Unanswered
ankur-connect asked this question in Q&A
Replies: 0 comments
Hi all,
I'm working on a large-scale time series forecasting problem using Temporal Fusion Transformers (TFTs), and I’d appreciate guidance on both the modeling philosophy and practical setup.
Philosophical Concern — Does the premise hold?
Forecasting future revenue using:
~2 years of past daily data per time series (~750 days)
The only future-known variables are macroeconomic conditions, and those are themselves estimates
Scale: ~100,000 time series
Concern:
Revenue (or demand) is often influenced by future events, not just past patterns
Is it philosophically sound to rely on past data plus very little future context to predict something that is highly forward-looking? Is there any EDA I can do to assess the feasibility of this forecasting problem?
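One possible feasibility check, offered as a sketch under assumptions rather than an established answer: score a seasonal-naive forecast (the value from 7 days earlier) against a last-value forecast on a held-out tail of each series. A ratio well below 1 suggests that past patterns alone carry usable signal for that series. The weekly season length, the 90-day holdout, and the two synthetic series are illustrative assumptions, not details from the actual dataset.

```python
import math
import random

def mae_of_lag_forecast(series, lag, holdout):
    """MAE over the last `holdout` points when each point is predicted
    by the value observed `lag` steps earlier."""
    start = len(series) - holdout
    errors = [abs(series[t] - series[t - lag]) for t in range(start, len(series))]
    return sum(errors) / len(errors)

def seasonal_skill(series, season=7, holdout=90):
    """Ratio of seasonal-naive MAE to last-value MAE.
    Below 1 means the seasonal lag predicts better than yesterday's value."""
    naive = mae_of_lag_forecast(series, 1, holdout)
    seasonal = mae_of_lag_forecast(series, season, holdout)
    return seasonal / naive if naive > 0 else float("nan")

# Synthetic 750-day series (illustrative only).
random.seed(0)
weekly = [100 + 30 * math.sin(2 * math.pi * t / 7) + random.gauss(0, 5)
          for t in range(750)]   # strong weekly pattern plus noise
noise = [100 + random.gauss(0, 5) for _ in range(750)]  # no structure at all

weekly_skill = seasonal_skill(weekly)
noise_skill = seasonal_skill(noise)
print(f"weekly series skill: {weekly_skill:.2f}")
print(f"noise series skill:  {noise_skill:.2f}")
```

Computed per series across the real data, the distribution of this ratio might give a rough picture of how many series are predictable from history alone, before any TFT is trained.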
Feature Engineering — Am I feeding junk?
Engineered ~250 features including:
Common stat features (mean, std, % change)
Custom types like TVKR, TVUR, SC, SR, etc.
All derived from a small set of primary sources
Questions:
How well does TFT handle large, highly correlated feature sets?
Should I do feature selection beforehand, or let the model learn which features matter? Does this kind of feature engineering even make sense for a TFT?
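To put a number on how redundant a feature set derived from a few primary sources is, one simple option is a greedy correlation filter. The sketch below is a minimal illustration, not a method prescribed for TFT: the 0.95 threshold and the feature names (`revenue`, `revenue_scaled`, and so on) are hypothetical, and Pearson correlation only captures linear redundancy.

```python
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / (vx * vy) ** 0.5

def greedy_decorrelate(features, threshold=0.95):
    """Keep a feature only if its |r| with every already-kept feature
    is at or below `threshold`. `features` maps name -> list of values."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical features: three derived from one source, one independent.
random.seed(0)
base = [random.gauss(0, 1) for _ in range(500)]
features = {
    "revenue": base,
    "revenue_scaled": [2 * v + 1 for v in base],                  # exact linear copy
    "revenue_noisy": [v + random.gauss(0, 0.05) for v in base],   # near copy
    "independent": [random.gauss(0, 1) for _ in range(500)],
}
kept = greedy_decorrelate(features)
print(kept)
```

The share of the ~250 features that survives such a filter gives a rough sense of how much independent information the set actually contains.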
Cohorting / Clustering — Should I segment first?
Considering clustering the time series before modeling:
Based on seasonality, trend shape, or statistical profile
Idea: train one TFT per cluster
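The cohorting idea above can be sketched as follows: reduce each series to a small statistical profile, cluster the profiles, and group series IDs by cluster so that one model could be trained per group. This is a minimal sketch under assumptions: the two profile features (weekly autocorrelation of first differences, relative drift), the plain k-means, `k=2`, and the synthetic series are all illustrative choices, not part of the original setup.

```python
import math
import random

def profile(series, season=7):
    """(seasonality, trend) profile of one series."""
    # Seasonality: lag-`season` autocorrelation of first differences,
    # so a plain linear trend does not register as seasonality.
    diffs = [b - a for a, b in zip(series, series[1:])]
    m = sum(diffs) / len(diffs)
    var = sum((d - m) ** 2 for d in diffs)
    cov = sum((diffs[t] - m) * (diffs[t - season] - m)
              for t in range(season, len(diffs)))
    acf = cov / var if var > 0 else 0.0
    # Trend: last-quarter mean minus first-quarter mean, relative to the level.
    q = len(series) // 4
    level = sum(series) / len(series)
    drift = (sum(series[-q:]) - sum(series[:q])) / q
    trend = drift / level if level else 0.0
    return (acf, trend)

def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm, seeded with the first k points."""
    centers = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

# Synthetic data (illustrative): even IDs are seasonal, odd IDs are trending.
random.seed(0)
def seasonal_series():
    return [100 + 30 * math.sin(2 * math.pi * t / 7) + random.gauss(0, 5)
            for t in range(750)]
def trending_series():
    return [50 + 0.2 * t + random.gauss(0, 5) for t in range(750)]

series = {f"s{i}": (seasonal_series() if i % 2 == 0 else trending_series())
          for i in range(10)}
labels = kmeans([profile(v) for v in series.values()], k=2)

cohorts = {}  # cluster label -> series IDs; one model per entry
for name, label in zip(series, labels):
    cohorts.setdefault(label, []).append(name)
print(cohorts)
```

With real data the profile features would likely need scaling before clustering, and the number of clusters would have to be chosen by validation rather than fixed up front.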
Questions:
Does clustering help reduce variance or improve convergence with TFTs?
Or is it better to let a global TFT learn from all series together?
Practical Setup — What are the right starting knobs?
Dataset size:
~100,000 time series × 750 timesteps = ~75 million rows
~250 features (past + known future + static)
Planning to use neuralforecast
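For a sense of scale, the figures above work out to about 18.75 billion feature values, or roughly 70 GiB if stored as a dense float32 table. The sketch below reproduces that arithmetic. The float32 storage, the 90-day input window, and the 30-day horizon are assumptions for illustration only. The dense figure is an upper bound, since static features need not be repeated on every row.

```python
# Figures taken from the post.
n_series = 100_000
n_steps = 750
n_features = 250

# Assumptions (not from the post).
bytes_per_value = 4   # float32 storage
input_size = 90       # hypothetical lookback window, in days
horizon = 30          # hypothetical forecast horizon, in days

rows = n_series * n_steps
values = rows * n_features
dense_gib = values * bytes_per_value / 2**30

# Number of distinct (lookback, horizon) training windows available.
windows_per_series = n_steps - input_size - horizon + 1
total_windows = n_series * windows_per_series

print(f"rows:            {rows:,}")
print(f"feature values:  {values:,}")
print(f"dense size:      {dense_gib:.1f} GiB")
print(f"training windows: {total_windows:,}")
```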
Questions:
Is there a way to estimate reasonable starting hyperparameters given the data dimensions? Does using a TFT even make sense at this scale?