[PDR][Lake] Update predictPredictions and predictSlots to feature lastEventTimestamp #760
Comments
Here is an example where prediction.timestamp < payout.timestamp.
Adding where clauses to each entity timestamp will work like an
So rather than
we have a single param, lastUpdateTimestamp, to filter on, which already takes the max of whatever timestamp happened last.
One step towards improving performance is using predictPayouts as part of the predictPredictions query. This enables us to fetch all recent predictions and payouts. Truevals however, must be fetched in a separate query.
Motivation
As part of Incremental Lake + Analytics we want to make it easier to fetch/build our data lake incrementally.
To do so, for [1:n_event] entity tables like predictPredictions and predictSlots, we introduce a new parameter lastEventTimestamp.
lastEventTimestamp
I had outlined it as Solution B (and we're still working towards having all data saved on local_lake, but I think this will still reduce a lot of steps/work).
The last event to be processed updates the param like so:
lastEventTimestamp = max(lastEventTimestamp, newEventTimestamp)
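As a rough illustration of the update rule above, here is a minimal Python sketch. The `Entity` class, the `apply_event` helper, and the sample timestamps are hypothetical and exist only for this example; the real entities live in the subgraph and may be shaped differently.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    """Hypothetical stand-in for a [1:n_event] entity (e.g. a prediction or a slot)."""
    id: str
    last_event_timestamp: int = 0


def apply_event(entity: Entity, new_event_timestamp: int) -> Entity:
    # Keep the largest timestamp seen so far, so an event that is
    # processed out of order never moves the value backwards.
    entity.last_event_timestamp = max(entity.last_event_timestamp, new_event_timestamp)
    return entity


# Three events touch the same entity; the third one carries an older timestamp.
prediction = Entity(id="hypothetical-prediction-id")
for ts in (1700000000, 1700000300, 1700000100):
    apply_event(prediction, ts)

print(prediction.last_event_timestamp)  # 1700000300
```

Because the rule is a running maximum, the order in which events are processed does not change the final value.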
This param can then be filtered via lastEventTimestamp_lte or lastEventTimestamp_gte, and enables us to approach all of this with less work and more ease.
DoD:
- lastEventTimestamp and is updated as described above
- lastEventTimestamp and is updated as described above
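To make the filtering idea concrete, the sketch below builds a subgraph query string that selects entities whose lastEventTimestamp falls inside a window. It assumes the field has been added as proposed in this issue; the `build_incremental_query` helper, the selected fields, and the paging defaults are illustrative assumptions, not existing project code.

```python
def build_incremental_query(
    entity: str,
    start_ts: int,
    end_ts: int,
    first: int = 1000,
    skip: int = 0,
) -> str:
    """Return a GraphQL query for entities updated within [start_ts, end_ts].

    Assumes the entity exposes `lastEventTimestamp`, as proposed in this issue.
    """
    return f"""
{{
  {entity}(
    where: {{ lastEventTimestamp_gte: {start_ts}, lastEventTimestamp_lte: {end_ts} }}
    orderBy: lastEventTimestamp
    orderDirection: asc
    first: {first}
    skip: {skip}
  ) {{
    id
    lastEventTimestamp
  }}
}}
"""


# Fetch everything that changed since the last lake update (timestamps are made up).
query = build_incremental_query("predictPredictions", 1700000000, 1700003600)
print(query)
```

With this shape, an incremental lake update would only need to remember the last timestamp it fetched and pass it as `start_ts` on the next run.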