[PDR][Lake] Update predictPredictions and predictSlots to feature lastEventTimestamp #760

Open
2 tasks
idiom-bytes opened this issue Jan 3, 2024 · 2 comments
Labels
Type: Enhancement New feature or request

Comments

@idiom-bytes
Member

idiom-bytes commented Jan 3, 2024

Motivation

As part of Incremental Lake + Analytics we want to make it easier to fetch/build our data lake incrementally.

To do so, for [1:n_event] entity tables like predictPredictions and predictSlots, we introduce a new parameter, lastEventTimestamp.

lastEventTimestamp

I had outlined it as Solution B (and we're still working towards having all data saved on local_lake, but I think this will still reduce a lot of steps/work).

Each processed event updates the param so that it always holds the timestamp of the latest event: lastEventTimestamp = max(lastEventTimestamp, newEventTimestamp)

This param can then be filtered via lastEventTimestamp_lte or lastEventTimestamp_gte, which lets an incremental fetch filter on one field instead of one timestamp per event type.
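A minimal Python sketch of the update rule and the two filters described above. The `Prediction` dataclass and the `apply_event` and `filter_by_last_event` helpers are hypothetical names used for illustration, not existing code; only the max rule and the `_gte` / `_lte` semantics come from this issue.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Prediction:
    # Hypothetical stand-in for one predictPredictions row.
    id: str
    last_event_timestamp: int = 0


def apply_event(row: Prediction, new_event_timestamp: int) -> Prediction:
    # lastEventTimestamp = max(lastEventTimestamp, newEventTimestamp)
    row.last_event_timestamp = max(row.last_event_timestamp, new_event_timestamp)
    return row


def filter_by_last_event(
    rows: Iterable[Prediction],
    gte: Optional[int] = None,
    lte: Optional[int] = None,
) -> List[Prediction]:
    # Mirrors lastEventTimestamp_gte / lastEventTimestamp_lte (both inclusive).
    kept = []
    for row in rows:
        if gte is not None and row.last_event_timestamp < gte:
            continue
        if lte is not None and row.last_event_timestamp > lte:
            continue
        kept.append(row)
    return kept
```

Because the rule takes a max, events that arrive out of order never move the value backwards.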

DoD:

  • predictPredictions table features lastEventTimestamp and is updated as described above
  • predictSlots table features lastEventTimestamp and is updated as described above
@idiom-bytes idiom-bytes added the Type: Enhancement New feature or request label Jan 3, 2024
@idiom-bytes idiom-bytes changed the title [predictPredictions][last_event_timestamp] - Update predictPredictions schema to enable last_event_timestamp and improved filtering --- Jan 3, 2024
@idiom-bytes idiom-bytes changed the title --- [PDR][Lake] Update predictPredictions and predictSlots to feature lastEventTimestamp Jan 17, 2024
@idiom-bytes
Member Author

Here is an example where prediction.timestamp < payout.timestamp

Adding a where clause on each entity's timestamp acts as an && across the filters (3x filtering/scan), whereas what we really want is an || across all timestamps, done efficiently (1x filter).

So rather than

  • where timestamp_gte &&
  • where trueval.timestamp_gte &&
  • where payout.timestamp_gte

Screenshot from 2024-01-17 08-07-10

We have a single param, lastEventTimestamp, to filter on, which already holds the max of all the event timestamps.
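To make the && versus || point concrete, here is a small Python sketch. The function names and the example timestamps are made up for illustration; `None` is assumed to mean the trueval or payout has not happened yet.

```python
from typing import Optional, Sequence

# (prediction, trueval, payout) timestamps; None = event not seen yet (assumption).
Timestamps = Sequence[Optional[int]]


def matches_and(timestamps: Timestamps, since: int) -> bool:
    # Three separate where clauses: every timestamp must pass.
    return all(ts is not None and ts >= since for ts in timestamps)


def matches_or(timestamps: Timestamps, since: int) -> bool:
    # What we actually want: any event at or after `since`.
    return any(ts is not None and ts >= since for ts in timestamps)


def last_event_timestamp(timestamps: Timestamps) -> int:
    # The single stored value: max over the events that have happened.
    return max((ts for ts in timestamps if ts is not None), default=0)


def matches_last_event(timestamps: Timestamps, since: int) -> bool:
    # One comparison on one field, equivalent to matches_or.
    return last_event_timestamp(timestamps) >= since
```

For a row with timestamps `(50, 120, 150)` and `since=100`, `matches_and` rejects the row because the prediction itself is older than the cursor, while `matches_or` and `matches_last_event` both keep it.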

@idiom-bytes
Member Author

One step towards improving performance is using predictPayouts as part of the predictPredictions query. This enables us to fetch all recent predictions and payouts.

Truevals, however, must be fetched in a separate query.

query {
  predictPredictions(where: { or: [{ timestamp_gt: 100 }, { payout_: { timestamp_gt: 100 } }] }) {
    slot {
      id
    }
    user {
      id
    }
    stake
    payout {
      timestamp
      id
    }
  }
}
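
As a sketch of how the hard-coded `100` in the query above might be parameterized when fetching incrementally, the Python helper below fills in the cursor timestamp. `build_predictions_query` is a hypothetical name, and sending the query to the subgraph is deliberately left out.

```python
from string import Template

# Same query shape as above; $ts is a Template placeholder, not a GraphQL variable.
_PREDICTIONS_QUERY = Template("""
query {
  predictPredictions(
    where: { or: [{ timestamp_gt: $ts }, { payout_: { timestamp_gt: $ts } }] }
  ) {
    slot { id }
    user { id }
    stake
    payout { id timestamp }
  }
}
""")


def build_predictions_query(since_timestamp: int) -> str:
    # int() guards against accidentally interpolating arbitrary strings.
    return _PREDICTIONS_QUERY.substitute(ts=int(since_timestamp))
```

Calling `build_predictions_query(100)` reproduces the query shown in this comment, with both `timestamp_gt` filters set to the same cursor.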
