Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Load Testing] gherkin load test follow-up #521

Merged
merged 136 commits into from
May 17, 2024

Conversation

red-0ne
Copy link
Contributor

@red-0ne red-0ne commented May 8, 2024

Summary

  • Add batched Proofs and batched Claims txs to support concurrent sessions
  • Add Claims and Proofs counting and assertions
  • Some fixes to further stabilize the tests
  • Fix most of the RelayMiner's go-routine leaks

Issue

Type of change

Select one or more:

  • New feature, functionality or library
  • Bug fix
  • Code health or cleanup
  • Documentation
  • Other (specify)

Testing

Documentation changes (only if making doc changes)

  • make docusaurus_start; only needed if you make doc changes

Local Testing (only if making code changes)

  • Unit Tests: make go_develop_and_test
  • LocalNet E2E Tests: make test_e2e
  • See quickstart guide for instructions

PR Testing (only if making code changes)

  • DevNet E2E Tests: Add the devnet-test-e2e label to the PR.
    • THIS IS VERY EXPENSIVE, so only do it after all the reviews are complete.
    • Optionally run make trigger_ci if you want to re-trigger tests without any code changes
    • If tests fail, try re-running failed tests only using the GitHub UI as shown here

Sanity Checklist

  • I have tested my changes using the available tooling
  • I have commented my code
  • I have performed a self-review of my own code; both comments & source code
  • I create and reference any new tickets, if applicable
  • I have left TODOs throughout the codebase, if applicable

red-0ne and others added 30 commits March 1, 2024 15:08
…roof-integration

* pokt/feat/proof-validation:
  chore: Add godoc comments
  chore: Update keeper query clients in-code documentation
  chore: Address review change requests
  [PubKeyClient] Implement PubKeyClient for on/off-chain usage (#413)
…roof-integration

* pokt/feat/proof-validation:
  fix: test error msg assertion
(cherry picked from commit eab6d984e65a9a07014c9a9839e7b55d125147a1)
* pokt/main:
  [LocalNet] Fix/localnet regenesis (#424)
  [CI] Delete validator pod if stuck during E2E test (#415)
  Fixing reletive link in documentation
  [Docs] Add code review guidelines & developer tips (#426)
  [Quick change] Bump default gas limit (#431)
Co-authored-by: Daniel Olshansky <[email protected]>
Co-authored-by: h5law <[email protected]>
* pokt/main:
  A couple comment improvements
  Update google drive video to youtube video
Copy link
Contributor

@bryanchriswhite bryanchriswhite left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is 🔥

Just a few more small suggestions but otherwise, this is good to go IMO. 🙌

load-testing/tests/relays_stress_helpers_test.go Outdated Show resolved Hide resolved
load-testing/tests/relays_stress_helpers_test.go Outdated Show resolved Hide resolved
load-testing/tests/relays_stress_helpers_test.go Outdated Show resolved Hide resolved
load-testing/tests/relays_stress_helpers_test.go Outdated Show resolved Hide resolved
load-testing/tests/relays_stress_helpers_test.go Outdated Show resolved Hide resolved
load-testing/tests/relays_stress_helpers_test.go Outdated Show resolved Hide resolved
pkg/relayer/session/claim.go Outdated Show resolved Hide resolved
pkg/relayer/session/proof.go Outdated Show resolved Hide resolved
pkg/relayer/session/proof.go Outdated Show resolved Hide resolved
Copy link
Member

@Olshansk Olshansk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a few blocking comments that need to be resolved but have to say that the comments are absolute 🔥.

Given the necessary complexity of this, it's the perfect balance!

pkg/relayer/session/session.go Outdated Show resolved Hide resolved
// waiting for block to be committed.
if len(lateSessions) > 0 {
sessionsToClaimsPublishCh <- lateSessions
// If there are any late sessions to be claimed, emit them first.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this. I assumed late sessions are too late and shouldn't be claimed, yet here we are trying to claim them.

This makes me question the rest of the business logic in this function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IsWithinGracePeriod is checking for sessions to claim with <= operator. Which means that it would catch sessions that were supposed to be claimed in previous block heights too.
These late sessions might have their create claim window closed and are no longer eligible to be claimed, but that's not always the case.
Once claim window closing is implemented, they will be filtered out downstream at the appropriate step.

Will put this explanation as a comment but not at this specific line.

load-testing/tests/relays_stress_helpers_test.go Outdated Show resolved Hide resolved
load-testing/tests/relays_stress_helpers_test.go Outdated Show resolved Hide resolved
load-testing/tests/relays_stress_helpers_test.go Outdated Show resolved Hide resolved
load-testing/tests/relays_stress_helpers_test.go Outdated Show resolved Hide resolved
pkg/relayer/session/proof.go Show resolved Hide resolved
@red-0ne red-0ne changed the base branch from test/gherkin-load to main May 16, 2024 08:11
Copy link
Contributor

@bryanchriswhite bryanchriswhite left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Preemptively approving; two last questions/suggustions but otherwise this LGTM! 💪

@@ -17,7 +17,7 @@ import (
// attempting to retrieve the supplier's on-chain record.
// This is useful for testing and development purposes, where the supplier
// may not be staked before the relay miner starts.
const supplierStakeWaitTime = 2
const supplierStakeWaitTime = 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you intend to leave this at 1?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

☝️

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, as long as our block time is 2s, other wise the RelayMiner will react too late to its Supplier being freshly staked, missing by the way some RelayRequests that'll fire ASAP.

pkg/relayer/session/session.go Outdated Show resolved Hide resolved
@bryanchriswhite
Copy link
Contributor

bryanchriswhite commented May 16, 2024

@red-0ne maybe run make trigger_ci one time as it looks like CI hasn't run since changing bases.

s.currentProofCount,
"unexpected claims and proofs count",
)
// TODO_TECHDEBT: The current counting mechanism for the expected claims and proofs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How much of a concern is this?

Can't tell if it's "hard to do" or a major issue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we don't have full control from within the test over how Gateways distribute their relays or info about which Supplier served a given request.

Although it's possible to do it with some modifications on the AppGateServer, results and failed proofs would anyway be visible on grafana, so I don't believe it to be a big concern.

Copy link
Member

@Olshansk Olshansk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few last-minute questions. Nothing blocking. Love the cleanup into goFoo functions.

Merge it in!

Copy link

The CI will now also run the e2e tests on devnet, which increases the time it takes to complete all CI checks. If you just created a pull request, you might need to push another commit to produce a container image DevNet can utilize to spin up infrastructure. You can use make trigger_ci to push an empty commit.

@github-actions github-actions bot added devnet push-image CI related - pushes images to ghcr.io labels May 16, 2024
@Olshansk
Copy link
Member

@red-0ne Everything is 🟢

@red-0ne red-0ne merged commit d543bb3 into main May 17, 2024
10 checks passed
@Olshansk Olshansk deleted the test/gherkin-load-with-batched-claims branch May 29, 2024 16:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
devnet devnet-test-e2e loadtest Work related to load testing push-image CI related - pushes images to ghcr.io relayminer Changes related to the Relayminer
Projects
Status: ✅ Done
Development

Successfully merging this pull request may close these issues.

4 participants