
Conversation

@scharissis (Contributor) commented Feb 13, 2025

Description
This PR implements the pivot for NAT, refocusing it on being a test runner.

Key features:

  • Old acceptance tests removed
  • Old code for tests/suites/gates removed
  • Discovers tests (recursively) within a root directory
  • Tests can be an entire package
  • If a test doesn't have a package associated with it, NAT searches for it
  • Handles test skips
  • Nicer output table (handles skips, shows test duration, shows hierarchy)
  • A "validator config" YAML is used to describe the relationship between tests/suites/gates. The intention is for this file to also live in the optimism monorepo (location TBD)
  • It has a simple 'registry' which keeps track of the validator config; allows for programmatic/dynamic adjustments as well as validation
  • Allows for gate inheritance (eg. Gate B can inherit all of Gate A's contents and simply add/edit them)

Video summary:
https://www.loom.com/share/e64a3236ea874fbdafcac67d8233249b?sid=249b72ef-d9eb-4ad1-9e9f-60cb4d82cc0c

CLI interface:

go run cmd/main.go \
  --testdir ../../optimism/ \
  --gate betanet \
  --manifest alpaca.json \
  --validators example-validators.yaml

Example Validators YAML:

gates:
  - id: localnet
    description: "Localnet validation gate"
    suites:
      smoke:
        tests:
          - name: TestInteropSystemNoop
            package: github.com/ethereum-optimism/optimism/kurtosis-devnet/tests/interop
    tests:
      - name: TestFindRPCEndpoints
        package: github.com/ethereum-optimism/optimism/kurtosis-devnet/pkg/kurtosis/api/run


  - id: alphanet
    description: "Alphanet validation gate"
    inherits: ["localnet"]
    suites:
      devnet_system:
        description: "System tests"
        tests:
          - name: TestWallet
            package: github.com/ethereum-optimism/optimism/devnet-sdk/system
          - name: TestChainUser
            package: github.com/ethereum-optimism/optimism/devnet-sdk/system
      kurtosis:
        description: "Kurtosis tests"
        tests:
          - package: github.com/ethereum-optimism/optimism/kurtosis-devnet/pkg/kurtosis

  - id: betanet
    description: "Betanet validation gate"
    inherits: ["alphanet"]
    suites:
      interop:
        description: "Basic network functionality tests"
        tests:
          - name: TestInteropSystemNoop
            package: github.com/ethereum-optimism/optimism/kurtosis-devnet/tests/interop
          - name: FuzzDetectNonBijectivity
            package: github.com/ethereum-optimism/optimism/kurtosis-devnet/tests/interop

Metadata
Resolves #172

@scharissis scharissis added the A-acceptance-testing Area: Acceptance Testing label Feb 13, 2025
@scharissis scharissis requested review from jelias2 and sigma February 13, 2025 06:13
@scharissis scharissis self-assigned this Feb 13, 2025
@scharissis scharissis requested a review from a team as a code owner February 13, 2025 06:13
```yaml
# Example gate definition
- id: alphanet
  description: "Alphanet validation gate"
  inherits: ["localnet"]
```
@jelias2 (Contributor), Feb 13, 2025:

Does the inherits config have to be defined in the same yaml file at this point in time?

@scharissis (Author):

Great question. Yes. For now, we only support a single 'validation config' file.

@scharissis (Author):

Added this to an idea backlog.

```yaml
# Run test 'TestInteropSystem' in package 'github.com/ethereum-optimism/optimism/kurtosis-devnet/tests/interop'
- name: TestInteropSystem
```
Contributor:

What about a config example for running a single test file within a package? Is that possible at this time?

@scharissis (Author):

File-based isn't possible at this time. It's also unclear to me whether we want it.
For now we can express:

  • Single test
  • Entire package

Contributor:

I think down the line, expressing a specific file would be a nice-to-have. Let's add it to the NAT backlog.

Contributor:

It might be slightly tricky. I think we'd need to analyze the file to extract the list of individual test cases (a test file is not always self-contained, so we might need to load the entire module anyway).
Probably easier to start with specifying individual tests.

@scharissis (Author):

Yes, it's non-trivial; otherwise I'd have just added it.
I believe `go test` doesn't facilitate "run this file". IDE plugins, for example, handle this by inspecting the target file, extracting all the test names within, and then passing that list to `go test`, something like `go test -run ^(TestA|TestB|TestC) x/y/package`.
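
For reference, a minimal sketch of that IDE-style approach using the standard go/parser and go/ast packages (the file and package paths are illustrative, and this is not part of the PR):

```go
// A sketch only: extract top-level Test* functions from one file and
// build the -run pattern an IDE-style runner would pass to `go test`.
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strings"
)

func testNamesInFile(path string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, path, nil, 0)
	if err != nil {
		return nil, err
	}
	var names []string
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		// Top-level functions (no receiver) whose names start with "Test".
		if ok && fn.Recv == nil && strings.HasPrefix(fn.Name.Name, "Test") {
			names = append(names, fn.Name.Name)
		}
	}
	return names, nil
}

func main() {
	names, err := testNamesInFile("interop_test.go") // hypothetical file
	if err != nil {
		panic(err)
	}
	fmt.Printf("go test -run '^(%s)$' ./tests/interop\n", strings.Join(names, "|"))
}
```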

@scharissis (Author):

Added this to an idea backlog.

op-nat/config.go (outdated):

```go
TestDir         string
ValidatorConfig string
TargetGate      string
Wallets         []*wallet.Wallet
```
Contributor:

Are wallets still necessary in the new version? Shouldn't these be defined in the tests?

@scharissis (Author):

You're right. I'll try to remove these as part of this PR/change. Thanks.

@scharissis (Author):

Done. Removed in 29434e5.

op-nat/config.go (outdated):

```go
if validatorConfig == "" {
	return nil, errors.New("validator config path is required")
}
if gate == "" {
```
Contributor:

Does gate need to be required? What if the whole config file should be run?

@scharissis (Author):

It's a good question. I figured it'd be better if we have to specify a gate.
Let me know if you want to chat about it though!

Contributor:

Sure, feel free to leave the gate requirement for now; however, it would be nice to make NAT compatible with just running a single test. Thinking about a case where NAT is running on a network and debugging a single failure, it would be convenient for the developer to run a single test without the overhead of having to define a whole gate.

Can be backlogged

@scharissis (Author), Feb 18, 2025:

I considered that but decided the overhead is trivial. Do you disagree?

They'd add this to fix/debug a single test (say, 'TestFindRPCEndpoints'):

gates:
  - id: debug
    tests:
      - name: TestFindRPCEndpoints
        package: github.com/ethereum-optimism/optimism/kurtosis-devnet/pkg/kurtosis/api/run

Then run NAT/the test runner with `--gate debug`.

Contributor:

Good with me

@scharissis scharissis added this to the [nat] Test Runner Pivot milestone Feb 17, 2025
@scharissis scharissis requested a review from jelias2 February 18, 2025 02:06
op-nat/config.go (outdated):

```go
// Network config
SC         SuperchainManifest
Validators []Validator

Wallets []*wallet.Wallet

// Networks
SC  SuperchainManifest
L1  *network.Network
L2A *network.Network
```
Contributor:

I also believe network objects will be defined in the tests, so these can be removed as well.

@scharissis (Author), Feb 19, 2025:

Absolutely. Removed in d808c28.
Thanks!


```yaml
devnet_system:
  description: "System tests"
  tests:
    - name: TestWallet
      package: github.com/ethereum-optimism/optimism/devnet-sdk/system
    - name: TestChainUser
      package: github.com/ethereum-optimism/optimism/devnet-sdk/system
```
Contributor:

Just curious, can suites be defined outside of a gate?

E.g. to compose gates from multiple suites, without needing to inherit a sub-gate?

@scharissis (Author):

Not as-is. Do you think we'll need this?
My reasoning for not needing this is that if we don't "force" suites to be part of gates, then they could become stale/unused over time.

e.g.:

- suite1: [test1]
- suite2: [test2]
- suite3: [test3]

- gate1: [suite2, suite3]

In this example suite1 and test1 are unused (and thus useless?), and we may not notice among all the config.

@jelias2 (Contributor), Feb 19, 2025:

Yeah, I get that; I feel as though for composability purposes it may be worth it.

- suite1: [test1]
- suite2: [test2]
- suite3: [test3]

- gate1: [suite2, suite3]
- gate2: [suite1, suite3]

It would avoid suite3 being defined multiple times, especially as the number of gates and tests grows.

Thinking along similar lines to Ansible, with tasks, plays, and playbooks.

Curious what @sigma thinks.

@scharissis (Author):

Oh! I slightly misunderstood you earlier.

Again, this isn't possible as-is. But we could make it so.

I believe this could be a great improvement, iff we don't end up using gates such that each is a superset of the previous.
But I'm not sure how the gates will be used. My guess right now is that they'll look like this:

alphanet: [stuff]
betanet: [alphanet, more-stuff-only-for-betanet]
mainnet: [betanet, more-stuff-only-for-mainnet]

In other words, each "stricter" gate MUST include all previous gates' tests/criteria.

It's also worth noting that defining suites/tests outside of gates, for more modularity, would almost definitely be better in the general case of a test-runner.

So this really should be thought about more. The only question is, do we need to think about it in this PR/change? I would argue not.

Thoughts? @jelias2 & @sigma ?

```go
})

acceptancesTotal = promauto.NewCounterVec(prometheus.CounterOpts{
acceptanceResults = promauto.NewGaugeVec(prometheus.GaugeOpts{
```
Contributor:

For all of these metrics, what's the rationale for using gauges instead of counters? Unless the metric is expected to decrement, counters are typically used.

I'm referring back to the convo regarding modeling metrics after an HTTP server. Most of those are counters.

@scharissis (Author), Feb 19, 2025:

I keep going back and forth on this myself. We can always reassess when we're actually consuming them.
My thinking is essentially:

  • 'totals' should be Counters (the number of tests only increases)
  • 'results' perhaps should be Gauges (the result of each test goes "up & down" between 0 and 1 [fail, pass]); see the sketch below
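
A minimal sketch of that split, with illustrative metric and label names (not necessarily the PR's actual ones):

```go
package metrics

import (
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

var (
	// 'totals' as a Counter: it only ever increases.
	testsTotal = promauto.NewCounterVec(prometheus.CounterOpts{
		Namespace: "nat",
		Name:      "tests_total",
		Help:      "Total number of test executions",
	}, []string{"network_name", "run_id"})

	// 'results' as a Gauge: the latest value flips between 0 and 1.
	testResult = promauto.NewGaugeVec(prometheus.GaugeOpts{
		Namespace: "nat",
		Name:      "test_result",
		Help:      "Latest result of a test (1 = pass, 0 = fail)",
	}, []string{"network_name", "run_id", "test"})
)

// RecordTest bumps the total and sets the per-test pass/fail gauge.
func RecordTest(network, runID, test string, passed bool) {
	testsTotal.WithLabelValues(network, runID).Inc()
	result := 0.0
	if passed {
		result = 1.0
	}
	testResult.WithLabelValues(network, runID, test).Set(result)
}
```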

@scharissis (Author):

Created #194 to address this properly.
Thanks!

```go
acceptanceTestFailed = promauto.NewGaugeVec(prometheus.GaugeOpts{
	Namespace: MetricsNamespace,
	Name:      "acceptance_test_failed",
	Help:      "Number of failed acceptance tests",
}, []string{
	"network_name",
	"run_id",
})
```
Contributor:

Seems to be missing an "acceptance test skipped" metric in this file as well.

Contributor:

It would also be nice to emit metrics with labels such as gate=<>, suite=<>, test=<>, to allow for grouping tests together along with the run_id.

Can be backlogged if you like.

@scharissis (Author):

On skipped validator metrics: created Issue #195.

On more labels/dimensions: created Issue #196.

```go
acceptanceResults.WithLabelValues(network, runID, result).Set(1)
acceptanceTestTotal.WithLabelValues(network, runID).Set(float64(total))
acceptanceTestPassed.WithLabelValues(network, runID).Set(float64(passed))
acceptanceTestFailed.WithLabelValues(network, runID).Set(float64(failed))
acceptanceTestDuration.WithLabelValues(network, runID).Set(duration.Seconds())
```
Contributor:

Hmm, to allow for finer metrics: I think total failures and total acceptances should come from aggregating individual test results in the Grafana backend, rather than aggregating them in the service.

The query in Grafana could look like:

sum by (runID) (acceptance_test_result{result="passed"})
sum by (runID) (acceptance_test_result{result="failed"})

Contributor:

I think the metrics need a little tweaking, but this can be backlogged; once it's pivoted fully to a test runner, rework these.

@scharissis (Author), Feb 20, 2025:

Let's backlog and reconsider how we approach it.

I was treating it like http_requests_total in the Prometheus docs here: https://prometheus.io/docs/practices/naming/#labels

Created the following issue to address it: #197

```go
ctx      context.Context
config   *Config
version  string
registry *registry.Registry
runner   runner.TestRunner
result   *runner.RunnerResult
```
Contributor:

Should the logger be a part of the nat object instead of the config object?

```go
// TODO: This shouldn't be here; needs a refactor
// TODO: don't hardcode the network name
metrics.RecordAcceptance("todo", runID, overallResult.String())
metrics.RecordAcceptance(
```
Contributor:

Metrics should be emitted as each test case completes, instead of when the suite ends, to allow for viewing test progression. This may already be the case; I will double-check.

@scharissis (Author):

Agreed. I think that's what we have. But there's a different metric (and helper function) for each.
RecordValidation is used for tests and RecordAcceptance for overall run pass/fail.

```go
r.mu.Lock()
defer r.mu.Unlock()
```
Contributor:

Just curious what the purpose of the mutex is?

Will keep reviewing... maybe it becomes apparent

@scharissis (Author):

Not strictly needed; this is possibly premature optimisation.

It guards the registry.validators slice in case of concurrent access, for example multiple calls to loadValidators. However, as-is, I don't think that'll happen in practice.
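
For illustration, the pattern being described is roughly this (type and field names are stand-ins, not the PR's exact code):

```go
package registry

import "sync"

// Validator is a stand-in for the real type; illustrative only.
type Validator struct{ Name string }

// Registry guards its validators slice so that concurrent
// loadValidators calls cannot race on the append.
type Registry struct {
	mu         sync.Mutex
	validators []Validator
}

func (r *Registry) loadValidators(vs ...Validator) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.validators = append(r.validators, vs...)
}
```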


```go
// Check for circular inheritance before resolving
for _, gate := range config.Gates {
	if err := r.checkCircularInheritance(gate.ID, gate.Inherits, gateMap, make(map[string]bool)); err != nil {
```
Contributor:

Nice forward thinking

@scharissis (Author):

Thanks. And if we add the 'modular suite definitions' idea you mentioned, we should probably add a similar check for unused & duplicate validator definitions.
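
For context, the check quoted above amounts to a depth-first walk over the inheritance edges; a self-contained sketch (signatures illustrative, not the PR's exact code):

```go
package registry

import "fmt"

// Gate is a stand-in for the config type; illustrative only.
type Gate struct {
	ID       string
	Inherits []string
}

// checkCircularInheritance walks inherits edges depth-first and fails
// if a gate already on the current path is visited again.
func checkCircularInheritance(id string, inherits []string, gates map[string]Gate, onPath map[string]bool) error {
	if onPath[id] {
		return fmt.Errorf("circular inheritance detected at gate %q", id)
	}
	onPath[id] = true
	defer delete(onPath, id) // backtrack: only the current path matters
	for _, parent := range inherits {
		p, ok := gates[parent]
		if !ok {
			return fmt.Errorf("gate %q inherits unknown gate %q", id, parent)
		}
		if err := checkCircularInheritance(p.ID, p.Inherits, gates, onPath); err != nil {
			return err
		}
	}
	return nil
}
```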

### Adding a new test/suite/gate
All tests, suites, and gates are defined in a `validators.yaml` file. The filename is not important.
Contributor:

Is there a (current) requirement for the location of these files?

@scharissis (Author):

There is not. Only that there's just one of them.


op-nat/config.go (outdated):

```go
	log.Warn("error creating wallet: %w", err)
}
wallets = append(wallets, w)
// Parse kurtosis-devnet manifest
```
Contributor:

nit: it's meant to not be kurtosis-specific :)

@scharissis (Author):

Definitely. I've since removed this actually. The assumption now is that the tests must use devnet-sdk and it will handle the manifest-parsing.

```
# test: (go_test "./...")
test:
    go build ./... && go test -v ./...
    CGO_ENABLED=0 go build ./... && go test -count=1 -v ./...
```
Contributor:

I'm curious: why disable the test cache?

@scharissis (Author):

Paranoia :)
I do plan to remove this.

op-nat/nat.go (outdated):

```go
overallErr = errors.Join(overallErr)
if res.Result == ResultFailed {
	overallResult = ResultFailed
t.SetTitle(fmt.Sprintf("NAT Results (%s)", formatDuration(n.result.Duration)))
```
Contributor:

It would be nice to separate the raw data from its formatting (I can guarantee we'll want a JSON output in 3... 2... 1... :))

@scharissis (Author):

Isn't it already though? This function, printResultsTable, simply iterates over nat.result (the data). We could write a similar function printJSON by iterating over the same results data.
Or am I missing something?

```go
)

// TestResult captures the outcome of a single test run
type TestResult struct {
```
Contributor:

Should we have some support for test artifacts here? Logs, for example.

@scharissis (Author), Feb 20, 2025:

Yes, but I was thinking of tackling this in a separate PR. May be non-trivial to parse it nicely.
#198 created to capture the work; please add any more ideas/context to that ticket.

Sound OK?

```go
// TestResult captures the outcome of a single test run
type TestResult struct {
	Metadata types.ValidatorMetadata
	Status   string
```
Contributor:

I'm a bit worried about leaving the status free-form. Maybe some enum would be appropriate

@scharissis (Author):

Agreed. Updated this to an enum (and moved it to the types package)
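
For illustration, the usual Go shape for such an enum (value names here are assumptions, not necessarily what landed):

```go
package types

// TestStatus replaces the free-form status string; the concrete
// values below are illustrative.
type TestStatus string

const (
	TestStatusPass TestStatus = "pass"
	TestStatusFail TestStatus = "fail"
	TestStatusSkip TestStatus = "skip"
)
```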

```go
type TestResult struct {
	Metadata types.ValidatorMetadata
	Status   string
	Error    string
```
Contributor:

I'd use the error interface here. At least that'd leave some room for detecting error types

@scharissis (Author):

Agreed; updated.

```go
gateResult.Suites[suiteName] = suiteResult

// Run all tests in the suite
for _, validator := range suiteTests {
```
Contributor:

Not urgent at all, but it'd be nice to have some sort of job executor here, to abstract away the loop and make room for parallel execution, limiting the number of workers, and so on.
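
A minimal sketch of such an executor using only the standard library (the job type and the worker limit are illustrative):

```go
package runner

import "sync"

// runAll executes jobs with at most maxWorkers running concurrently,
// using a buffered channel as a semaphore to bound parallelism.
func runAll(jobs []func(), maxWorkers int) {
	var wg sync.WaitGroup
	sem := make(chan struct{}, maxWorkers)
	for _, job := range jobs {
		wg.Add(1)
		sem <- struct{}{} // acquire a worker slot
		go func(run func()) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			run()
		}(job)
	}
	wg.Wait()
}
```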


```go
// listTestsInPackage returns all test names in a package
func (r *runner) listTestsInPackage(pkg string) ([]string, error) {
	listCmd := exec.Command("go", "test", pkg, "-list", "^Test")
```
Contributor:

Probably want to make the location of the "go" binary configurable from main. Someone somewhere (probably a fellow Nix user) won't have it in the PATH :)

@scharissis (Author):

Done; made it configurable (via a flag or envvar).
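
A sketch of the usual fallback chain for that kind of configuration (the flag and env var names here are illustrative, not necessarily the ones added):

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"os/exec"
)

// resolveGoBinary picks the go binary from the flag value, then an
// env var, then a PATH lookup. Names are illustrative.
func resolveGoBinary(flagValue string) (string, error) {
	if flagValue != "" {
		return flagValue, nil
	}
	if env := os.Getenv("NAT_GO_BINARY"); env != "" {
		return env, nil
	}
	return exec.LookPath("go")
}

func main() {
	goBin := flag.String("go-binary", "", "path to the go binary")
	flag.Parse()
	bin, err := resolveGoBinary(*goBin)
	if err != nil {
		panic(err)
	}
	fmt.Println("using go binary:", bin) // pass to exec.Command(bin, "test", ...)
}
```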

@jelias2 (Contributor) left a comment:

Overall, the main functionality for test-running logic is there, IMO. LG(reat)TM

```go
func (r *runner) runIndividualTest(pkg, testName string) (bool, string) {
	r.log.Debug("Running individual test", "testName", testName)

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
```
Contributor:

The 5-second timeout seems a bit short; if an L2 has 2s block times, that's only 2 blocks.

It would be cool if users could configure a timeout in the validator YAML.

@scharissis (Author):

Changed to 5 minutes.
A per-test timeout would indeed be good. I'll add that to the backlog. Parameterisation in general needs to be reconsidered with this pivot.
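
A sketch of what a per-test override could look like, assuming a hypothetical `timeout` field on the validator metadata (not something this PR defines):

```go
package runner

import (
	"context"
	"time"
)

const defaultTestTimeout = 5 * time.Minute

// ValidatorMetadata sketches a hypothetical Timeout field, e.g.
// parsed from `timeout: "90s"` in the validator YAML.
type ValidatorMetadata struct {
	Name    string
	Timeout time.Duration
}

// timeoutFor returns the per-test override when set, else the default.
func timeoutFor(m ValidatorMetadata) time.Duration {
	if m.Timeout > 0 {
		return m.Timeout
	}
	return defaultTestTimeout
}

// runWithTimeout bounds a single test run by its resolved timeout.
func runWithTimeout(parent context.Context, m ValidatorMetadata, run func(context.Context) error) error {
	ctx, cancel := context.WithTimeout(parent, timeoutFor(m))
	defer cancel()
	return run(ctx)
}
```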

```go
}

// buildTestArgs constructs the command line arguments for running a test
func (r *runner) buildTestArgs(metadata types.ValidatorMetadata) []string {
```
Contributor:

I thought the idea was to use `go test -json`?

@scharissis (Author), Feb 21, 2025:

It was/is. I forgot while implementing it. :/
I've added a backlog item to revisit this, if we don't mind.
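
For when that backlog item is picked up: a sketch of consuming `go test -json` output. The event shape is the documented test2json format; the wiring around it is illustrative:

```go
package runner

import (
	"encoding/json"
	"io"
	"time"
)

// TestEvent mirrors the JSON objects emitted by `go test -json`
// (see `go doc test2json` for the full description).
type TestEvent struct {
	Time    time.Time
	Action  string // "run", "output", "pass", "fail", "skip", ...
	Package string
	Test    string
	Elapsed float64 // seconds
	Output  string
}

// parseEvents decodes a stream of events, e.g. the stdout of
// exec.Command(goBin, "test", "-json", pkg).
func parseEvents(r io.Reader) ([]TestEvent, error) {
	dec := json.NewDecoder(r)
	var events []TestEvent
	for {
		var ev TestEvent
		if err := dec.Decode(&ev); err == io.EOF {
			break
		} else if err != nil {
			return nil, err
		}
		events = append(events, ev)
	}
	return events, nil
}
```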

@scharissis scharissis force-pushed the scharissis/test-discovery branch from 76a09e0 to 7e0d0ed Compare March 4, 2025 06:24
@scharissis scharissis merged commit fd01136 into main Mar 4, 2025
34 checks passed
@scharissis scharissis deleted the scharissis/test-discovery branch March 4, 2025 06:31
@scharissis scharissis mentioned this pull request Mar 11, 2025
raffaele-oplabs pushed a commit that referenced this pull request Aug 3, 2025
* Renamed to op-acceptor
* Focused on being a test runner