[NAT] Pivot to being a test runner #171
Conversation
    # Example gate definition
    - id: alphanet
      description: "Alphanet validation gate"
      inherits: ["localnet"]
Does the inherits config have to be defined in the same yaml file at this point in time?
Great question. Yes. For now, we only support a single 'validation config' file.
Added this to an idea backlog.
    # Run test 'TestInteropSystem' in package 'github.com/ethereum-optimism/optimism/kurtosis-devnet/tests/interop'
    - name: TestInteropSystem
What about a config example for running a single test file within a package? Is that possible at this time?
File-based isn't possible at this time. It's also unclear to me whether we want it.
For now we can express:
- Single test
- Entire package
I think down the line, expressing a specific file would be a nice-to-have. Let's add it to the NAT backlog.
It might be slightly tricky. I think we'd need to analyze the file to extract the list of individual test cases (a test file is not always self-contained, so we might need to load the entire module anyway).
Probably easier to start with specifying individual tests.
Yes, it's non-trivial, otherwise I'd have just added it.
I believe `go test` doesn't facilitate "run this file". IDE plugins, for example, handle this by inspecting the target file, extracting all the test names within, and then passing that list to `go test` — something like `go test -run ^(TestA|TestB|TestC) x/y/package`.
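For reference, a rough sketch of that extraction step, assuming we parsed a single test file to build the `-run` expression; this is not part of the PR, and the program name/usage is hypothetical:

```go
// Hypothetical sketch: extract Test* function names from a single _test.go
// file with the standard library parser, then print a -run expression.
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"os"
	"strings"
)

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: listtests <file_test.go>")
		os.Exit(1)
	}
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, os.Args[1], nil, 0)
	if err != nil {
		panic(err)
	}
	var names []string
	for _, decl := range f.Decls {
		// Top-level functions named Test* (no receiver) are candidate tests.
		if fn, ok := decl.(*ast.FuncDecl); ok && fn.Recv == nil && strings.HasPrefix(fn.Name.Name, "Test") {
			names = append(names, fn.Name.Name)
		}
	}
	// Feed this into: go test -run '^(TestA|TestB|TestC)$' x/y/package
	fmt.Printf("^(%s)$\n", strings.Join(names, "|"))
}
```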
Added this to an idea backlog.
op-nat/config.go
Outdated
    TestDir         string
    ValidatorConfig string
    TargetGate      string
    Wallets         []*wallet.Wallet
Are wallets still necessary in the new version? Shouldn't these be defined in the tests?
You're right. I'll try to remove these as part of this PR/change. Thanks.
Done. Removed in 29434e5.
op-nat/config.go
Outdated
    if validatorConfig == "" {
        return nil, errors.New("validator config path is required")
    }
    if gate == "" {
Does gate need to be required? What if the whole config file should be run?
It's a good question. I figured it'd be better if we have to specify a gate.
Let me know if you want to chat about it though!
Sure, feel free to leave a gate for now; however, it would be nice to make NAT compatible with just running a single test. Thinking about a case where NAT is running on a network and debugging a single failure, it would be convenient for the developer to run a single test without the overhead of having to define a whole gate.
Can be backlogged.
I considered that but decided the overhead is trivial. Do you disagree?
They'd add this to fix/debug a single test (say, 'TestFindRPCEndpoints'):

    gates:
      - id: debug
        tests:
          - name: TestFindRPCEndpoints
            package: github.com/ethereum-optimism/optimism/kurtosis-devnet/pkg/kurtosis/api/run

Then run NAT/test-runner with `-gate debug`.
Good with me
op-nat/config.go
Outdated
    // Network config
    SC SuperchainManifest
    Validators []Validator

    Wallets []*wallet.Wallet

    // Networks
    SC SuperchainManifest
    L1  *network.Network
    L2A *network.Network
I also believe network objects will be defined in the tests, so these can be removed as well.
Absolutely. Removed in d808c28.
Thanks!
    devnet_system:
      description: "System tests"
      tests:
        - name: TestWallet
          package: github.com/ethereum-optimism/optimism/devnet-sdk/system
        - name: TestChainUser
          package: github.com/ethereum-optimism/optimism/devnet-sdk/system
Just curious, can suites be defined outside of a gate?
E.g., to compose gates from multiple suites, without needing to inherit a sub-gate?
Not as-is. Do you think we'll need this?
My reasoning for not needing it is that if we don't "force" suites to be part of gates, they could become stale/unused over time.
E.g.:
- suite1: [test1]
- suite2: [test2]
- suite3: [test3]
- gate1: [suite2, suite3]
In this example suite1 and test1 are unused (and thus useless?), and we may not notice among all the config.
Yeah, I get that. I feel as though for composability purposes it may be worth it:
- suite1: [test1]
- suite2: [test2]
- suite3: [test3]
- gate1: [suite2, suite3]
- gate2: [suite1, suite3]
It would avoid suite3 being defined multiple times, especially as the number of gates and tests grows.
Thinking of something similar to Ansible with tasks, plays, and playbooks.
Curious what @sigma thinks.
Oh! I slightly misunderstood you earlier.
Again, this isn't possible as-is, but we could make it so.
I believe this could be a great improvement, iff we don't end up using gates such that each is a superset of the previous.
I'm not sure how the gates will be used right now. My guess is that they'll look like this:
- alphanet: [stuff]
- betanet: [alphanet, more-stuff-only-for-betanet]
- mainnet: [betanet, more-stuff-only-for-mainnet]
In other words, each "stricter" gate MUST include all previous gates' tests/criteria.
It's also worth noting that defining suites/tests outside of gates, for more modularity, would almost certainly be better in the general case of a test runner.
So this really should be thought about more. The only question is: do we need to think about it in this PR/change? I would argue not.
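For concreteness, a hypothetical sketch of what top-level, reusable suite definitions could look like if we did go that way (keys, test names, and packages are illustrative; this is not supported as-is):

```yaml
# Hypothetical, NOT supported as-is: suites defined once at the top level,
# then referenced from any number of gates.
suites:
  suite1:
    tests:
      - name: TestOne
        package: example.com/pkg/one
  suite3:
    tests:
      - name: TestThree
        package: example.com/pkg/three

gates:
  - id: gate1
    suites: [suite2, suite3]
  - id: gate2
    suites: [suite1, suite3]
```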
    })

    acceptancesTotal = promauto.NewCounterVec(prometheus.CounterOpts{
    acceptanceResults = promauto.NewGaugeVec(prometheus.GaugeOpts{
For all of these metrics, what's the rationale for using Gauges instead of Counters? Unless the metric is expected to decrement, counters are typically used.
I'm referring back to the conversation about modeling metrics after an HTTP server; most of those are counters.
I keep going back and forth on this myself. We can always reassess when we're actually consuming them.
My thinking is essentially (sketched below):
- 'totals' should be Counters (the number of tests only increases)
- 'results' perhaps should be Gauges (the result of each test goes "up & down" between 0 and 1 [fail, pass])
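A minimal sketch of that split, using hypothetical metric names and a hypothetical helper rather than the PR's actual ones:

```go
package metrics

import (
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

var (
	// 'totals' only ever increase, so a Counter fits.
	testsTotal = promauto.NewCounterVec(prometheus.CounterOpts{
		Namespace: "nat",
		Name:      "tests_total",
		Help:      "Total number of test executions",
	}, []string{"network_name", "run_id"})

	// 'results' move between 0 (fail) and 1 (pass), so a Gauge fits.
	testResult = promauto.NewGaugeVec(prometheus.GaugeOpts{
		Namespace: "nat",
		Name:      "test_result",
		Help:      "Latest result of an individual test",
	}, []string{"network_name", "run_id", "test"})
)

// RecordTest is a hypothetical helper, not the PR's API.
func RecordTest(network, runID, test string, passed bool) {
	testsTotal.WithLabelValues(network, runID).Inc()
	v := 0.0
	if passed {
		v = 1.0
	}
	testResult.WithLabelValues(network, runID, test).Set(v)
}
```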
Created #194 to address this properly.
Thanks!
op-nat/metrics/metrics.go
Outdated
    acceptanceTestFailed = promauto.NewGaugeVec(prometheus.GaugeOpts{
        Namespace: MetricsNamespace,
        Name:      "acceptance_test_failed",
        Help:      "Number of failed acceptance tests",
    }, []string{
        "network_name",
        "run_id",
    })
Seems to be missing an 'acceptance test skipped' metric in this file as well.
It would also be nice to emit metrics with labels such as gate=<>, suite=<>, test=<>, to allow for grouping tests together along with the run_id.
Can be backlogged if you like.
On skipped validator metrics: created #195.
On more labels/dimensions: created #196.
    acceptanceResults.WithLabelValues(network, runID, result).Set(1)
    acceptanceTestTotal.WithLabelValues(network, runID).Set(float64(total))
    acceptanceTestPassed.WithLabelValues(network, runID).Set(float64(passed))
    acceptanceTestFailed.WithLabelValues(network, runID).Set(float64(failed))
    acceptanceTestDuration.WithLabelValues(network, runID).Set(duration.Seconds())
Hmm, to allow for finer-grained metrics, I think total failures and total acceptances should come from aggregating individual test results in the Grafana backend, rather than aggregating them in the service.
The query in Grafana could look like:

    sum by (runID) acceptance_test_result{result="passed"}
    sum by (runID) acceptance_test_result{result="failed"}
I think the metrics need a little tweaking, but this can be backlogged; once it's fully pivoted to a test runner, rework these.
Let's backlog and reconsider how we approach it.
I was treating it like http_requests_total in the prometheus docs here: https://prometheus.io/docs/practices/naming/#labels
Created the following issue to address it: #197
    ctx      context.Context
    config   *Config
    version  string
    registry *registry.Registry
    runner   runner.TestRunner
    result   *runner.RunnerResult
Should the logger be a part of the nat object instead of the config object?
    // TODO: This shouldn't be here; needs a refactor
    // TODO: don't hardcode the network name
    metrics.RecordAcceptance("todo", runID, overallResult.String())
    metrics.RecordAcceptance(
Metrics should be emitted as each test case completes, instead of when the suite ends, to allow for viewing test progression. This may already be the case; I will double check.
Agreed. I think that's what we have, but there's a different metric (and helper function) for each:
RecordValidation is used for tests and RecordAcceptance for the overall run pass/fail.
    r.mu.Lock()
    defer r.mu.Unlock()
Just curious what the purpose of the mutex is?
Will keep reviewing... maybe it becomes apparent
Not strictly needed; this is possibly premature optimisation.
It guards the registry.validators slice in case of concurrent access, for example multiple calls to loadValidators. However, as-is, I don't think that'll happen in practice.
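For illustration, the guarded pattern looks roughly like this; the Registry type and its fields here are stand-ins, not the PR's actual code:

```go
package registry

import "sync"

// ValidatorMetadata stands in for the real type; this is just a sketch.
type ValidatorMetadata struct{ Name string }

// Registry serializes access to its validators slice with a mutex.
type Registry struct {
	mu         sync.Mutex
	validators []ValidatorMetadata
}

// loadValidators can then be called concurrently without racing on the slice.
func (r *Registry) loadValidators(vs []ValidatorMetadata) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.validators = append(r.validators, vs...)
}
```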
    // Check for circular inheritance before resolving
    for _, gate := range config.Gates {
        if err := r.checkCircularInheritance(gate.ID, gate.Inherits, gateMap, make(map[string]bool)); err != nil {
Nice forward thinking
Thanks. And if we add the 'modular suite definitions' idea you mentioned, we should probably add a similar check for unused and duplicate validator definitions.
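Roughly what that check does, as a self-contained sketch; the real call in the diff takes gateMap and a visited set, but the names and shapes here are simplified assumptions:

```go
package main

import "fmt"

// checkCircularInheritance walks the inherits chain depth-first and errors
// out if a gate is revisited along the current path. Simplified sketch only.
func checkCircularInheritance(id string, inherits []string, gates map[string][]string, visiting map[string]bool) error {
	if visiting[id] {
		return fmt.Errorf("circular inheritance detected at gate %q", id)
	}
	visiting[id] = true
	defer delete(visiting, id) // backtrack so siblings may share ancestors
	for _, parent := range inherits {
		if err := checkCircularInheritance(parent, gates[parent], gates, visiting); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	gates := map[string][]string{
		"localnet": nil,
		"alphanet": {"localnet"},
		"betanet":  {"alphanet", "betanet"}, // self-cycle for demonstration
	}
	for id, inherits := range gates {
		if err := checkCircularInheritance(id, inherits, gates, map[string]bool{}); err != nil {
			fmt.Println(err)
		}
	}
}
```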
    },
    }
    ### Adding a new test/suite/gate
    All tests, suites, and gates are defined in a `validators.yaml` file. The filename is not important.
Is there a (current) requirement for the location of these files?
There is not. Only that there's just one of them.
op-nat/config.go
Outdated
    log.Warn("error creating wallet: %w", err)
    }
    wallets = append(wallets, w)
    // Parse kurtosis-devnet manifest
nit: it's meant to not be kurtosis-specific :)
Definitely. I've since removed this actually. The assumption now is that the tests must use devnet-sdk and it will handle the manifest-parsing.
    # test: (go_test "./...")
    test:
        go build ./... && go test -v ./...
        CGO_ENABLED=0 go build ./... && go test -count=1 -v ./...
I'm curious: why disable the test cache?
Paranoia :)
I do plan to remove this.
op-nat/nat.go
Outdated
    overallErr = errors.Join(overallErr)
    if res.Result == ResultFailed {
        overallResult = ResultFailed
    t.SetTitle(fmt.Sprintf("NAT Results (%s)", formatDuration(n.result.Duration)))
It would be nice to separate the raw data from its formatting (I can guarantee we'll want a JSON output in 3... 2... 1... :))
Isn't it already, though? This function, printResultsTable, simply iterates over nat.result (the data). We could write a similar function, printJSON, that iterates over the same results data.
Or am I missing something?
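Something like this, as a sketch; the RunnerResult shape here is invented for illustration and is not the PR's actual type:

```go
package main

import (
	"encoding/json"
	"os"
	"time"
)

// RunnerResult is a stand-in for the real results type.
type RunnerResult struct {
	Duration time.Duration     `json:"duration"`
	Results  map[string]string `json:"results"`
}

// printJSON walks the same data printResultsTable would, but emits JSON.
func printJSON(result *RunnerResult) error {
	enc := json.NewEncoder(os.Stdout)
	enc.SetIndent("", "  ")
	return enc.Encode(result)
}

func main() {
	_ = printJSON(&RunnerResult{
		Duration: 3 * time.Second,
		Results:  map[string]string{"TestWallet": "pass"},
	})
}
```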
op-nat/runner/runner.go
Outdated
    )

    // TestResult captures the outcome of a single test run
    type TestResult struct {
Should we have some support for test artifacts here? Logs, for example.
Yes, but I was thinking of tackling this in a separate PR. May be non-trivial to parse it nicely.
#198 created to capture the work; please add any more ideas/context to that ticket.
Sound OK?
op-nat/runner/runner.go
Outdated
    // TestResult captures the outcome of a single test run
    type TestResult struct {
        Metadata types.ValidatorMetadata
        Status   string
I'm a bit worried about leaving the status free-form. Maybe some enum would be appropriate
Agreed. Updated this to an enum (and moved it to the types package)
op-nat/runner/runner.go
Outdated
    type TestResult struct {
        Metadata types.ValidatorMetadata
        Status   string
        Error    string
I'd use the error interface here. At least that'd leave some room for detecting error types
Agreed; updated.
    gateResult.Suites[suiteName] = suiteResult

    // Run all tests in the suite
    for _, validator := range suiteTests {
Not urgent at all, but it'd be nice to have some sort of job executor here, to abstract away the loop and make room for parallel execution, limiting the number of workers, and so on.
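A small sketch of what such an executor could look like, using only the standard library; the job type and names here are placeholders, not the PR's suite/validator types:

```go
package main

import (
	"fmt"
	"sync"
)

// runAll executes jobs with at most maxWorkers running concurrently.
// This stands in for the per-suite loop mentioned above.
func runAll(jobs []string, maxWorkers int, run func(string) error) []error {
	sem := make(chan struct{}, maxWorkers)
	errs := make([]error, len(jobs))
	var wg sync.WaitGroup
	for i, job := range jobs {
		wg.Add(1)
		sem <- struct{}{} // acquire a worker slot
		go func(i int, job string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			errs[i] = run(job)
		}(i, job)
	}
	wg.Wait()
	return errs
}

func main() {
	errs := runAll([]string{"TestA", "TestB", "TestC"}, 2, func(name string) error {
		fmt.Println("running", name)
		return nil
	})
	fmt.Println(errs)
}
```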
op-nat/runner/runner.go
Outdated
    // listTestsInPackage returns all test names in a package
    func (r *runner) listTestsInPackage(pkg string) ([]string, error) {
        listCmd := exec.Command("go", "test", pkg, "-list", "^Test")
Probably want to make the location of the "go" binary configurable from main. Someone somewhere (probably a fellow Nix user) won't have it in their PATH :)
Done; made it configurable (via a flag or envvar).
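For illustration, the lookup could be as simple as the following; the flag and env var names here are assumptions, not necessarily what landed in the PR:

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"os/exec"
)

func main() {
	// Assumed names: a --go-binary flag with a GO_BINARY env fallback.
	goBin := flag.String("go-binary", "", "path to the go binary")
	flag.Parse()

	bin := *goBin
	if bin == "" {
		bin = os.Getenv("GO_BINARY")
	}
	if bin == "" {
		bin = "go" // fall back to PATH lookup
	}

	out, err := exec.Command(bin, "test", "./...", "-list", "^Test").CombinedOutput()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
	fmt.Print(string(out))
}
```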
jelias2 left a comment:
Overall, the main functionality for the test-running logic is there, IMO. LG(reat)TM
op-nat/runner/runner.go
Outdated
    func (r *runner) runIndividualTest(pkg, testName string) (bool, string) {
        r.log.Debug("Running individual test", "testName", testName)

        ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
A 5-second timeout seems a bit short; if an L2 has 2s block times, that's only a couple of blocks.
It would be cool if users could configure a timeout in the validator.yaml.
Changed to 5 minutes.
A per-test timeout would indeed be good. I'll add that to the backlog. Parameterisation in general needs to be reconsidered with this pivot.
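A sketch of what the per-test timeout shape could look like once it's configurable; the function, parameter, and default here are assumptions, not the PR's exact code:

```go
package runner

import (
	"context"
	"os/exec"
	"time"
)

// runWithTimeout is a sketch: a per-test timeout, defaulting to 5 minutes,
// that could later be driven by a value from validators.yaml.
func runWithTimeout(pkg, testName string, timeout time.Duration) ([]byte, error) {
	if timeout == 0 {
		timeout = 5 * time.Minute
	}
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	cmd := exec.CommandContext(ctx, "go", "test", pkg, "-run", "^"+testName+"$", "-count=1", "-v")
	return cmd.CombinedOutput()
}
```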
    }

    // buildTestArgs constructs the command line arguments for running a test
    func (r *runner) buildTestArgs(metadata types.ValidatorMetadata) []string {
I thought the idea was to use go test -json?
It was/is. I forgot while implementing it. :/
I've added a backlog item to revisit this, if we don't mind.
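For when that backlog item gets picked up: `go test -json` emits a stream of events in the format documented for `go tool test2json`. A sketch of consuming it (the summarize helper is illustrative, not part of the PR):

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"os"
)

// TestEvent mirrors the event structure produced by `go test -json`.
type TestEvent struct {
	Action  string  `json:"Action"`
	Package string  `json:"Package"`
	Test    string  `json:"Test"`
	Elapsed float64 `json:"Elapsed"`
	Output  string  `json:"Output"`
}

// summarize counts per-test pass/fail events from a -json stream.
func summarize(r io.Reader) (passed, failed int, err error) {
	dec := json.NewDecoder(r)
	for {
		var ev TestEvent
		if err := dec.Decode(&ev); err == io.EOF {
			return passed, failed, nil
		} else if err != nil {
			return passed, failed, err
		}
		if ev.Test == "" {
			continue // package-level event
		}
		switch ev.Action {
		case "pass":
			passed++
		case "fail":
			failed++
		}
	}
}

func main() {
	// e.g. pipe in: go test -json ./... | thisprogram
	p, f, err := summarize(os.Stdin)
	fmt.Println("passed:", p, "failed:", f, "err:", err)
}
```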
* Renamed to op-acceptor
* Focused on being a test runner
Description
This PR implements the pivot for NAT, making it more focused on being a test runner.
Key features:
Video summary:
https://www.loom.com/share/e64a3236ea874fbdafcac67d8233249b?sid=249b72ef-d9eb-4ad1-9e9f-60cb4d82cc0c
CLI interface:
Example Validators YAML:
Metadata
Resolves #172