core/vm: optimize push2 opcode #31267
Conversation
Back-of-the-envelope calculation: this will save us between 0.2 and 0.5 ms per block, which is quite significant for such a small change.
core/vm/instructions.go (outdated)

```
		integer = new(uint256.Int)
	)
	if *pc+2 < codeLen {
		scope.Stack.push(integer.SetUint64(uint64(binary.BigEndian.Uint16(scope.Contract.Code[*pc+1 : *pc+3]))))
```
I'm pretty certain there is a way for uint256 to parse the bytes from scope.Contract.Code which does not go via binary.BigEndian. (If there isn't, I feel like there should be)
Ah, well there's `func (z *Int) SetBytes2(in []byte) *Int`, but performance-wise it's about the same as what you do here. It would maybe look a little bit cleaner:

```
scope.Stack.push(integer.SetBytes2(scope.Contract.Code[*pc+1:]))
```
Applied, yep, that's a cleaner way.
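For illustration, a minimal standalone sketch (assuming only the github.com/holiman/uint256 module used by geth; the PUSH2 bytes are made up) comparing the two ways of decoding the 2-byte immediate discussed above:

```
package main

import (
	"encoding/binary"
	"fmt"

	"github.com/holiman/uint256"
)

func main() {
	code := []byte{0x61, 0x12, 0x34} // PUSH2 0x1234
	pc := uint64(0)

	// Original approach: decode the immediate with encoding/binary, then
	// load the resulting uint16 into a uint256.
	a := new(uint256.Int).SetUint64(uint64(binary.BigEndian.Uint16(code[pc+1 : pc+3])))

	// Suggested approach: let uint256 read the two bytes directly.
	b := new(uint256.Int).SetBytes2(code[pc+1 : pc+3])

	fmt.Println(a.Eq(b), a.Hex()) // true 0x1234
}
```

Both forms produce the same stack value; the SetBytes2 variant simply avoids spelling out the intermediate uint16 conversion.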
holiman
left a comment
LGTM
Did another benchmark, a full block of PUSH2 POP instructions will go from
s1na
left a comment
LGTM!
It's pretty unfortunate that we have to resort to workarounds like this because the Go compiler is too restrictive to allow disabling the inliner for certain functions.
fjl
left a comment
Please clean up the tests
During my benchmarks on Holesky, around 10% of all CPU time was spent in PUSH2:
```
ROUTINE ======================== github.com/ethereum/go-ethereum/core/vm.newFrontierInstructionSet.makePush.func1 in github.com/ethereum/go-ethereum/core/vm/instructions.go
16.38s 20.35s (flat, cum) 10.31% of Total
740ms 740ms 976: return func(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
. . 977: var (
40ms 40ms 978: codeLen = len(scope.Contract.Code)
970ms 970ms 979: start = min(codeLen, int(*pc+1))
200ms 200ms 980: end = min(codeLen, start+pushByteSize)
. . 981: )
670ms 2.39s 982: a := new(uint256.Int).SetBytes(scope.Contract.Code[start:end])
. . 983:
. . 984: // Missing bytes: pushByteSize - len(pushData)
410ms 410ms 985: if missing := pushByteSize - (end - start); missing > 0 {
. . 986: a.Lsh(a, uint(8*missing))
. . 987: }
12.69s 14.94s 988: scope.Stack.push2(*a)
10ms 10ms 989: *pc += size
650ms 650ms 990: return nil, nil
. . 991: }
. . 992:}
```
Which is quite crazy. We already have a handwritten implementation for PUSH1;
this PR adds one for PUSH2.
PUSH2 is the second most used opcode, as shown here:
https://gist.github.com/shemnon/fb9b292a103abb02d98d64df6fbd35c8, since
Solidity uses it quite heavily. It's used roughly 20 times as often as
PUSH20 and PUSH32.
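For reference, a dedicated PUSH2 handler in the style of the generic closure above could look roughly like the sketch below. It is written as if it sat in core/vm next to the types shown in the profile; the handling of truncated code is my assumption (mirroring the right zero-padding that makePush does via Lsh), so the merged implementation may differ in detail.

```
// Sketch of a handwritten PUSH2: push the two immediate bytes following the
// opcode; code that ends mid-immediate is zero-padded on the right, like the
// generic makePush.
func opPush2(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
	var (
		codeLen = uint64(len(scope.Contract.Code))
		integer = new(uint256.Int)
	)
	if *pc+2 < codeLen {
		// Both immediate bytes are available.
		scope.Stack.push(integer.SetBytes2(scope.Contract.Code[*pc+1 : *pc+3]))
	} else if *pc+1 < codeLen {
		// Only the high byte is available; the missing low byte counts as zero.
		scope.Stack.push(integer.SetUint64(uint64(scope.Contract.Code[*pc+1]) << 8))
	} else {
		// No immediate bytes left at all: push zero.
		scope.Stack.push(integer.Clear())
	}
	*pc += 2
	return nil, nil
}
```

The benchmark below compares this shape of handler ("push") against the generic makePush closure.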
# Benchmarks
```
BenchmarkPush/makePush-14 94196547 12.27 ns/op 0 B/op 0 allocs/op
BenchmarkPush/push-14 429976924 2.829 ns/op 0 B/op 0 allocs/op
```
This PR also adds a fuzzer to sanity-check the handwritten opcode.

---------

Co-authored-by: jwasinger <[email protected]>
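The fuzzer mentioned above could take the shape of a differential check between the generic and handwritten handlers, for example as sketched below (a hypothetical test inside core/vm, reusing the names from the profile and benchmark sections; the fuzzer actually added by the PR may be structured differently):

```
// FuzzOpPush2 cross-checks the handwritten PUSH2 sketch against the generic
// makePush implementation on arbitrary code and program counters.
func FuzzOpPush2(f *testing.F) {
	f.Add([]byte{0x61, 0x12, 0x34}, uint64(0)) // PUSH2 0x1234 at pc 0
	f.Fuzz(func(t *testing.T, code []byte, pc uint64) {
		if pc >= uint64(len(code)) {
			return // keep pc pointing inside the code
		}
		contract := &Contract{Code: code}
		generic := makePush(2, 2)

		// Run the generic implementation.
		pcA := pc
		scopeA := &ScopeContext{Stack: newstack(), Contract: contract}
		generic(&pcA, nil, scopeA)

		// Run the handwritten implementation.
		pcB := pc
		scopeB := &ScopeContext{Stack: newstack(), Contract: contract}
		opPush2(&pcB, nil, scopeB)

		a, b := scopeA.Stack.pop(), scopeB.Stack.pop()
		if !a.Eq(&b) || pcA != pcB {
			t.Fatalf("mismatch: generic=%s pc=%d, handwritten=%s pc=%d", a.Hex(), pcA, b.Hex(), pcB)
		}
	})
}
```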
A follow-up change in Erigon references this PR:

- Optimize `opPush2`. Reference: [Geth #31267](ethereum/go-ethereum#31267)
- Make types consistent in `makeDup`

```
goos: linux
goarch: amd64
pkg: github.com/erigontech/erigon/execution/vm
cpu: Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz
                │ new_bench.txt │          old_bench.txt          │
                │    sec/op     │    sec/op      vs base          │
Push/makePush-8    13.23n ± 6%    18.72n ± 12%  +41.53% (p=0.000 n=10)
Push/push-8        8.531n ± 4%   11.890n ± 44%  +39.37% (p=0.000 n=10)
geomean            10.62n         14.92n        +40.45%

                │ new_bench.txt │         old_bench.txt         │
                │     B/op      │     B/op      vs base         │
Push/makePush-8     0.000 ± 0%     0.000 ± 0%  ~ (p=1.000 n=10) ¹
Push/push-8         0.000 ± 0%     0.000 ± 0%  ~ (p=1.000 n=10) ¹
geomean                        ²               +0.00%           ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                │ new_bench.txt │        old_bench.txt          │
                │   allocs/op   │   allocs/op   vs base         │
Push/makePush-8     0.000 ± 0%     0.000 ± 0%  ~ (p=1.000 n=10) ¹
Push/push-8         0.000 ± 0%     0.000 ± 0%  ~ (p=1.000 n=10) ¹
geomean                        ²               +0.00%           ²
¹ all samples are equal
² summaries must be >0 to compute geomean
```

Benchmark test:

```
func BenchmarkPush(b *testing.B) {
	var (
		code        = common.FromHex("0011223344556677889900aabbccddeeff0102030405060708090a0b0c0d0e0ff1e1d1c1b1a19181716151413121")
		push2       = makePush(2, 2)
		callContext = &CallContext{Contract: Contract{
			Code: code,
		}}
		pc = new(uint64)
	)
	b.Run("makePush", func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			push2(*pc, nil, callContext)
			callContext.Stack.pop()
		}
	})
	b.Run("push", func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			opPush2(*pc, nil, callContext)
			callContext.Stack.pop()
		}
	})
}
```