
Conversation

@m-Peter (Collaborator) commented Oct 8, 2025

Closes: #895

Description

Previously, every tx submission fetched the latest block as well as the COA account, costing 2 AN calls per submission. Now a ticker tracks the latest finalized block header, which is used to set the proper Reference Block ID when preparing the Cadence tx for submission. We plan to remove the COA account fetch in a follow-up.
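The overall shape, as a minimal sketch (hypothetical names; the real code lives in services/requester/single_tx_pool.go, and the flow-go-sdk access.Client is assumed):

package requester

import (
    "context"
    "fmt"
    "sync/atomic"
    "time"

    "github.com/onflow/flow-go-sdk"
    "github.com/onflow/flow-go-sdk/access"
)

const referenceBlockUpdateFrequency = 15 * time.Second

type headerTracker struct {
    client access.Client
    // Holds a *flow.BlockHeader; atomic.Value avoids a data race between
    // the updater goroutine and concurrent transaction submissions.
    referenceBlockHeader atomic.Value
}

func newHeaderTracker(ctx context.Context, client access.Client) (*headerTracker, error) {
    t := &headerTracker{client: client}
    // Fail fast: fetch the initial finalized header before anyone can read it.
    header, err := client.GetLatestBlockHeader(ctx, false) // false = finalized, not sealed
    if err != nil {
        return nil, fmt.Errorf("failed to fetch initial reference block header: %w", err)
    }
    t.referenceBlockHeader.Store(header)
    go t.updateReferenceBlock(ctx)
    return t, nil
}

func (t *headerTracker) updateReferenceBlock(ctx context.Context) {
    ticker := time.NewTicker(referenceBlockUpdateFrequency)
    defer ticker.Stop()
    for {
        select {
        case <-ctx.Done():
            return
        case <-ticker.C:
            header, err := t.client.GetLatestBlockHeader(ctx, false)
            if err != nil {
                continue // log and retry on the next tick; keep the last good header
            }
            t.referenceBlockHeader.Store(header)
        }
    }
}

func (t *headerTracker) getReferenceBlock() *flow.BlockHeader {
    header, _ := t.referenceBlockHeader.Load().(*flow.BlockHeader)
    return header
}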


For contributor use:

  • Targeted PR against master branch
  • Linked to GitHub issue with discussion and accepted design OR link to spec that describes this work.
  • Code follows the standards mentioned here.
  • Updated relevant documentation
  • Re-reviewed Files changed in the GitHub PR explorer
  • Added appropriate labels

Summary by CodeRabbit

  • Bug Fixes

    • API server now halts and reports errors when transaction-pool creation fails.
    • Transaction submission error handling improved to avoid failures from stale or missing reference data.
  • New Features

    • Background updater keeps an up-to-date Flow block header for transaction construction.
  • Refactor

    • Initialization flow updated to consistently propagate transaction-pool creation errors.
  • Tests

    • Emulator transaction expiry aligned to Flow’s default for more realistic testing.

@coderabbitai (Contributor) commented Oct 8, 2025

Walkthrough

Transaction pool constructors now return errors; bootstrap aborts startup if pool creation fails. SingleTxPool caches a Flow block header in an atomic value and refreshes it periodically. BatchTxPool and callers updated to accept/use a reference block header. Tests use flow.DefaultTransactionExpiry.

Changes

  • Bootstrap error propagation (bootstrap/bootstrap.go):
    Introduces local err handling for tx-pool creation, captures errors from requester.NewBatchTxPool and requester.NewSingleTxPool, and aborts API startup with a wrapped error if pool creation fails.
  • SingleTxPool refactor & header tracking (services/requester/single_tx_pool.go):
    NewSingleTxPool(ctx, ...) now returns (*SingleTxPool, error), accepts ctx, initializes and atomically stores a *flow.BlockHeader via GetLatestBlockHeader, starts a periodic background updater, replaces per-tx latest-block fetches with the cached header, updates buildTransaction to accept *flow.BlockHeader, and adds the referenceBlockHeader atomic.Value and referenceBlockUpdateFrequency.
  • BatchTxPool constructor & header usage (services/requester/batch_tx_pool.go):
    NewBatchTxPool now returns (*BatchTxPool, error) and propagates errors from NewSingleTxPool. Batch-processing call sites updated to use getReferenceBlock()/*flow.BlockHeader for building/submitting transactions; error propagation and messages added.
  • Test config alignment (tests/helpers.go):
    Replaced the hard-coded transaction expiry (10) with flow.DefaultTransactionExpiry in the emulator startup config.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Admin as Bootstrap
  participant STP as SingleTxPool
  participant AN as AccessNode
  participant Client

  rect rgb(240,248,255)
    note over Admin: Startup -> create pools
    Admin->>STP: NewSingleTxPool(ctx, ...)
    STP->>AN: GetLatestBlockHeader()
    AN-->>STP: BlockHeader
    STP-->>Admin: (*SingleTxPool, nil) or error
    alt error
      Admin-->>Admin: abort startup (wrapped error)
    end
  end

  rect rgb(245,255,250)
    note over STP: Background header updater (periodic)
    loop every referenceBlockUpdateFrequency
      STP->>AN: GetLatestBlockHeader()
      AN-->>STP: BlockHeader or error
    end
  end

  rect rgb(255,250,240)
    note over Client, STP: Transaction submission uses cached header
    Client->>STP: Add(tx)
    STP->>STP: use cached BlockHeader (getReferenceBlock)
    STP->>AN: GetAccount(address)
    AN-->>STP: Account
    STP-->>Client: enqueue/submit tx
  end
sequenceDiagram
  autonumber
  actor Admin as Bootstrap
  participant BTP as BatchTxPool
  participant STP as SingleTxPool

  Admin->>BTP: NewBatchTxPool(ctx, ...)
  BTP->>STP: NewSingleTxPool(ctx, ...)
  alt STP error
    STP-->>BTP: error
    BTP-->>Admin: error
    Admin-->>Admin: abort startup
  else ok
    STP-->>BTP: pool
    BTP-->>Admin: (*BatchTxPool, nil)
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Potential attention areas:

  • Background updater lifecycle, cancellation, and atomic storage of *flow.BlockHeader (services/requester/single_tx_pool.go).
  • All callsites where *flow.Block changed to *flow.BlockHeader (buildTransaction, batch submission) (services/requester/*.go).
  • Error propagation paths added to bootstrap and constructors (bootstrap/bootstrap.go, services/requester/batch_tx_pool.go, services/requester/single_tx_pool.go).

Suggested labels

EVM

Suggested reviewers

  • zhangchiqing
  • peterargue
  • janezpodhostnik

Poem

I hop and fetch the header bright,
I keep it warm through day and night.
When pools complain I sound the bell,
I tick the header — all is well. 🥕🐇

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: Docstring coverage is 0.00%, below the required threshold of 80.00%. Run @coderabbitai generate docstrings to improve coverage.

✅ Passed checks (4 passed)

  • Description Check ✅ Passed: Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed: Title clearly and concisely summarizes the main change: tracking the latest finalized block header for transaction submission to reduce AN queries.
  • Linked Issues Check ✅ Passed: The PR implements the core objective from #895: it moves latest-block fetching into a continuous background process (15-second ticker) and uses the tracked header instead of per-request AN queries.
  • Out of Scope Changes Check ✅ Passed: All changes are directly related to tracking and using finalized block headers for transaction submission; no extraneous modifications detected.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8081978 and 1143ff4.

📒 Files selected for processing (4)
  • bootstrap/bootstrap.go (3 hunks)
  • services/requester/batch_tx_pool.go (6 hunks)
  • services/requester/single_tx_pool.go (9 hunks)
  • tests/helpers.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/helpers.go
🧰 Additional context used
🧠 Learnings (5)
📚 Learning: 2025-06-17T10:29:35.941Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 831
File: services/requester/pool.go:202-214
Timestamp: 2025-06-17T10:29:35.941Z
Learning: In Flow EVM Gateway's transaction pool (services/requester/pool.go), signing keys acquired with keystore.Take() are intentionally held until the transaction is sealed, not released immediately after sending. The key release happens asynchronously when sealed transactions are ingested via event_subscriber.go NotifyTransaction() -> keystore unsafeUnlockKey() -> key.Done(). This pattern ensures transaction integrity by preventing key reuse before transaction finalization.

Applied to files:

  • bootstrap/bootstrap.go
  • services/requester/single_tx_pool.go
📚 Learning: 2025-06-19T11:36:25.478Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 831
File: services/requester/batch_tx_pool.go:0-0
Timestamp: 2025-06-19T11:36:25.478Z
Learning: In Go, when copying maps that contain slices (like `map[gethCommon.Address][]pooledEvmTx`), perform deep copies by iterating over the map and copying each slice individually using `make()` and `copy()` to avoid shared references that could lead to race conditions and data corruption.

Applied to files:

  • services/requester/single_tx_pool.go
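For illustration, a sketch of the deep-copy pattern that learning describes (pooledEvmTx fields are elided; gethCommon is the go-ethereum common package):

package requester

import (
    gethCommon "github.com/ethereum/go-ethereum/common"
)

type pooledEvmTx struct {
    // fields elided for the sketch
}

// copyPooledTxs deep-copies a map whose values are slices. Copying only
// the map would leave both maps sharing the same backing arrays, so a
// later append or element write through one map could race with readers
// of the other.
func copyPooledTxs(src map[gethCommon.Address][]pooledEvmTx) map[gethCommon.Address][]pooledEvmTx {
    dst := make(map[gethCommon.Address][]pooledEvmTx, len(src))
    for addr, txs := range src {
        txsCopy := make([]pooledEvmTx, len(txs))
        copy(txsCopy, txs)
        dst[addr] = txsCopy
    }
    return dst
}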
📚 Learning: 2024-10-18T19:26:37.579Z
Learnt from: peterargue
Repo: onflow/flow-evm-gateway PR: 617
File: api/stream.go:62-67
Timestamp: 2024-10-18T19:26:37.579Z
Learning: In the `flow-evm-gateway` project, within the Go file `api/stream.go`, the `prepareBlockResponse` method includes the Bloom filter as the field `LogsBloom` in the returned `Block` struct.

Applied to files:

  • services/requester/single_tx_pool.go
  • services/requester/batch_tx_pool.go
📚 Learning: 2025-01-24T20:15:10.908Z
Learnt from: peterargue
Repo: onflow/flow-evm-gateway PR: 682
File: services/ingestion/event_subscriber.go:85-90
Timestamp: 2025-01-24T20:15:10.908Z
Learning: When calculating blocks to catch up in Flow, initialize the counter to 0 and only set it when the latest chain height is greater than the current height (latestOnChainHeight > currentHeight). There's no need to handle the else case as 0 blocks to catch up is the correct default when we're at the same height or ahead.

Applied to files:

  • services/requester/single_tx_pool.go
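A minimal sketch of that catch-up rule:

// blocksToCatchUp implements the rule above: default to 0 and only set
// the counter when the chain is ahead; equal or lower heights need no
// else branch because 0 is already the correct answer.
func blocksToCatchUp(latestOnChainHeight, currentHeight uint64) uint64 {
    var count uint64
    if latestOnChainHeight > currentHeight {
        count = latestOnChainHeight - currentHeight
    }
    return count
}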
📚 Learning: 2025-10-06T10:14:49.706Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 890
File: services/ingestion/event_subscriber.go:235-239
Timestamp: 2025-10-06T10:14:49.706Z
Learning: In services/ingestion/event_subscriber.go, when reconnecting after disconnect errors (DeadlineExceeded, Internal, Unavailable), the subscription should reconnect at lastReceivedHeight rather than lastReceivedHeight+1. This avoids errors when the next height doesn't exist yet, and duplicate event processing is safe because the ingestion engine is explicitly designed to be idempotent (storage uses batch.Set() which overwrites existing entries).

Applied to files:

  • services/requester/single_tx_pool.go
🧬 Code graph analysis (3)
bootstrap/bootstrap.go (2)
services/requester/batch_tx_pool.go (1)
  • NewBatchTxPool (52-92)
services/requester/single_tx_pool.go (1)
  • NewSingleTxPool (45-76)
services/requester/single_tx_pool.go (3)
services/requester/tx_pool.go (1)
  • TxPool (16-18)
services/requester/cross-spork_client.go (1)
  • CrossSporkClient (108-113)
services/requester/keystore/key_store.go (1)
  • KeyStore (39-52)
services/requester/batch_tx_pool.go (1)
services/requester/single_tx_pool.go (1)
  • NewSingleTxPool (45-76)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Test
🔇 Additional comments (7)
bootstrap/bootstrap.go (1)

256-280: LGTM! Error handling properly surfaces pool initialization failures.

The error handling is correctly implemented:

  • Both batch and single pool creation paths now capture and check errors
  • The error is properly wrapped with context before being returned
  • API server startup will fail-fast if pool initialization fails, preventing the gateway from starting in an inconsistent state

This aligns with the updated pool constructor signatures that now return errors during initialization.
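For illustration, a sketch of that fail-fast shape (the batchMode flag and the lowercase helpers are hypothetical stand-ins for the gateway's actual bootstrap code and the requester constructors):

package bootstrap

import (
    "context"
    "fmt"
)

type txPool struct{}

// newSingleTxPool stands in for requester.NewSingleTxPool, which now
// returns an error if the initial reference block header can't be fetched.
func newSingleTxPool(ctx context.Context) (*txPool, error) {
    return &txPool{}, nil
}

// newBatchTxPool stands in for requester.NewBatchTxPool; it propagates
// the embedded single pool's construction error.
func newBatchTxPool(ctx context.Context) (*txPool, error) {
    return newSingleTxPool(ctx)
}

// startAPI shows the fail-fast shape: wrap and return the constructor
// error so startup aborts instead of running with a broken pool.
func startAPI(ctx context.Context, batchMode bool) error {
    var (
        pool *txPool
        err  error
    )
    if batchMode {
        pool, err = newBatchTxPool(ctx)
    } else {
        pool, err = newSingleTxPool(ctx)
    }
    if err != nil {
        return fmt.Errorf("failed to create transaction pool: %w", err)
    }
    _ = pool // hand the pool to the API server here
    return nil
}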

services/requester/batch_tx_pool.go (2)

60-91: LGTM! Error propagation correctly implemented.

The constructor now properly:

  • Returns (*BatchTxPool, error) to surface initialization failures
  • Propagates errors from the embedded SingleTxPool creation
  • Returns nil, err on failure and batchPool, nil on success

This ensures that initialization failures (e.g., unable to fetch initial reference block header) are properly communicated to the caller in bootstrap.


179-249: LGTM! Reference block usage correctly eliminates per-submission Access Node calls.

The changes successfully achieve the PR objective:

  • processPooledTransactions now uses t.getReferenceBlock() instead of fetching the latest block (line 179)
  • batchSubmitTransactionsForSameAddress signature updated to accept referenceBlockHeader (line 196)
  • submitSingleTransaction uses t.getReferenceBlock() (line 249)
  • All calls to buildTransaction now pass the cached reference block header (lines 218, 249)

The cached reference block header is maintained by the embedded SingleTxPool's background updater, eliminating the need for per-submission Access Node queries.

services/requester/single_tx_pool.go (4)

8-8: LGTM! Atomic field correctly declared for thread-safe header storage.

The implementation properly:

  • Imports sync/atomic for atomic operations
  • Declares referenceBlockUpdateFrequency constant (15 seconds, as discussed in past comments)
  • Uses atomic.Value to store *flow.BlockHeader, avoiding data races between the background updater and transaction submission requests
  • Includes a clear comment explaining the thread-safety rationale

Also applies to: 24-25, 37-39


46-76: LGTM! Constructor correctly initializes the reference block and background updater.

The initialization sequence is correct and race-free:

  • Line 46: Added ctx parameter to support the background updater goroutine
  • Lines 54-57: Fetches the initial reference block header and returns an error if unavailable (fail-fast behavior)
  • Line 71: Stores the initial header atomically before starting the updater
  • Line 73: Starts the background updater goroutine that respects context cancellation
  • Line 75: Returns (*SingleTxPool, error) to surface initialization failures

The order ensures no races: the initial reference block is set before any concurrent access is possible, and the background updater will refresh it periodically.


224-250: LGTM! Background updater and accessor correctly implement the polling strategy.

The implementation achieves the PR objective:

getReferenceBlock (lines 224-229):

  • Safely loads the reference block header from atomic storage
  • Defensive nil check (though nil should never occur given initialization)

updateReferenceBlock (lines 231-250):

  • Updates the cached reference header every 15 seconds (appropriate for Flow's 600-second transaction expiry window)
  • Respects context cancellation for clean shutdown (lines 237-238)
  • Fetches the finalized block header via GetLatestBlockHeader(ctx, false) (correct per Flow SDK recommendations)
  • Logs errors without terminating, ensuring resilience (lines 242-245)
  • Stores the new header atomically (line 247)

This eliminates per-submission Access Node queries, reducing load and avoiding rate-limiting under high throughput.


115-115: LGTM! Transaction building now uses the cached reference block.

The changes correctly integrate the cached reference block:

  • Line 115: Add now calls t.getReferenceBlock() instead of fetching the latest block per submission
  • Line 172: buildTransaction parameter changed from latestBlock to referenceBlockHeader (better naming)
  • Line 182: Sets the transaction reference block ID from the cached header
  • Line 209: Uses the cached header height for lock metadata

This achieves the core objective: eliminating per-request latest-block queries to reduce Access Node load and improve performance under high transaction throughput.

Also applies to: 172-172, 182-182, 209-209
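As a sketch of what the header-based build looks like with the flow-go-sdk (the function name is illustrative; script args, payer, and the remaining builder calls are elided):

package requester

import (
    "github.com/onflow/flow-go-sdk"
)

// newFlowTx shows why *flow.BlockHeader suffices over *flow.Block: only
// the header's ID is needed for the reference block (and its Height for
// lock metadata).
func newFlowTx(header *flow.BlockHeader, script []byte) *flow.Transaction {
    return flow.NewTransaction().
        SetScript(script).
        SetReferenceBlockID(header.ID)
}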



@m-Peter m-Peter changed the title Track the latest finalized block header to avoid fetching it on every tx submission Track the latest finalized block header for tx submission Oct 8, 2025
@m-Peter m-Peter force-pushed the mpeter/track-latest-finalized-block-header branch from da04f2e to dce819d on October 8, 2025 08:09
@m-Peter m-Peter marked this pull request as ready for review October 8, 2025 08:17
"github.com/onflow/flow-evm-gateway/services/requester/keystore"
)

var blockHeaderUpdateFrequency = time.Second * 5
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can increase the update frequency (like every 2 or 3 seconds), but since we get the latest finalized (not sealed) block header, I think this is a good value and we're safe. Any objections?

@peterargue (Contributor) commented Oct 24, 2025:

I think we could make it longer. How about 15 seconds? It just needs to be refreshed every 600 seconds, so it doesn't need to be too frequent.

Member:

Why do we need to track the finalized block with polling?

The services/ingestion/engine is subscribing to block events for finalized blocks; can we use that to determine that a block is finalized?

@m-Peter (Collaborator, Author):

> I think we could make it longer. How about 15 seconds? It just needs to be refreshed every 600 seconds, so it doesn't need to be too frequent.

Good point 👍 Updated in 7651bef.

@m-Peter (Collaborator, Author):

> Why do we need to track the finalized block with polling?
>
> The services/ingestion/engine is subscribing to block events for finalized blocks; can we use that to determine that a block is finalized?

I thought about that too, but found some drawbacks.

On the main branch we use the Event Streaming API for EVM events; only on the soft-finality branch do we subscribe to finalized blocks. Once the optimistic data sync is implemented, we'll no longer subscribe to finalized blocks, but instead to EVM events on finalized data.

Suppose the EVM GW has been down for some time, e.g. 30 minutes, for whatever reason (spork / HCU / network issue or something else). When the node is back up and we start indexing EVM events, if we use the block from the EVM events as the reference block, those transactions will be considered expired by the Flow network. We'd have to wait for the node to catch up before EVM transactions can go through, which is not ideal.

If we track the finalized block with polling and use it as the reference block ID, we don't have this issue: transaction submission is no longer dependent on the indexing state of the node.

t.latestBlockHeader = blockHeader

@m-Peter (Collaborator, Author):

Do we need any synchronization mechanism for updating t.latestBlockHeader? The goroutine which runs trackLatestBlockHeader is the only place that writes to the t.latestBlockHeader field. There are various places that read it, but only for setting the ReferenceBlockID when preparing a Cadence tx for submission.

@m-Peter (Collaborator, Author):

Update: I have stored the latestBlockHeader behind an atomic.Value.
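A minimal sketch of that single-writer/many-reader pattern (field and method names are illustrative):

package requester

import (
    "sync/atomic"

    "github.com/onflow/flow-go-sdk"
)

type pool struct {
    // Always stores a *flow.BlockHeader; atomic.Value requires a
    // consistent concrete type across Store calls.
    latestBlockHeader atomic.Value
}

// setHeader is called only by the tracking goroutine (single writer).
func (p *pool) setHeader(h *flow.BlockHeader) {
    p.latestBlockHeader.Store(h)
}

// header may be called from any goroutine preparing a Cadence tx. Even
// with a single writer, a plain field read would be a data race under
// the Go memory model, hence the atomic load.
func (p *pool) header() *flow.BlockHeader {
    h, _ := p.latestBlockHeader.Load().(*flow.BlockHeader)
    return h
}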

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
services/requester/batch_tx_pool.go (2)

185-191: Data race: reading latestBlockHeader without synchronization

processPooledTransactions passes t.latestBlockHeader while SingleTxPool updates it concurrently. Use an accessor backed by atomic/mutex (see SingleTxPool fix).

Apply this diff after adding getLatestBlockHeader() to SingleTxPool:

-					t.latestBlockHeader,
+					t.getLatestBlockHeader(),

261-268: Data race: reading latestBlockHeader without synchronization (single submission)

Replace direct field access with synchronized getter.

-		t.latestBlockHeader,
+		t.getLatestBlockHeader(),
services/requester/single_tx_pool.go (1)

114-121: Use synchronized header read when building tx

Avoid direct field access.

-		t.latestBlockHeader,
+		t.getLatestBlockHeader(),
🧹 Nitpick comments (2)
services/requester/single_tx_pool.go (2)

208-211: Build metadata uses passed header; ensure caller passes synchronized value

No change needed here once callers use getLatestBlockHeader(). Just a heads-up to avoid reading the field here directly.


23-24: Make update frequency configurable

Expose blockHeaderUpdateFrequency via config (with sane default) to tune under different AN latencies.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 25b8222 and dce819d.

📒 Files selected for processing (4)
  • bootstrap/bootstrap.go (3 hunks)
  • services/requester/batch_tx_pool.go (8 hunks)
  • services/requester/single_tx_pool.go (5 hunks)
  • tests/helpers.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
services/requester/batch_tx_pool.go (1)
services/requester/single_tx_pool.go (1)
  • NewSingleTxPool (42-73)
services/requester/single_tx_pool.go (4)
services/requester/cross-spork_client.go (1)
  • CrossSporkClient (108-113)
config/config.go (1)
  • Config (43-124)
services/requester/keystore/key_store.go (1)
  • KeyStore (39-52)
services/requester/tx_pool.go (1)
  • TxPool (16-18)
bootstrap/bootstrap.go (2)
services/requester/batch_tx_pool.go (1)
  • NewBatchTxPool (52-92)
services/requester/single_tx_pool.go (1)
  • NewSingleTxPool (42-73)
🔇 Additional comments (2)
tests/helpers.go (1)

93-95: Good: use default transaction expiry

Switching to flow.DefaultTransactionExpiry removes a magic number and aligns tests with protocol defaults.

bootstrap/bootstrap.go (1)

249-275: Error propagation on tx pool creation is correct

Capturing and returning constructor errors here prevents silent startup with a broken pool. LGTM.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (6)
services/requester/batch_tx_pool.go (2)

187-190: Guard against nil latestBlockHeader before batch submission

If the header is ever unavailable (e.g., transient during startup/tests), buildTransaction will dereference nil. Skip this batch tick when header is nil to avoid noisy errors.

-                err := t.batchSubmitTransactionsForSameAddress(
+                hdr := t.getLatestBlockHeader()
+                if hdr == nil {
+                    t.logger.Warn().Msg("latest Flow block header unavailable; skipping batch tick")
+                    continue
+                }
+                err := t.batchSubmitTransactionsForSameAddress(
                     ctx,
-                    t.getLatestBlockHeader(),
+                    hdr,
                     account,
                     pooledTxs,
                 )

251-266: Early-check latest header in single submission path (optional)

Same nil risk exists here; you can short‑circuit early to avoid counting these as “dropped” due to build failure.

 func (t *BatchTxPool) submitSingleTransaction(
@@
-    flowTx, err := t.buildTransaction(
-        t.getLatestBlockHeader(),
+    hdr := t.getLatestBlockHeader()
+    if hdr == nil {
+        return fmt.Errorf("latest Flow block header unavailable")
+    }
+    flowTx, err := t.buildTransaction(
+        hdr,
         account,
         script,
         hexEncodedTx,
         coinbaseAddress,
     )

Also applies to: 263-268

services/requester/single_tx_pool.go (4)

176-188: Add a defensive nil check for latestBlockHeader

Although you store an initial header, a simple guard prevents panics in edge cases (tests, partial construction, future refactors).

 func (t *SingleTxPool) buildTransaction(
     latestBlockHeader *flow.BlockHeader,
@@
 ) (*flow.Transaction, error) {
+    if latestBlockHeader == nil {
+        return nil, fmt.Errorf("latest Flow block header unavailable")
+    }
@@
 flowTx := flow.NewTransaction().
     SetScript(script).
     SetReferenceBlockID(latestBlockHeader.ID).

Also applies to: 112-120, 240-245


24-25: Make ticker frequency configurable (derive from expiry window)

Hard-coding 5s works, but exposing it via config (or deriving from flow.DefaultTransactionExpiry) lets ops tune staleness vs. load.


219-238: Consider jitter/backoff on header update errors

On repeated AN errors, the fixed 5s tick can spam logs. Add small jitter/backoff after failures to smooth retries.


37-40: Optional: use typed atomic.Pointer for stronger typing

atomic.Value is fine; a typed atomic pointer (e.g., atomic.Pointer[flow.BlockHeader], available in the standard library's sync/atomic since Go 1.19, or via go.uber.org/atomic) gives compile-time type safety and removes the type assertion on load.

Based on learnings
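A sketch of the typed-pointer variant using the standard library's generic atomic.Pointer (names are illustrative):

package requester

import (
    "sync/atomic"

    "github.com/onflow/flow-go-sdk"
)

type pool struct {
    // Typed atomic pointer: no interface boxing, no type assertion on load.
    referenceBlockHeader atomic.Pointer[flow.BlockHeader]
}

func (p *pool) setHeader(h *flow.BlockHeader) { p.referenceBlockHeader.Store(h) }

func (p *pool) header() *flow.BlockHeader { return p.referenceBlockHeader.Load() }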

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dce819d and 54a03bb.

📒 Files selected for processing (2)
  • services/requester/batch_tx_pool.go (8 hunks)
  • services/requester/single_tx_pool.go (8 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
services/requester/batch_tx_pool.go (1)
services/requester/single_tx_pool.go (1)
  • NewSingleTxPool (45-76)
services/requester/single_tx_pool.go (3)
services/requester/tx_pool.go (1)
  • TxPool (16-18)
services/requester/cross-spork_client.go (1)
  • CrossSporkClient (108-113)
config/config.go (1)
  • Config (43-124)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Test
🔇 Additional comments (2)
services/requester/single_tx_pool.go (1)

62-76: Good: initialize header atomically before starting tracker

Storing the initial header via atomic before launching the goroutine removes races. Solid.

services/requester/batch_tx_pool.go (1)

60-76: Approve error propagation in BatchTxPool constructor
Errors returned by NewBatchTxPool are handled at all usage sites (including in bootstrap/bootstrap.go), ensuring initialization failures aren’t ignored.

@janezpodhostnik (Contributor):

Have we thought about having a subscription to the latest finalized block instead of polling it? That way you basically just update once you get a new block through the subscription channel.

@m-Peter (Collaborator, Author) commented Oct 9, 2025:

> Have we thought about having a subscription to the latest finalized block instead of polling it? That way you basically just update once you get a new block through the subscription channel.

@janezpodhostnik I have thought about that indeed. However, subscriptions come with their own complexities too. See for example this error:

*ingestion.Engine","error":"failure in event subscription with: recoverable: disconnected:
error receiving event: rpc error: code = Unavailable desc = error reading from server: EOF

This is an error that we receive as part of the EVM event subscriber, and it's not trivial to get such subscriptions to run indefinitely. That's why I opted for a simple polling solution.

func (t *SingleTxPool) getLatestBlockHeader() *flow.BlockHeader {

Contributor:

I'd suggest framing this as getReferenceBlock(). That way future devs don't expect it to return the actual latest block, which this doesn't.

@m-Peter (Collaborator, Author):

Good suggestion, updated in 3f378f3, along with all references which implied that we were dealing with the latest block, when this header is refreshed every 15 seconds and is only meant to be used as the reference block ID.

"github.com/onflow/flow-evm-gateway/services/requester/keystore"
)

var blockHeaderUpdateFrequency = time.Second * 5
Copy link
Contributor

@peterargue peterargue Oct 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could make it longer. how about 15 seconds? it just needs to be refreshed every 600 seconds, so it doesn't need to be too frequent

func (t *SingleTxPool) trackLatestBlockHeader(ctx context.Context) {

Contributor:

Are there other systems that need the latest block? If so, it may make sense to use the block streaming endpoint to track the latest height locally.

If it's only to get the reference block, then I think it's fine to use polling with a medium-size interval between requests. 15-30 seconds seems fine.

@m-Peter (Collaborator, Author):

Other than building/signing Flow transactions, there's no other component that needs the latest block. That's why I went with a simple polling approach.

@m-Peter m-Peter force-pushed the mpeter/track-latest-finalized-block-header branch from f447cc8 to 3f378f3 on November 6, 2025 11:27
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
services/requester/single_tx_pool.go (1)

24-24: Consider making this a constant or documenting immutability.

The variable referenceBlockUpdateFrequency is never modified and acts as a configuration constant. Consider either:

  • Declaring it as a const (constant time.Duration expressions are supported in Go)
  • Adding a comment that it should not be modified

This prevents accidental modification and clarifies intent.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f447cc8 and 3f378f3.

📒 Files selected for processing (4)
  • bootstrap/bootstrap.go (3 hunks)
  • services/requester/batch_tx_pool.go (8 hunks)
  • services/requester/single_tx_pool.go (8 hunks)
  • tests/helpers.go (1 hunks)
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-06-17T10:29:35.941Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 831
File: services/requester/pool.go:202-214
Timestamp: 2025-06-17T10:29:35.941Z
Learning: In Flow EVM Gateway's transaction pool (services/requester/pool.go), signing keys acquired with keystore.Take() are intentionally held until the transaction is sealed, not released immediately after sending. The key release happens asynchronously when sealed transactions are ingested via event_subscriber.go NotifyTransaction() -> keystore unsafeUnlockKey() -> key.Done(). This pattern ensures transaction integrity by preventing key reuse before transaction finalization.

Applied to files:

  • bootstrap/bootstrap.go
📚 Learning: 2025-06-19T11:36:25.478Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 831
File: services/requester/batch_tx_pool.go:0-0
Timestamp: 2025-06-19T11:36:25.478Z
Learning: In Go, when copying maps that contain slices (like `map[gethCommon.Address][]pooledEvmTx`), perform deep copies by iterating over the map and copying each slice individually using `make()` and `copy()` to avoid shared references that could lead to race conditions and data corruption.

Applied to files:

  • services/requester/single_tx_pool.go
📚 Learning: 2024-10-18T19:26:37.579Z
Learnt from: peterargue
Repo: onflow/flow-evm-gateway PR: 617
File: api/stream.go:62-67
Timestamp: 2024-10-18T19:26:37.579Z
Learning: In the `flow-evm-gateway` project, within the Go file `api/stream.go`, the `prepareBlockResponse` method includes the Bloom filter as the field `LogsBloom` in the returned `Block` struct.

Applied to files:

  • services/requester/single_tx_pool.go
📚 Learning: 2025-01-24T20:15:10.908Z
Learnt from: peterargue
Repo: onflow/flow-evm-gateway PR: 682
File: services/ingestion/event_subscriber.go:85-90
Timestamp: 2025-01-24T20:15:10.908Z
Learning: When calculating blocks to catch up in Flow, initialize the counter to 0 and only set it when the latest chain height is greater than the current height (latestOnChainHeight > currentHeight). There's no need to handle the else case as 0 blocks to catch up is the correct default when we're at the same height or ahead.

Applied to files:

  • services/requester/single_tx_pool.go
🧬 Code graph analysis (3)
bootstrap/bootstrap.go (2)
services/requester/batch_tx_pool.go (1)
  • NewBatchTxPool (52-92)
services/requester/single_tx_pool.go (1)
  • NewSingleTxPool (45-76)
services/requester/batch_tx_pool.go (1)
services/requester/single_tx_pool.go (1)
  • NewSingleTxPool (45-76)
services/requester/single_tx_pool.go (2)
services/requester/tx_pool.go (1)
  • TxPool (16-18)
services/requester/cross-spork_client.go (1)
  • CrossSporkClient (108-113)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Test
🔇 Additional comments (11)
tests/helpers.go (1)

93-93: LGTM - Good use of SDK constant.

Replacing the hardcoded value with flow.DefaultTransactionExpiry improves maintainability and ensures consistency with the Flow SDK's recommended transaction expiry settings.

services/requester/batch_tx_pool.go (4)

60-91: LGTM - Proper error propagation added.

The constructor now correctly returns errors from NewSingleTxPool, enabling the bootstrap layer to halt startup if pool creation fails. This prevents the gateway from operating with an invalid transaction pool.


168-199: LGTM - Successfully eliminates per-batch block fetch.

The refactor removes the redundant latest block query on each batch interval, relying instead on the periodically-refreshed reference header from getReferenceBlock(). This reduces Access Node load as intended by the PR.


203-245: LGTM - Signature updated to use block header.

The method now accepts latestBlockHeader instead of the full block, which is more precise since only the header (specifically the ID) is needed for setting the reference block. This change is consistent with the overall refactoring.


247-281: LGTM - Consistent use of cached reference block.

Like processPooledTransactions, this method now uses the cached reference header, eliminating an unnecessary Access Node call per single-transaction submission. The refactor is consistent across both submission paths.

bootstrap/bootstrap.go (1)

256-280: LGTM - Bootstrap now fails fast on pool creation errors.

The error handling ensures that if the initial reference block header cannot be fetched during pool creation, the gateway startup is aborted rather than attempting to operate with an uninitialized pool. This is the correct behavior for this critical initialization step.

services/requester/single_tx_pool.go (5)

37-39: LGTM - Well-documented atomic field.

The comment clearly explains the thread-safety rationale, making it easy for future maintainers to understand why atomic.Value is necessary here.


45-76: LGTM - Robust initialization with proper error handling.

The constructor correctly:

  • Fetches the initial reference block header before starting background updates
  • Returns an error if the initial fetch fails, preventing pool creation with invalid state
  • Atomically stores the header before starting the updater goroutine
  • Passes context for graceful shutdown

This ensures the pool is never in an inconsistent state.


112-124: LGTM - Per-transaction AN calls reduced as intended.

This change achieves the core PR objective: the gateway no longer fetches the latest block for every transaction submission. Instead, it uses the cached reference header (refreshed every 15 seconds in the background), significantly reducing Access Node load under high throughput.


219-238: LGTM - Reliable background updater with proper cleanup.

The updater goroutine correctly:

  • Uses a ticker with appropriate cleanup
  • Respects context cancellation for graceful shutdown
  • Logs errors but continues retrying (maintaining availability)
  • Atomically stores the updated header

The 15-second update frequency provides a good balance between AN load and freshness, with ample margin within the ~600-second transaction expiry window.


240-245: LGTM - Correct atomic accessor pattern.

The getter properly loads and type-asserts the atomic value. While it can return nil, this is safe in practice because:

  1. The constructor fails if the initial header fetch fails
  2. The header is atomically stored before the pool is returned to callers
  3. No transactions can be submitted until NewSingleTxPool completes successfully

The current implementation is sound.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
services/requester/single_tx_pool.go (1)

119-119: Add defensive nil check for reference block.

While unlikely in practice (since the constructor ensures initialization), getReferenceBlock() can return nil. Passing this unchecked to buildTransaction would cause a nil pointer dereference at line 187 when accessing latestBlockHeader.ID.

Apply this diff to add a defensive check:

+	referenceBlock := t.getReferenceBlock()
+	if referenceBlock == nil {
+		return fmt.Errorf("reference block not yet available")
+	}
+
 	flowTx, err := t.buildTransaction(
-		t.getReferenceBlock(),
+		referenceBlock,
 		account,
 		script,
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3f378f3 and 5b76505.

📒 Files selected for processing (1)
  • services/requester/single_tx_pool.go (8 hunks)
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-06-19T11:36:25.478Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 831
File: services/requester/batch_tx_pool.go:0-0
Timestamp: 2025-06-19T11:36:25.478Z
Learning: In Go, when copying maps that contain slices (like `map[gethCommon.Address][]pooledEvmTx`), perform deep copies by iterating over the map and copying each slice individually using `make()` and `copy()` to avoid shared references that could lead to race conditions and data corruption.

Applied to files:

  • services/requester/single_tx_pool.go
📚 Learning: 2024-10-18T19:26:37.579Z
Learnt from: peterargue
Repo: onflow/flow-evm-gateway PR: 617
File: api/stream.go:62-67
Timestamp: 2024-10-18T19:26:37.579Z
Learning: In the `flow-evm-gateway` project, within the Go file `api/stream.go`, the `prepareBlockResponse` method includes the Bloom filter as the field `LogsBloom` in the returned `Block` struct.

Applied to files:

  • services/requester/single_tx_pool.go
📚 Learning: 2025-01-24T20:15:10.908Z
Learnt from: peterargue
Repo: onflow/flow-evm-gateway PR: 682
File: services/ingestion/event_subscriber.go:85-90
Timestamp: 2025-01-24T20:15:10.908Z
Learning: When calculating blocks to catch up in Flow, initialize the counter to 0 and only set it when the latest chain height is greater than the current height (latestOnChainHeight > currentHeight). There's no need to handle the else case as 0 blocks to catch up is the correct default when we're at the same height or ahead.

Applied to files:

  • services/requester/single_tx_pool.go
📚 Learning: 2025-10-06T10:14:49.706Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 890
File: services/ingestion/event_subscriber.go:235-239
Timestamp: 2025-10-06T10:14:49.706Z
Learning: In services/ingestion/event_subscriber.go, when reconnecting after disconnect errors (DeadlineExceeded, Internal, Unavailable), the subscription should reconnect at lastReceivedHeight rather than lastReceivedHeight+1. This avoids errors when the next height doesn't exist yet, and duplicate event processing is safe because the ingestion engine is explicitly designed to be idempotent (storage uses batch.Set() which overwrites existing entries).

Applied to files:

  • services/requester/single_tx_pool.go
🧬 Code graph analysis (1)
services/requester/single_tx_pool.go (4)
services/requester/tx_pool.go (1)
  • TxPool (16-18)
services/requester/cross-spork_client.go (1)
  • CrossSporkClient (108-113)
config/config.go (1)
  • Config (44-127)
services/requester/keystore/key_store.go (1)
  • KeyStore (39-52)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Test
🔇 Additional comments (6)
services/requester/single_tx_pool.go (6)

8-8: LGTM: Atomic import and update frequency.

The sync/atomic import supports the thread-safe header storage, and the 15-second update frequency is appropriate given the 600-second (10-minute) expiry window for Flow transactions.

Also applies to: 24-24


37-39: LGTM: Atomic field declaration.

The atomic.Value field with explanatory comment correctly addresses the race condition between the ticker updates and concurrent transaction submissions.


45-76: LGTM: Constructor initialization.

The constructor correctly:

  • Fetches the initial reference block header with error handling
  • Atomically stores it before launching the background updater
  • Returns an error to abort startup if the initial fetch fails

This ensures the pool never operates without a valid reference header.


175-217: LGTM: Signature change to header.

The updated signature using *flow.BlockHeader instead of *flow.Block is appropriate since only the header fields (ID and Height) are needed, reducing memory overhead.

Note: This function assumes a non-nil header; nil safety depends on the caller (see comment on line 119).


219-238: LGTM: Background reference block updater.

The implementation correctly:

  • Uses a ticker with the agreed-upon 15-second interval
  • Respects context cancellation for clean shutdown
  • Handles errors by logging and continuing (next attempt in 15 seconds)
  • Stores the updated header atomically

The simple error handling without backoff is acceptable for this polling interval.


240-245: LGTM: Reference block accessor.

The accessor correctly performs an atomic load with a nil check and safe type assertion. The nil check is good defensive programming, even though it shouldn't occur after proper initialization.

@m-Peter m-Peter requested a review from peterargue November 6, 2025 12:03
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
services/requester/single_tx_pool.go (1)

219-245: LGTM: Background updater implementation

The polling mechanism correctly refreshes the reference block header every 15 seconds, respects context cancellation, and uses atomic storage. The getReferenceBlock() accessor safely loads the header with a defensive nil check (though the value is guaranteed non-nil after successful initialization).

Optional: Consider exponential backoff on repeated errors

If GetLatestBlockHeader encounters persistent failures (e.g., during Access Node issues), the current implementation will log an error every 15 seconds. As noted in a previous review comment, introducing exponential backoff with jitter on repeated failures could reduce log spam and Access Node load during outages. This is purely optional and can be deferred.
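A hedged sketch of what such a backoff could look like (illustrative only; not part of the PR):

package requester

import (
    "math/rand"
    "time"
)

// nextDelay grows the retry delay exponentially after consecutive
// failures, caps the exponent, and adds up to ~25% jitter so many
// gateways don't hammer the Access Node in lockstep.
func nextDelay(base time.Duration, failures int) time.Duration {
    if failures < 0 {
        failures = 0
    }
    if failures > 4 {
        failures = 4 // cap at 16x base
    }
    d := base << failures
    jitter := time.Duration(rand.Int63n(int64(d/4) + 1))
    return d + jitter
}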

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5b76505 and 8081978.

📒 Files selected for processing (2)
  • services/requester/batch_tx_pool.go (8 hunks)
  • services/requester/single_tx_pool.go (8 hunks)
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-06-19T11:36:25.478Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 831
File: services/requester/batch_tx_pool.go:0-0
Timestamp: 2025-06-19T11:36:25.478Z
Learning: In Go, when copying maps that contain slices (like `map[gethCommon.Address][]pooledEvmTx`), perform deep copies by iterating over the map and copying each slice individually using `make()` and `copy()` to avoid shared references that could lead to race conditions and data corruption.

Applied to files:

  • services/requester/single_tx_pool.go
📚 Learning: 2024-10-18T19:26:37.579Z
Learnt from: peterargue
Repo: onflow/flow-evm-gateway PR: 617
File: api/stream.go:62-67
Timestamp: 2024-10-18T19:26:37.579Z
Learning: In the `flow-evm-gateway` project, within the Go file `api/stream.go`, the `prepareBlockResponse` method includes the Bloom filter as the field `LogsBloom` in the returned `Block` struct.

Applied to files:

  • services/requester/single_tx_pool.go
📚 Learning: 2025-01-24T20:15:10.908Z
Learnt from: peterargue
Repo: onflow/flow-evm-gateway PR: 682
File: services/ingestion/event_subscriber.go:85-90
Timestamp: 2025-01-24T20:15:10.908Z
Learning: When calculating blocks to catch up in Flow, initialize the counter to 0 and only set it when the latest chain height is greater than the current height (latestOnChainHeight > currentHeight). There's no need to handle the else case as 0 blocks to catch up is the correct default when we're at the same height or ahead.

Applied to files:

  • services/requester/single_tx_pool.go
📚 Learning: 2025-10-06T10:14:49.706Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 890
File: services/ingestion/event_subscriber.go:235-239
Timestamp: 2025-10-06T10:14:49.706Z
Learning: In services/ingestion/event_subscriber.go, when reconnecting after disconnect errors (DeadlineExceeded, Internal, Unavailable), the subscription should reconnect at lastReceivedHeight rather than lastReceivedHeight+1. This avoids errors when the next height doesn't exist yet, and duplicate event processing is safe because the ingestion engine is explicitly designed to be idempotent (storage uses batch.Set() which overwrites existing entries).

Applied to files:

  • services/requester/single_tx_pool.go
🧬 Code graph analysis (2)
services/requester/batch_tx_pool.go (1)
services/requester/single_tx_pool.go (1)
  • NewSingleTxPool (45-76)
services/requester/single_tx_pool.go (5)
services/requester/tx_pool.go (1)
  • TxPool (16-18)
services/requester/cross-spork_client.go (1)
  • CrossSporkClient (108-113)
config/config.go (1)
  • Config (44-127)
metrics/collector.go (1)
  • Collector (108-123)
services/requester/keystore/key_store.go (1)
  • KeyStore (39-52)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Test
🔇 Additional comments (8)
services/requester/batch_tx_pool.go (4)

60-60: LGTM: Constructor error propagation

The updated constructor signature properly propagates initialization errors from NewSingleTxPool, enabling fail-fast behavior during bootstrap when the Access Node is unavailable or the initial block header fetch fails.

Also applies to: 64-75, 91-91


168-174: LGTM: Removed per-batch block fetch

The batch processing now retrieves only the COA account per batch interval and uses the continuously tracked reference block header via getReferenceBlock(). This change successfully reduces Access Node calls during transaction submission, directly addressing the PR objectives.

Also applies to: 187-187


205-205: LGTM: Signature update aligns with new reference block approach

The parameter type change from *flow.Block to *flow.BlockHeader correctly reflects that only the block header is needed for setting the reference block ID, and aligns with the updated buildTransaction signature in SingleTxPool.

Also applies to: 227-227


251-254: LGTM: Single transaction submission updated consistently

The single transaction path mirrors the batch processing changes—fetching only the COA account and using the tracked reference block. This ensures consistent behavior across both submission paths.

Also applies to: 263-263

services/requester/single_tx_pool.go (4)

24-24: LGTM: Reference block tracking setup

The 15-second update frequency provides a good balance—frequent enough to stay well within the 600-second transaction expiry window while avoiding excessive Access Node load. The atomic storage with clear documentation properly addresses the race condition between the background updater and concurrent transaction submissions.

Also applies to: 37-39


45-76: LGTM: Constructor initialization sequence

The initialization properly sequences the critical steps: fetch the initial finalized block header, store it atomically, then start the background updater. This ensures getReferenceBlock() always returns a valid header after successful construction, and the error return allows bootstrap to abort if the initial fetch fails.


112-124: LGTM: Eliminated per-submission block fetch

This change directly implements the PR objective by replacing the per-transaction latest-block query with a reference to the continuously tracked block header. This reduces Access Node calls from two per submission to one (COA account only), with the account fetch planned for removal in future work.


175-217: LGTM: Transaction building updated correctly

The buildTransaction method now accepts *flow.BlockHeader instead of *flow.Block, correctly using the header's ID for the reference block and its height for lock metadata. The simpler type requirement aligns well with the tracked reference block approach.

@m-Peter m-Peter force-pushed the mpeter/track-latest-finalized-block-header branch from 8081978 to 1143ff4 on November 10, 2025 12:15
@m-Peter m-Peter merged commit 404c556 into main Nov 12, 2025
2 checks passed
@m-Peter m-Peter deleted the mpeter/track-latest-finalized-block-header branch November 12, 2025 10:19
Development

Successfully merging this pull request may close these issues:

  • Avoid querying AN for latest block on each tx submission