Skip to content

chain validation and fix command #198

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 22, 2025

Conversation

iuwqyir
Copy link
Collaborator

@iuwqyir iuwqyir commented May 16, 2025

TL;DR

Added a new validate and validateAndFix commands to the indexer to verifiy and fix data integrity issues in the blockchain data.

What changed?

  • Added a new validate command to the CLI that performs data validation
  • Added a new validateAndFix command to the CLI that performs data validation and repair
  • Implemented three validation functions:
    • FindAndRemoveDuplicates: Identifies and removes duplicate blocks, transactions, and logs
    • FindAndFixGaps: Detects and fills in missing blocks in a specified range
    • ValidateAndFixBlocks: Verifies block integrity by checking transaction counts, logs bloom filters, and transaction roots
  • Updated trace-related data structures to use proper types (int64 for trace addresses, uint64 for gas values)
  • Added support for filtering by block numbers when querying aggregates

How to test?

Run the validate command with optional batch size and fix batch size parameters:

# Validate with default batch size (1000)
./indexer validateAndFix

# Validate with custom batch size
./indexer validateAndFix 500

# Validate with custom batch size and fix batch size
./indexer validateAndFix 500 100

Why make this change?

This validation tool helps maintain data integrity in the blockchain indexer by:

  1. Detecting and removing duplicate entries that could cause inconsistencies
  2. Identifying and filling gaps in block sequences
  3. Verifying the correctness of stored block data by recalculating and comparing cryptographic roots
  4. Providing a mechanism to automatically fix corrupted or incorrect data

These validation capabilities are essential for ensuring the reliability and accuracy of the indexed blockchain data.

Summary by CodeRabbit

  • New Features

    • Introduced CLI commands to validate and repair blockchain data with batch processing.
    • Added detection and removal of duplicate blocks, transactions, and logs in block ranges.
    • Implemented validation tools to identify missing blocks, fix gaps, and correct invalid entries.
    • Added blockchain data root and bloom calculation utilities.
    • Introduced progress tracking cursor for validation workflows.
    • Added blockchain data retrieval methods optimized for validation.
    • Enhanced logging for contract ABI parsing errors.
  • Improvements

    • Updated dependencies for enhanced compatibility and performance.
    • Standardized numeric types for blockchain trace and gas fields.
    • Improved data serialization consistency for trace addresses and gas values.
    • Enhanced database queries for consistent reads and filtering by block numbers.
  • Bug Fixes

    • Improved RPC client initialization by handling chain ID setup errors properly.
  • Tests

    • Added mock methods to support testing of validation and data retrieval features.

Copy link
Collaborator Author

iuwqyir commented May 16, 2025

This stack of pull requests is managed by Graphite. Learn more about stacking.

@iuwqyir iuwqyir requested review from AmineAfia and catalyst17 May 16, 2025 07:28
@iuwqyir iuwqyir marked this pull request as ready for review May 16, 2025 07:28
@iuwqyir iuwqyir force-pushed the 05-16-chain_validation_and_fix_command branch 3 times, most recently from d0eb75b to 7ea3dda Compare May 16, 2025 11:32
@iuwqyir iuwqyir force-pushed the 05-16-chain_validation_and_fix_command branch from 7ea3dda to bebe17e Compare May 16, 2025 13:39
Copy link

socket-security bot commented May 16, 2025

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff Package Supply Chain
Security
Vulnerability Quality Maintenance License
Updatedgithub.com/​ethereum/​go-ethereum@​v1.14.8 ⏵ v1.15.1175100 +210010060
Updatedgithub.com/​holiman/​uint256@​v1.3.1 ⏵ v1.3.2100100100100100

View full report

@iuwqyir iuwqyir force-pushed the 05-16-chain_validation_and_fix_command branch 5 times, most recently from 8769ad1 to 0ffae31 Compare May 22, 2025 17:21
Copy link

coderabbitai bot commented May 22, 2025

Walkthrough

This update introduces a comprehensive blockchain data validation and repair framework. It adds new CLI commands for validating and fixing blockchain data, implements detection and removal of duplicate records, and provides mechanisms for identifying and correcting missing or invalid blocks. The changes span command definitions, storage interfaces, validation logic, and supporting utilities, with significant additions to the validation and orchestrator modules.

Changes

File(s) Change Summary
cmd/root.go Added two new subcommands validate and validateAndFix to the root command.
cmd/validate.go Added validate CLI command to validate blockchain data integrity over a block range with RPC and storage initialization.
cmd/validate_and_fix.go Added validateAndFix CLI command to validate and repair blockchain data in batches with cursor tracking and ClickHouse connection.
internal/orchestrator/validator.go Added Validator struct with methods for block range validation, fixing invalid blocks, gap detection, and block-level validation logic.
internal/validation/root_calculator.go Added functions to calculate transactions root and logs bloom for validation, supporting multiple Ethereum transaction types.
internal/validation/duplicates.go Implemented detection and removal of duplicate blocks, transactions, and logs in ClickHouse within a block range.
internal/validation/cursor.go Added Cursor type to track validation progress with initialization, update, and persistence methods.
internal/storage/connector.go Extended IMainStorage interface with GetValidationBlockData and FindMissingBlockNumbers methods.
internal/storage/clickhouse.go Implemented GetValidationBlockData and FindMissingBlockNumbers methods; improved query logic for aggregation and staging data; removed .Uint64() calls on gas fields.
internal/common/trace.go Changed TraceAddress from []uint64 to []int64; changed Gas and GasUsed from *big.Int to uint64; updated serialization accordingly.
internal/rpc/serializer.go Updated serialization of gas fields to uint64 and trace addresses to []int64.
internal/rpc/rpc.go Added error handling for setChainID call during RPC client initialization.
internal/common/abi.go Added warning log on ABI parsing failure.
go.mod Updated Go version to 1.23.0; upgraded multiple dependencies including Ethereum and cryptographic libraries; added new dependencies.
test/mocks/MockIMainStorage.go Added mock implementations for new storage interface methods FindMissingBlockNumbers and GetValidationBlockData.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CLI
    participant RPCClient
    participant Storage
    participant ClickHouse
    participant Validator
    participant Cursor

    User->>CLI: Run validateAndFix command
    CLI->>RPCClient: Initialize with URL
    CLI->>Storage: Initialize connector
    CLI->>ClickHouse: Open DB connection
    CLI->>Validator: NewValidator(RPCClient, Storage)
    CLI->>Cursor: InitCursor(chainId, Storage)
    loop For each batch in block range
        CLI->>Validator: validateAndFixRange(start, end, fixBatchSize)
        Validator->>ClickHouse: FindAndRemoveDuplicates(chainId, start, end)
        Validator->>Validator: FindAndFixGaps(start, end)
        Validator->>Validator: ValidateBlockRange(start, end)
        alt Invalid blocks found
            Validator->>Validator: FixBlocks(invalidBlocks, fixBatchSize)
        end
        CLI->>Cursor: Update progress
    end
Loading
sequenceDiagram
    participant User
    participant CLI
    participant RPCClient
    participant Storage
    participant Validator

    User->>CLI: Run validate command
    CLI->>RPCClient: Initialize with URL
    CLI->>Storage: Initialize connector
    CLI->>Validator: NewValidator(RPCClient, Storage)
    CLI->>Validator: ValidateBlockRange(start, end)
    Validator->>Storage: GetValidationBlockData(chainId, start, end)
    Validator->>Validator: ValidateBlocks(blocks)
    Validator->>CLI: Output validation results
Loading

Estimated code review effort

4 (~90 minutes)

✨ Finishing Touches
  • 📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🔭 Outside diff range comments (1)
internal/common/trace.go (1)

60-78: ⚠️ Potential issue

BlockTimestamp is never serialized

TraceModel includes a BlockTimestamp field (uint64) but Serialize() omits it. Downstream consumers will receive zero values.

-		BlockHash:        t.BlockHash,
+		BlockHash:        t.BlockHash,
+		BlockTimestamp:   uint64(t.BlockTimestamp.Unix()),

Please verify the desired unit (seconds vs milliseconds) and update tests accordingly.

♻️ Duplicate comments (2)
internal/validation/cursor.go (1)

37-50: Consider making the cursor file location configurable.

The current implementation writes cursor files to the working directory, which might be problematic in containerized environments with restricted filesystem access.

+// CursorFilePath is the directory where cursor files are stored
+var CursorFilePath string = "."
+
 func (c *Cursor) Update(blockNumber *big.Int) error {
-	cursorFile := fmt.Sprintf("validation_cursor_%s.json", c.ChainId.String())
+	cursorFile := fmt.Sprintf("%s/validation_cursor_%s.json", CursorFilePath, c.ChainId.String())

The same change would need to be applied to the getLastScannedBlock function.

internal/validation/validation.go (1)

163-166: isContractCreation misses the “empty string” case

A contract-creation tx coming from go-ethereum has To == nil, which your DB
may serialise as an empty string. The current check only treats the all-zero
address as creation:

isContractCreation := tx.ToAddress == "0x0000…0000" && tx.Data != "0x"

Consider:

isContractCreation := (tx.ToAddress == "" ||
-	 tx.ToAddress == "0x0000000000000000000000000000000000000000") && tx.Data != "0x"
+	 tx.ToAddress == "0x0000000000000000000000000000000000000000") &&
+	 len(tx.Data) > 2 // guards against meaningless “0x”
🧹 Nitpick comments (5)
cmd/validate.go (2)

78-86: Harden TLS configuration

The explicit tls.Config{} inherits Go’s default (TLS 1.2 for clients). For a CLI tool that only talks to a modern ClickHouse server you can safely raise the floor:

-		TLS:      &tls.Config{},
+		TLS: &tls.Config{
+			MinVersion: tls.VersionTLS13,
+		},

Not critical but improves security posture.


50-52: Off-by-one adjustment is opaque

batchSize = batchSize - 1 relies on readers knowing the inclusive semantics later. Consider renaming the variable or adding a short comment to avoid future errors.

// range end is inclusive, adjust once so we fetch exactly N blocks
batchSizeInclusive := new(big.Int).Sub(batchSize, big.NewInt(1))
internal/validation/validation.go (1)

46-54: *Misleading debug message when printing big.Int values

log.Debug().Msgf("Processing batch of blocks %d to %d", …) prints the decimal
representation of the pointers, not the block numbers, because %d formats
a numeric value, not an object implementing Stringer.
Prefer %s (or %v) to leverage (*big.Int).String():

- log.Debug().Msgf("Processing batch of blocks %d to %d", invalidBlocks[i], invalidBlocks[end-1])
+ log.Debug().Msgf("Processing batch of blocks %s to %s",
+ 	invalidBlocks[i].String(), invalidBlocks[end-1].String())
internal/validation/db.go (1)

120-124: Potential goroutine leak if one query errors early

Each goroutine writes exactly once to its channel, but if an early return
occurs before the send, the receiver blocks forever. Although you currently
return only after sending, future edits could break the guarantee.

A safer pattern is to close the channel once the value is sent or use a
sync.WaitGroup.
Example with a WaitGroup:

var (
   blocksResult blockResult
   …
)
wg.Add(3)
go func() { defer wg.Done(); blocksResult.blocks, blocksResult.err = … }()
…
wg.Wait()
internal/validation/duplicates.go (1)

161-199: removeDuplicateTransactions may exceed ClickHouse parameter limits

The batched deletion is great, but each loop aggregates all hashes in one
IN clause
. If a batch nears 1000 items, ClickHouse can hit the default
max_query_size. Consider:

  • Lowering batchSize (e.g., 100).
  • Or chunking hashes too: execute the same query multiple times with
    smaller slices.

This keeps memory and network utilisation predictable.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f520cfc and 0ffae31.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (11)
  • cmd/root.go (1 hunks)
  • cmd/validate.go (1 hunks)
  • go.mod (5 hunks)
  • internal/common/trace.go (3 hunks)
  • internal/rpc/serializer.go (2 hunks)
  • internal/storage/clickhouse.go (4 hunks)
  • internal/validation/cursor.go (1 hunks)
  • internal/validation/db.go (1 hunks)
  • internal/validation/duplicates.go (1 hunks)
  • internal/validation/gaps.go (1 hunks)
  • internal/validation/validation.go (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
internal/validation/cursor.go (1)
internal/storage/connector.go (1)
  • IStorage (67-71)
🪛 ast-grep (0.38.1)
cmd/validate.go

[warning] 71-71: MinVersionis missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported an insecure browsers (such as Internet Explorer 10), it may be necessary to enable TLS 1.0 to provide support. AddMinVersion: tls.VersionTLS13' to the TLS configuration to bump the minimum version to TLS 1.3.
Context: tls.Config{}
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures

(missing-ssl-minversion-go)

🪛 golangci-lint (1.64.8)
cmd/validate.go

15-15: could not import github.com/thirdweb-dev/indexer/internal/validation (-: # github.com/thirdweb-dev/indexer/internal/validation
internal/validation/gaps.go:27:14: not enough arguments in call to poller.Poll
have ([]*big.Int)
want (context.Context, []*big.Int)
internal/validation/validation.go:56:15: not enough arguments in call to poller.Poll
have ([]*big.Int)
want (context.Context, []*big.Int))

(typecheck)

internal/validation/cursor.go

1-1: : # github.com/thirdweb-dev/indexer/internal/validation
internal/validation/gaps.go:27:14: not enough arguments in call to poller.Poll
have ([]*big.Int)
want (context.Context, []*big.Int)
internal/validation/validation.go:56:15: not enough arguments in call to poller.Poll
have ([]*big.Int)
want (context.Context, []*big.Int)

(typecheck)

🪛 GitHub Check: Test
internal/validation/validation.go

[failure] 56-56:
not enough arguments in call to poller.Poll

internal/validation/gaps.go

[failure] 27-27:
not enough arguments in call to poller.Poll

🪛 GitHub Actions: Go Unit Tests
internal/validation/gaps.go

[error] 27-27: not enough arguments in call to poller.Poll

🔇 Additional comments (12)
internal/validation/cursor.go (3)

12-16: Well-structured cursor type for tracking validation progress.

The Cursor struct provides a good foundation for tracking blockchain validation progress with appropriate fields for maintaining state across validation runs.


18-35: LGTM - Good error handling and initialization logic.

The InitCursor function properly initializes a cursor by retrieving the last scanned block and ensuring it doesn't exceed the maximum block number. The null check on line 24 for maxBlockNumber provides good defensive programming.


52-72: Good error handling for file operations.

The implementation properly handles cases where the cursor file doesn't exist or contains invalid data by defaulting to block 0, allowing the validation to start from the beginning.

go.mod (3)

24-117: Numerous indirect dependency updates.

The PR includes many indirect dependency updates, including new CockroachDB packages, blockchain cryptography libraries, and Go standard library updates. These support the new validation functionality while keeping dependencies current.


7-10:

❓ Verification inconclusive

Significant dependency update for core Ethereum library.

The go-ethereum library has been updated from v1.14.8 to v1.15.11, which is a substantial version jump. Additionally, holiman/uint256 has been moved from an indirect to a direct dependency, which aligns with the type changes in the serializer.


🏁 Script executed:

#!/bin/bash
# Check for breaking changes in the updated go-ethereum version
rg -A 5 -B 5 "github.com/ethereum/go-ethereum" --type go

Length of output: 7911


Review go-ethereum v1.15.11 & holiman/uint256 Upgrade

The bump from v1.14.8→v1.15.11 spans multiple core packages and could introduce breaking changes to RPC clients, common types, trie handling, etc. We also promoted holiman/uint256 to a direct dependency to match updated serializers.

Please verify:

  • All imports of github.com/ethereum/go-ethereum (shown below) still compile and behave as expected.
  • No changed function signatures or removed APIs in v1.15.x affect your code paths.
  • The bump in holiman/uint256 aligns with any type conversions in serializers.

Affected files (imports of go-ethereum packages):

  • internal/validation/validation.go
  • internal/rpc/rpc.go
  • internal/rpc/params.go
  • internal/rpc/batcher.go
  • internal/storage/clickhouse.go
  • internal/middleware/authorization.go
  • internal/common/transaction.go, abi.go, log.go
  • internal/handlers/transactions_handlers.go, logs_handlers.go
  • internal/common/log_test.go, transaction_test.go

Actions:

  1. Review the go-ethereum v1.15.11 release notes for breaking changes.
  2. Run the full test suite (unit + integration) against a node with v1.15.x.
  3. Ensure all ABI/account/rpc calls still serialize/deserialize correctly.

3-3:

❌ Incorrect review comment

Verify Go version compatibility.

The Go version has been updated to 1.23.0, which appears to be a future version since the current stable version at the time of this review is 1.22.x.


🌐 Web query:

What is the latest stable version of Go available in May 2025?

💡 Result:

As of May 22, 2025, the latest stable version of Go is 1.24.3, released on April 30, 2025. (go.dev) This release includes security fixes to the os package, as well as bug fixes to the runtime, compiler, linker, go command, and the crypto/tls and os packages. (golang.es)

Citations:


Incorrect Go version assumption in go.mod

The requirement go 1.23.0 is not a future release. As of May 22, 2025 the latest stable Go is 1.24.3 (released April 30, 2025), and Go 1.23.0 was already available prior to that. Your go.mod entry is valid.

If you need fixes or features from the newest release, you may optionally bump to:

- go 1.23.0
+ go 1.24

• File: go.mod (line 3)

Likely an incorrect or invalid review comment.

cmd/root.go (1)

201-201: Good integration of the new validation command.

The new validateCmd is properly added to the root command, following the same pattern as existing commands.

internal/rpc/serializer.go (2)

382-383: Type optimization for gas values.

Changing from big.Int to uint64 for gas values is a good optimization as Ethereum gas values typically fit comfortably within uint64 range, reducing memory usage and improving performance.


402-411:

✅ Verification successful

Type change for trace addresses.

The type change from uint64 to int64 for trace addresses indicates a design decision to allow for potential negative indices or to align with other systems' expectations.


🏁 Script executed:

#!/bin/bash
# Check if there are corresponding changes in other files related to trace address types
rg -A 3 -B 3 "TraceAddress.*\[\]int64" --type go

Length of output: 1593


serializeTraceAddress type change is consistent

serializeTraceAddress now returns []int64, which matches the TraceAddress field defined as []int64 in internal/common/trace.go. No further updates are required.

  • internal/common/trace.go: TraceAddress []int64 confirmed.
internal/storage/clickhouse.go (2)

375-377: Gas fields handled correctly after type change – good job

trace.Gas and trace.GasUsed are now inserted directly as uint64, matching the new struct definition and saving an unnecessary allocation / dereference.
No further action needed.


936-938: FINAL modifier ensures consistent reads

Adding FINAL to the staging-data query is a good call – it prevents race conditions with background merges in ClickHouse.
👍

internal/common/trace.go (1)

16-24: Type migration looks consistent

Switching TraceAddress to []int64 and gas fields to uint64 simplifies marshaling and DB inserts. Just ensure callers no longer rely on signedness of TraceAddress (previously uint64).

@iuwqyir iuwqyir force-pushed the 05-16-chain_validation_and_fix_command branch from 0ffae31 to 31e733b Compare May 22, 2025 17:27
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

♻️ Duplicate comments (3)
internal/validation/validation.go (2)

286-295: ⚠️ Potential issue

*Handle nil big.Int inputs to uint256.FromBig to prevent panics

The code doesn't check for nil values before converting big.Int to uint256.Int, which could cause panics for older blocks or special transactions where some fields may be nil.

Implement a helper function at the beginning of the file:

// safeUint256 converts a *big.Int to *uint256.Int safely, handling nil values
func safeUint256(b *big.Int) *uint256.Int {
    if b == nil {
        return new(uint256.Int)
    }
    out, _ := uint256.FromBig(b)
    return out
}

Then replace all the conversions:

-chainID, _ := uint256.FromBig(tx.ChainId)
-gasTipCap, _ := uint256.FromBig(tx.MaxPriorityFeePerGas)
-gasFeeCap, _ := uint256.FromBig(tx.MaxFeePerGas)
-value, _ := uint256.FromBig(tx.Value)
-blobFeeCap, _ := uint256.FromBig(tx.MaxFeePerBlobGas)
-v, _ := uint256.FromBig(tx.V)
-r, _ := uint256.FromBig(tx.R)
-s, _ := uint256.FromBig(tx.S)
+chainID := safeUint256(tx.ChainId)
+gasTipCap := safeUint256(tx.MaxPriorityFeePerGas)
+gasFeeCap := safeUint256(tx.MaxFeePerGas)
+value := safeUint256(tx.Value)
+blobFeeCap := safeUint256(tx.MaxFeePerBlobGas)
+v := safeUint256(tx.V)
+r := safeUint256(tx.R)
+s := safeUint256(tx.S)

Also apply the same pattern to lines 332-338.


56-58: ⚠️ Potential issue

Check error return from poller.Poll

The Poll method likely returns an error that is not being checked. This could silently ignore failures in the polling process.

poller := orchestrator.NewBoundlessPoller(rpcClient, s)
-poller.Poll(context.Background(), batch)
+if err := poller.Poll(context.Background(), batch); err != nil {
+    return fmt.Errorf("failed to poll blocks %v: %w", batch, err)
+}
log.Debug().Msgf("Batch of invalid blocks polled: %d to %d", invalidBlocks[i], invalidBlocks[end-1])
internal/validation/duplicates.go (1)

213-229: 🛠️ Refactor suggestion

String-built IN clause risks gigantic queries & SQL-injection

Building the IN clause directly with fmt.Sprintf could lead to oversized queries if there are many tuples, and potentially allows SQL injection vulnerabilities even though the values come from the database.

Consider one of these approaches:

  1. Use ClickHouse array parameters instead:
-tuples := make([]string, 0, len(batch))
+txHashes := make([]string, 0, len(batch))
+logIndices := make([]uint64, 0, len(batch))

for _, duplicateLog := range batch {
  blockNumbers = append(blockNumbers, duplicateLog.BlockNumber)
-  tuples = append(tuples, fmt.Sprintf("('%s', %d)", duplicateLog.TxHash, duplicateLog.LogIndex))
+  txHashes = append(txHashes, duplicateLog.TxHash)
+  logIndices = append(logIndices, duplicateLog.LogIndex)
}

-query := fmt.Sprintf(`WITH
+query := `WITH
  to_be_inserted AS (
    SELECT chain_id, block_number, block_hash, block_timestamp, transaction_hash, transaction_index, log_index, address,
      data, topic_0, topic_1, topic_2, topic_3, insert_timestamp, -sign as sign
    FROM default.logs FINAL
-    WHERE chain_id = ? AND block_number IN (?) AND (transaction_hash, log_index) IN (%s)
+    WHERE chain_id = ? AND block_number IN (?) AND (transaction_hash, log_index) IN ZIP(?, ?)
  )
  INSERT INTO logs (
    chain_id, block_number, block_hash, block_timestamp, transaction_hash, transaction_index, log_index, address,
    data, topic_0, topic_1, topic_2, topic_3, insert_timestamp, sign
  ) SELECT * from to_be_inserted
-`, strings.Join(tuples, ","))
+`

-err := conn.Exec(context.Background(), query, chainId, blockNumbers)
+err := conn.Exec(context.Background(), query, chainId, blockNumbers, txHashes, logIndices)
🧹 Nitpick comments (3)
cmd/validate.go (1)

19-26: Update the TBD command documentation

The command's Short and Long descriptions are set to "TBD" which should be updated with meaningful descriptions to help users understand the purpose and functionality of the validate command.

var (
	validateCmd = &cobra.Command{
-		Use:   "validate",
-		Short: "TBD",
-		Long:  "TBD",
+		Use:   "validate [batchSize] [fixBatchSize]",
+		Short: "Validates and fixes blockchain data integrity",
+		Long:  "Validates blockchain data integrity by detecting and removing duplicates, filling gaps, and verifying block data. Takes optional batch size (default: 1000) and fix batch size (default: 0) arguments.",
		Run: func(cmd *cobra.Command, args []string) {
			RunValidate(cmd, args)
		},
	}
)
internal/validation/validation.go (2)

164-166: Better contract creation detection

The contract creation detection logic relies on checking for the zero address string and non-empty data, which could be error-prone. Consider using a more robust approach.

-isContractCreation := tx.ToAddress == "0x0000000000000000000000000000000000000000" && tx.Data != "0x"
+// Contract creation transactions have empty "to" field, but in the database they might be
+// stored with the zero address. Check for both empty address and zero address.
+isContractCreation := (tx.ToAddress == "" || tx.ToAddress == "0x0000000000000000000000000000000000000000") && tx.Data != "0x"

98-104: Add comments explaining transaction type checking

The code skips transaction root validation for transaction types > 4, but doesn't explain why. Add comments to clarify which transaction types are supported and why others are skipped.

// TODO: remove this once we know how to validate all tx types
for _, tx := range blockData.Transactions {
-    if tx.TransactionType > 4 { // Currently supported types are 0-4
+    // Currently supported transaction types:
+    // 0: Legacy
+    // 1: Access List (EIP-2930)
+    // 2: Dynamic Fee (EIP-1559)
+    // 3: Blob (EIP-4844)
+    // 4: Set Code (EIP-7702)
+    // Skip validation for newer, unsupported transaction types
+    if tx.TransactionType > 4 {
        log.Warn().Msgf("Skipping transaction root validation for block %s due to unsupported transaction type %d", blockData.Block.Number, tx.TransactionType)
        return nil
    }
}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0ffae31 and 31e733b.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (11)
  • cmd/root.go (1 hunks)
  • cmd/validate.go (1 hunks)
  • go.mod (5 hunks)
  • internal/common/trace.go (3 hunks)
  • internal/rpc/serializer.go (2 hunks)
  • internal/storage/clickhouse.go (4 hunks)
  • internal/validation/cursor.go (1 hunks)
  • internal/validation/db.go (1 hunks)
  • internal/validation/duplicates.go (1 hunks)
  • internal/validation/gaps.go (1 hunks)
  • internal/validation/validation.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (8)
  • cmd/root.go
  • internal/storage/clickhouse.go
  • internal/rpc/serializer.go
  • internal/common/trace.go
  • internal/validation/cursor.go
  • internal/validation/gaps.go
  • go.mod
  • internal/validation/db.go
🧰 Additional context used
🪛 ast-grep (0.38.1)
cmd/validate.go

[warning] 71-71: MinVersionis missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported an insecure browsers (such as Internet Explorer 10), it may be necessary to enable TLS 1.0 to provide support. AddMinVersion: tls.VersionTLS13' to the TLS configuration to bump the minimum version to TLS 1.3.
Context: tls.Config{}
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures

(missing-ssl-minversion-go)

🪛 golangci-lint (1.64.8)
internal/validation/duplicates.go

33-33: Error return value is not checked

(errcheck)


44-44: Error return value is not checked

(errcheck)


55-55: Error return value is not checked

(errcheck)

@iuwqyir iuwqyir force-pushed the 05-16-chain_validation_and_fix_command branch from 31e733b to 0433596 Compare June 2, 2025 10:54
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (8)
cmd/validate.go (2)

69-87: Set TLS MinVersion for better security

The TLS configuration is missing a MinVersion setting, which could allow connections with insecure TLS protocol versions. This was previously identified and should be addressed by adding MinVersion: tls.VersionTLS13 to enhance connection security.

🧰 Tools
🪛 ast-grep (0.38.1)

[warning] 71-71: MinVersionis missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported an insecure browsers (such as Internet Explorer 10), it may be necessary to enable TLS 1.0 to provide support. AddMinVersion: tls.VersionTLS13' to the TLS configuration to bump the minimum version to TLS 1.3.
Context: tls.Config{}
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures

(missing-ssl-minversion-go)


115-133: Improve error handling in validateAndFixRange

The function uses log.Fatal() which terminates the program immediately without allowing cleanup or graceful shutdown. As previously suggested, consider returning errors instead and handling them at a higher level to allow for proper cleanup and graceful error handling.

internal/validation/validation.go (2)

287-294: *Handle nil big.Int inputs to uint256.FromBig to prevent panics

The fields tx.MaxPriorityFeePerGas, tx.MaxFeePerBlobGas, tx.Value, tx.R, tx.S, etc. may be nil for certain transaction types. Calling uint256.FromBig(nil) will panic. A helper function safeUint256 should be introduced to handle nil inputs safely as previously suggested.


332-338: *Handle nil big.Int inputs to uint256.FromBig to prevent panics

Same issue as lines 287-294 - the conversion to uint256 needs null-safety to prevent runtime panics when big.Int fields are nil.

internal/validation/duplicates.go (4)

32-34: Error from removeDuplicateBlocks is silently ignored

The call to removeDuplicateBlocks does not check the returned error.
If the deduplication query fails you will continue as if the data were clean, potentially cascading bad state into gap-filling and validation steps.

-        removeDuplicateBlocks(conn, chainId, duplicateBlockNumbers)
+        if err := removeDuplicateBlocks(conn, chainId, duplicateBlockNumbers); err != nil {
+            return fmt.Errorf("failed to remove duplicate blocks: %w", err)
+        }
🧰 Tools
🪛 golangci-lint (1.64.8)

33-33: Error return value is not checked

(errcheck)


43-45: Error from removeDuplicateTransactions is silently ignored

Same issue as above for transaction duplicates. Propagate or at least log the failure so that the caller can abort or retry.

-        removeDuplicateTransactions(conn, chainId, duplicateTransactions)
+        if err := removeDuplicateTransactions(conn, chainId, duplicateTransactions); err != nil {
+            return fmt.Errorf("failed to remove duplicate transactions: %w", err)
+        }
🧰 Tools
🪛 golangci-lint (1.64.8)

44-44: Error return value is not checked

(errcheck)


54-56: Error from removeDuplicateLogs is silently ignored

Please handle the error to avoid continuing with an inconsistent database state.

-        removeDuplicateLogs(conn, chainId, duplicateLogs)
+        if err := removeDuplicateLogs(conn, chainId, duplicateLogs); err != nil {
+            return fmt.Errorf("failed to remove duplicate logs: %w", err)
+        }
🧰 Tools
🪛 golangci-lint (1.64.8)

55-55: Error return value is not checked

(errcheck)


218-230: String-built tuple list risks gigantic queries & SQL-injection

fmt.Sprintf concatenates every (txHash, logIndex) tuple straight into the SQL.
Drawbacks:

  1. Large duplicate batches can blow ClickHouse’s max query length.
  2. Although data originates from the DB, handcrafted SQL bypasses driver-level sanitisation.

Prefer parameterised arrays or smaller, parameterised batches, e.g.:

- WHERE chain_id = ? AND block_number IN (?) AND (transaction_hash, log_index) IN (%s)
+ WHERE chain_id = ? AND block_number IN (?) AND (transaction_hash, log_index) IN (?)

Then pass []tuple(String, UInt64) as the third argument, or fall back to the existing 1 000-row batching but build the tuples slice as a proper driver parameter instead of raw SQL.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 31e733b and 0433596.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (12)
  • cmd/root.go (1 hunks)
  • cmd/validate.go (1 hunks)
  • go.mod (5 hunks)
  • internal/common/trace.go (3 hunks)
  • internal/rpc/rpc.go (1 hunks)
  • internal/rpc/serializer.go (2 hunks)
  • internal/storage/clickhouse.go (4 hunks)
  • internal/validation/cursor.go (1 hunks)
  • internal/validation/db.go (1 hunks)
  • internal/validation/duplicates.go (1 hunks)
  • internal/validation/gaps.go (1 hunks)
  • internal/validation/validation.go (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • cmd/root.go
🚧 Files skipped from review as they are similar to previous changes (6)
  • internal/validation/gaps.go
  • internal/common/trace.go
  • internal/validation/cursor.go
  • internal/rpc/serializer.go
  • internal/storage/clickhouse.go
  • go.mod
🧰 Additional context used
🧬 Code Graph Analysis (1)
internal/validation/db.go (1)
internal/storage/clickhouse.go (1)
  • ZERO_BYTES_42 (35-35)
🪛 ast-grep (0.38.1)
cmd/validate.go

[warning] 71-71: MinVersionis missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported an insecure browsers (such as Internet Explorer 10), it may be necessary to enable TLS 1.0 to provide support. AddMinVersion: tls.VersionTLS13' to the TLS configuration to bump the minimum version to TLS 1.3.
Context: tls.Config{}
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures

(missing-ssl-minversion-go)

🪛 golangci-lint (1.64.8)
internal/validation/duplicates.go

33-33: Error return value is not checked

(errcheck)


44-44: Error return value is not checked

(errcheck)


55-55: Error return value is not checked

(errcheck)

🔇 Additional comments (8)
internal/rpc/rpc.go (1)

111-114: LGTM! Improves consistency between initialization functions.

The addition of chain ID initialization to InitializeSimpleRPCWithUrl ensures consistency with the main Initialize() function and provides the necessary chain ID context for validation workflows.

cmd/validate.go (1)

29-51: LGTM! Well-structured argument parsing and validation.

The command setup with batch size configuration and argument validation is well implemented. The inclusive range adjustment (line 51) is a nice touch for user-friendly batch processing.

internal/validation/validation.go (3)

164-165: LGTM! Improved contract creation handling.

The contract creation detection using isContractCreation := tx.ToAddress == "" and subsequent conditional address handling properly addresses the previous feedback about nil To addresses in contract creation transactions.


118-151: LGTM! Robust logs bloom calculation.

The logs bloom calculation correctly converts internal log structures to go-ethereum types and handles optional topics properly. The use of go-ethereum's CreateBloom ensures correctness.


56-57:

✅ Verification successful

Verify poller.Poll method signature and error handling

The call to poller.Poll(context.Background(), batch) needs verification to ensure it matches the expected signature and properly handles returned errors.


🏁 Script executed:

#!/bin/bash
# Check the Poll method signature in the orchestrator package
ast-grep --pattern 'func $_($$$) Poll($$$) $$$'

Length of output: 49


🏁 Script executed:

#!/bin/bash
# Locate any Go files in the orchestrator package
pkg_files=$(rg -l "^package orchestrator" -n)

# Print out where NewBoundlessPoller is defined
echo "🔍 Searching for NewBoundlessPoller definitions..."
rg -n "func NewBoundlessPoller" -n $pkg_files

# Search for the BoundlessPoller type
echo "🔍 Locating type BoundlessPoller..."
file=$(rg -l "^type BoundlessPoller" -n $pkg_files)
echo "Found BoundlessPoller in: $file"

# Inspect the Poll method signature on BoundlessPoller
echo "🔍 Inspecting Poll method signature..."
rg -n "^func .*BoundlessPoller\\) Poll" -n "$file"

Length of output: 1605


🏁 Script executed:

#!/bin/bash
# Locate the Poller type definition
echo "🔍 Searching for the Poller type in poller.go..."
rg -n "^type Poller" internal/orchestrator/poller.go

# Inspect the Poll method signature on Poller
echo "🔍 Inspecting the Poll method signature..."
rg -n "^func .*Poll" internal/orchestrator/poller.go

Length of output: 1141


Signature verified: Poll returns only a block number, no error to handle

  • internal/orchestrator/poller.go:149 defines
    func (p *Poller) Poll(ctx context.Context, blockNumbers []*big.Int) (lastPolledBlock *big.Int)
  • There is no error return, so no error handling is required here.
  • If you need the returned lastPolledBlock, assign it; otherwise discarding the value is acceptable.
internal/validation/db.go (3)

64-153: LGTM! Excellent concurrent data fetching design.

The concurrent fetching of blocks, logs, and transactions using goroutines and channels is well-implemented. The error handling, result organization by block number, and final assembly into BlockData structs demonstrates good engineering practices.


257-259: LGTM! Proper handling of contract creation addresses.

The special case handling for ZERO_BYTES_42 correctly converts the zero-byte representation to an empty string for contract creation transactions, which aligns with the constant defined in internal/storage/clickhouse.go.


156-165: LGTM! Optimized ClickHouse queries.

The use of FINAL clause in the SELECT queries ensures data consistency when reading from ClickHouse tables, which is appropriate for validation operations that need the most recent, deduplicated data.

@iuwqyir iuwqyir force-pushed the 05-16-chain_validation_and_fix_command branch from 0433596 to 4b567ef Compare June 3, 2025 16:51
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (6)
cmd/validate.go (2)

69-87: TLS configuration missing MinVersion setting.

This security issue has already been identified in previous reviews. The TLS configuration should specify a minimum version for enhanced security.

🧰 Tools
🪛 ast-grep (0.38.1)

[warning] 71-71: MinVersionis missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported an insecure browsers (such as Internet Explorer 10), it may be necessary to enable TLS 1.0 to provide support. AddMinVersion: tls.VersionTLS13' to the TLS configuration to bump the minimum version to TLS 1.3.
Context: tls.Config{}
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures

(missing-ssl-minversion-go)


115-134: Error handling uses log.Fatal() preventing graceful shutdown.

This error handling issue has already been identified in previous reviews. The function should return errors instead of calling log.Fatal() to allow proper cleanup and graceful error handling.

internal/validation/validation.go (1)

286-296: *Handle nil big.Int inputs to prevent panics.

This nil pointer risk with uint256.FromBig() has already been identified in previous reviews. Fields like MaxPriorityFeePerGas, MaxFeePerBlobGas, Value, R, S can be nil and need safe handling to prevent runtime panics.

Also applies to: 332-340

internal/validation/duplicates.go (3)

33-34: Error return values not handled.

The calls to removeDuplicateBlocks, removeDuplicateTransactions, and removeDuplicateLogs return errors that are not being checked, as already identified in previous reviews. These should be handled to prevent silent failures.

Also applies to: 44-45, 55-56

🧰 Tools
🪛 golangci-lint (1.64.8)

33-33: Error return value is not checked

(errcheck)


62-67: HAVING clause may miss certain duplicate scenarios.

The HAVING sum(sign) != 1 condition has already been identified as potentially problematic in previous reviews, as it could miss cases where duplicates exist but sum to 1.


218-232: SQL injection risk with string interpolation.

The use of fmt.Sprintf to build the IN clause has already been identified as a security risk in previous reviews. This should use parameterized queries instead.

🧹 Nitpick comments (1)
internal/validation/validation.go (1)

97-103: Consider removing or refining the transaction type check.

The current check skips validation for transaction types > 4, but this hardcoded limit may become outdated as new transaction types are introduced. Consider either:

  • Making this configurable
  • Adding a more explicit comment about which types are supported
  • Implementing a more robust unsupported type detection
- for _, tx := range blockData.Transactions {
-     if tx.TransactionType > 4 { // Currently supported types are 0-4
-         log.Warn().Msgf("Skipping transaction root validation for block %s due to unsupported transaction type %d", blockData.Block.Number, tx.TransactionType)
-         return nil
-     }
- }
+ supportedTypes := map[uint8]bool{0: true, 1: true, 2: true, 3: true, 4: true}
+ for _, tx := range blockData.Transactions {
+     if !supportedTypes[tx.TransactionType] {
+         log.Warn().Msgf("Skipping transaction root validation for block %s due to unsupported transaction type %d", blockData.Block.Number, tx.TransactionType)
+         return nil
+     }
+ }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0433596 and 4b567ef.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (13)
  • cmd/root.go (1 hunks)
  • cmd/validate.go (1 hunks)
  • go.mod (5 hunks)
  • internal/common/abi.go (2 hunks)
  • internal/common/trace.go (3 hunks)
  • internal/rpc/rpc.go (1 hunks)
  • internal/rpc/serializer.go (2 hunks)
  • internal/storage/clickhouse.go (4 hunks)
  • internal/validation/cursor.go (1 hunks)
  • internal/validation/db.go (1 hunks)
  • internal/validation/duplicates.go (1 hunks)
  • internal/validation/gaps.go (1 hunks)
  • internal/validation/validation.go (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • internal/common/abi.go
🚧 Files skipped from review as they are similar to previous changes (9)
  • cmd/root.go
  • internal/rpc/rpc.go
  • internal/storage/clickhouse.go
  • internal/rpc/serializer.go
  • internal/validation/cursor.go
  • internal/validation/gaps.go
  • internal/common/trace.go
  • go.mod
  • internal/validation/db.go
🧰 Additional context used
🪛 golangci-lint (1.64.8)
internal/validation/duplicates.go

33-33: Error return value is not checked

(errcheck)


44-44: Error return value is not checked

(errcheck)


55-55: Error return value is not checked

(errcheck)

🪛 ast-grep (0.38.1)
cmd/validate.go

[warning] 71-71: MinVersionis missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported an insecure browsers (such as Internet Explorer 10), it may be necessary to enable TLS 1.0 to provide support. AddMinVersion: tls.VersionTLS13' to the TLS configuration to bump the minimum version to TLS 1.3.
Context: tls.Config{}
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures

(missing-ssl-minversion-go)

@iuwqyir iuwqyir force-pushed the 05-16-chain_validation_and_fix_command branch from 4b567ef to 734272c Compare June 4, 2025 09:54
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (9)
cmd/validate.go (2)

69-87: ⚠️ Potential issue

Set TLS MinVersion for enhanced security.

The TLS configuration lacks a MinVersion setting, potentially allowing insecure TLS protocol versions. This was flagged in previous reviews and by static analysis.

conn, err := clickhouse.Open(&clickhouse.Options{
	Addr:     []string{fmt.Sprintf("%s:%d", config.Cfg.Storage.Main.Clickhouse.Host, config.Cfg.Storage.Main.Clickhouse.Port)},
	Protocol: clickhouse.Native,
-	TLS:      &tls.Config{},
+	TLS: &tls.Config{
+		MinVersion: tls.VersionTLS12,
+	},
	Auth: clickhouse.Auth{
		Username: config.Cfg.Storage.Main.Clickhouse.Username,
		Password: config.Cfg.Storage.Main.Clickhouse.Password,
	},
🧰 Tools
🪛 ast-grep (0.38.1)

[warning] 71-71: MinVersionis missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported an insecure browsers (such as Internet Explorer 10), it may be necessary to enable TLS 1.0 to provide support. AddMinVersion: tls.VersionTLS13' to the TLS configuration to bump the minimum version to TLS 1.3.
Context: tls.Config{}
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures

(missing-ssl-minversion-go)


115-133: 🛠️ Refactor suggestion

Replace log.Fatal() with proper error handling.

The function uses log.Fatal() which terminates the program immediately, preventing graceful error handling and cleanup. This was flagged in previous reviews.

-func validateAndFixRange(rpcClient rpc.IRPCClient, s storage.IStorage, conn clickhouse.Conn, startBlock *big.Int, endBlock *big.Int, fixBatchSize int) error {
+func validateAndFixRange(rpcClient rpc.IRPCClient, s storage.IStorage, conn clickhouse.Conn, startBlock *big.Int, endBlock *big.Int, fixBatchSize int) error {
	chainId := rpcClient.GetChainID()
	err := validation.FindAndRemoveDuplicates(conn, chainId, startBlock, endBlock)
	if err != nil {
-		log.Fatal().Err(err).Msg("Failed to find and fix duplicates")
+		return fmt.Errorf("failed to find and fix duplicates: %w", err)
	}

	err = validation.FindAndFixGaps(rpcClient, s, conn, chainId, startBlock, endBlock)
	if err != nil {
-		log.Fatal().Err(err).Msg("Failed to find and fix gaps")
+		return fmt.Errorf("failed to find and fix gaps: %w", err)
	}

	err = validation.ValidateAndFixBlocks(rpcClient, s, conn, startBlock, endBlock, fixBatchSize)
	if err != nil {
-		log.Fatal().Err(err).Msg("Failed to validate and fix blocks")
+		return fmt.Errorf("failed to validate and fix blocks: %w", err)
	}

	log.Debug().Msgf("Validation complete for range %v-%v", startBlock, endBlock)
	return nil
}
internal/validation/validation.go (3)

288-296: ⚠️ Potential issue

*Prevent panics from nil big.Int inputs to uint256.FromBig.

Fields like tx.MaxPriorityFeePerGas, tx.MaxFeePerBlobGas, tx.Value, etc. may be nil for certain transaction types, causing uint256.FromBig(nil) to panic.

Add a helper function and use it for safe conversions:

+func safeUint256(b *big.Int) *uint256.Int {
+	if b == nil {
+		return new(uint256.Int)
+	}
+	out, _ := uint256.FromBig(b)
+	return out
+}

Then replace the direct calls:

-		chainID, _ := uint256.FromBig(tx.ChainId)
-		gasTipCap, _ := uint256.FromBig(tx.MaxPriorityFeePerGas)
-		gasFeeCap, _ := uint256.FromBig(tx.MaxFeePerGas)
-		value, _ := uint256.FromBig(tx.Value)
-		blobFeeCap, _ := uint256.FromBig(tx.MaxFeePerBlobGas)
-		v, _ := uint256.FromBig(tx.V)
-		r, _ := uint256.FromBig(tx.R)
-		s, _ := uint256.FromBig(tx.S)
+		chainID := safeUint256(tx.ChainId)
+		gasTipCap := safeUint256(tx.MaxPriorityFeePerGas)
+		gasFeeCap := safeUint256(tx.MaxFeePerGas)
+		value := safeUint256(tx.Value)
+		blobFeeCap := safeUint256(tx.MaxFeePerBlobGas)
+		v := safeUint256(tx.V)
+		r := safeUint256(tx.R)
+		s := safeUint256(tx.S)

333-340: ⚠️ Potential issue

Apply the same nil-safety fix for EIP-7702 transactions.

The same nil pointer risk exists here for the EIP-7702 transaction type conversion.

-		chainID, _ := uint256.FromBig(tx.ChainId)
-		gasTipCap, _ := uint256.FromBig(tx.MaxPriorityFeePerGas)
-		gasFeeCap, _ := uint256.FromBig(tx.MaxFeePerGas)
-		value, _ := uint256.FromBig(tx.Value)
-		v, _ := uint256.FromBig(tx.V)
-		r, _ := uint256.FromBig(tx.R)
-		s, _ := uint256.FromBig(tx.S)
+		chainID := safeUint256(tx.ChainId)
+		gasTipCap := safeUint256(tx.MaxPriorityFeePerGas)
+		gasFeeCap := safeUint256(tx.MaxFeePerGas)
+		value := safeUint256(tx.Value)
+		v := safeUint256(tx.V)
+		r := safeUint256(tx.R)
+		s := safeUint256(tx.S)

56-58: ⚠️ Potential issue

Handle error return from poller.Poll.

The error returned by poller.Poll is not checked, which could silently ignore polling failures for invalid blocks.

poller := orchestrator.NewBoundlessPoller(rpcClient, s)
-poller.Poll(context.Background(), batch)
+if err := poller.Poll(context.Background(), batch); err != nil {
+	log.Error().Err(err).Msgf("Failed to poll batch of blocks %s-%s", batch[0], batch[len(batch)-1])
+	return fmt.Errorf("failed to poll invalid blocks: %w", err)
+}
internal/validation/duplicates.go (4)

33-34: ⚠️ Potential issue

Handle error return from removeDuplicateBlocks.

The error returned by removeDuplicateBlocks is not checked, potentially ignoring failures in duplicate removal.

log.Debug().Msgf("Found %d duplicate blocks in range %v-%v: %v", len(duplicateBlockNumbers), startBlock, endBlock, duplicateBlockNumbers)
-removeDuplicateBlocks(conn, chainId, duplicateBlockNumbers)
+if err := removeDuplicateBlocks(conn, chainId, duplicateBlockNumbers); err != nil {
+	return fmt.Errorf("failed to remove duplicate blocks: %w", err)
+}
🧰 Tools
🪛 golangci-lint (1.64.8)

33-33: Error return value is not checked

(errcheck)


44-45: ⚠️ Potential issue

Handle error return from removeDuplicateTransactions.

The error returned by removeDuplicateTransactions is not checked, potentially ignoring failures in duplicate removal.

log.Debug().Msgf("Found %d duplicate transactions in range %v-%v: %v", len(duplicateTransactions), startBlock, endBlock, duplicateTransactions)
-removeDuplicateTransactions(conn, chainId, duplicateTransactions)
+if err := removeDuplicateTransactions(conn, chainId, duplicateTransactions); err != nil {
+	return fmt.Errorf("failed to remove duplicate transactions: %w", err)
+}
🧰 Tools
🪛 golangci-lint (1.64.8)

44-44: Error return value is not checked

(errcheck)


55-56: ⚠️ Potential issue

Handle error return from removeDuplicateLogs.

The error returned by removeDuplicateLogs is not checked, potentially ignoring failures in duplicate removal.

log.Debug().Msgf("Found %d duplicate logs in range %v-%v: %v", len(duplicateLogs), startBlock, endBlock, duplicateLogs)
-removeDuplicateLogs(conn, chainId, duplicateLogs)
+if err := removeDuplicateLogs(conn, chainId, duplicateLogs); err != nil {
+	return fmt.Errorf("failed to remove duplicate logs: %w", err)
+}
🧰 Tools
🪛 golangci-lint (1.64.8)

55-55: Error return value is not checked

(errcheck)


218-229: 🛠️ Refactor suggestion

Avoid string interpolation in SQL queries for security.

The SQL query uses fmt.Sprintf to embed tuples directly, which poses security risks and could create oversized queries.

Consider using parameterized queries or batching with smaller queries. If ClickHouse supports tuple arrays as parameters, use those instead:

-		query := fmt.Sprintf(`WITH
-			to_be_inserted AS (
-				SELECT chain_id, block_number, block_hash, block_timestamp, transaction_hash, transaction_index, log_index, address,
-					data, topic_0, topic_1, topic_2, topic_3, insert_timestamp, -sign as sign
-				FROM default.logs FINAL
-				WHERE chain_id = ? AND block_number IN (?) AND (transaction_hash, log_index) IN (%s)
-			)
-			INSERT INTO logs (
-				chain_id, block_number, block_hash, block_timestamp, transaction_hash, transaction_index, log_index, address,
-				data, topic_0, topic_1, topic_2, topic_3, insert_timestamp, sign
-			) SELECT * from to_be_inserted
-		`, strings.Join(tuples, ","))
+		// Alternative approach: process each log individually or use smaller batches
+		for _, duplicateLog := range batch {
+			query := `WITH
+				to_be_inserted AS (
+					SELECT chain_id, block_number, block_hash, block_timestamp, transaction_hash, transaction_index, log_index, address,
+						data, topic_0, topic_1, topic_2, topic_3, insert_timestamp, -sign as sign
+					FROM default.logs FINAL
+					WHERE chain_id = ? AND block_number = ? AND transaction_hash = ? AND log_index = ?
+				)
+				INSERT INTO logs (
+					chain_id, block_number, block_hash, block_timestamp, transaction_hash, transaction_index, log_index, address,
+					data, topic_0, topic_1, topic_2, topic_3, insert_timestamp, sign
+				) SELECT * from to_be_inserted`
+			
+			err := conn.Exec(context.Background(), query, chainId, duplicateLog.BlockNumber, duplicateLog.TxHash, duplicateLog.LogIndex)
+			if err != nil {
+				return err
+			}
+		}
🧹 Nitpick comments (1)
cmd/validate.go (1)

19-26: Complete the command descriptions.

The Short and Long descriptions are marked as "TBD". Please provide meaningful descriptions for the validate command to improve CLI usability.

var (
	validateCmd = &cobra.Command{
		Use:   "validate",
-		Short: "TBD",
-		Long:  "TBD",
+		Short: "Validate and fix blockchain data integrity",
+		Long:  `Validate and fix blockchain data integrity by checking for duplicates, gaps, and block validation errors. Supports custom batch sizes for processing.`,
		Run: func(cmd *cobra.Command, args []string) {
			RunValidate(cmd, args)
		},
	}
)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4b567ef and 734272c.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (13)
  • cmd/root.go (1 hunks)
  • cmd/validate.go (1 hunks)
  • go.mod (5 hunks)
  • internal/common/abi.go (2 hunks)
  • internal/common/trace.go (3 hunks)
  • internal/rpc/rpc.go (1 hunks)
  • internal/rpc/serializer.go (2 hunks)
  • internal/storage/clickhouse.go (4 hunks)
  • internal/validation/cursor.go (1 hunks)
  • internal/validation/db.go (1 hunks)
  • internal/validation/duplicates.go (1 hunks)
  • internal/validation/gaps.go (1 hunks)
  • internal/validation/validation.go (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • internal/common/abi.go
🚧 Files skipped from review as they are similar to previous changes (9)
  • internal/rpc/rpc.go
  • cmd/root.go
  • internal/rpc/serializer.go
  • internal/validation/gaps.go
  • internal/validation/cursor.go
  • internal/common/trace.go
  • internal/storage/clickhouse.go
  • go.mod
  • internal/validation/db.go
🧰 Additional context used
🪛 ast-grep (0.38.1)
cmd/validate.go

[warning] 71-71: MinVersionis missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported an insecure browsers (such as Internet Explorer 10), it may be necessary to enable TLS 1.0 to provide support. AddMinVersion: tls.VersionTLS13' to the TLS configuration to bump the minimum version to TLS 1.3.
Context: tls.Config{}
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures

(missing-ssl-minversion-go)

🪛 golangci-lint (1.64.8)
internal/validation/duplicates.go

33-33: Error return value is not checked

(errcheck)


44-44: Error return value is not checked

(errcheck)


55-55: Error return value is not checked

(errcheck)

@iuwqyir iuwqyir force-pushed the 05-16-chain_validation_and_fix_command branch from 734272c to b5f0fec Compare July 21, 2025 21:03
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (6)
cmd/validate.go (2)

69-87: Set TLS MinVersion for better security.

The TLS configuration lacks a MinVersion setting, potentially allowing insecure TLS protocol versions.

TLS: &tls.Config{
+   MinVersion: tls.VersionTLS12,
},

115-134: Improve error handling by returning errors instead of using log.Fatal().

The function uses log.Fatal() which terminates the program immediately without cleanup. Consider returning errors for better error handling.

func validateAndFixRange(rpcClient rpc.IRPCClient, s storage.IStorage, conn clickhouse.Conn, startBlock *big.Int, endBlock *big.Int, fixBatchSize int) error {
    chainId := rpcClient.GetChainID()
    err := validation.FindAndRemoveDuplicates(conn, chainId, startBlock, endBlock)
    if err != nil {
-       log.Fatal().Err(err).Msg("Failed to find and fix duplicates")
+       return fmt.Errorf("failed to find and fix duplicates: %w", err)
    }

    err = validation.FindAndFixGaps(rpcClient, s, conn, chainId, startBlock, endBlock)
    if err != nil {
-       log.Fatal().Err(err).Msg("Failed to find and fix gaps")
+       return fmt.Errorf("failed to find and fix gaps: %w", err)
    }

    err = validation.ValidateAndFixBlocks(rpcClient, s, conn, startBlock, endBlock, fixBatchSize)
    if err != nil {
-       log.Fatal().Err(err).Msg("Failed to validate and fix blocks")
+       return fmt.Errorf("failed to validate and fix blocks: %w", err)
    }

    log.Debug().Msgf("Validation complete for range %v-%v", startBlock, endBlock)
    return nil
}
internal/validation/validation.go (4)

56-58: Fix the poller.Poll call signature.

Based on past review comments, the poller.Poll(batch) call appears to have incorrect arity and is missing required arguments.

poller := orchestrator.NewBoundlessPoller(rpcClient, s)
-poller.Poll(context.Background(), batch)
+if err := poller.Poll(context.Background(), batch); err != nil {
+    return fmt.Errorf("polling failed for blocks %s-%s: %w", batch[0], batch[len(batch)-1], err)
+}

164-166: Handle contract creation transactions correctly.

Based on past review feedback, the current implementation sets 0x0000000000000000000000000000000000000000 for contract creation transactions instead of properly handling the nil To field.

-// Convert to address
-to := common.HexToAddress(tx.ToAddress)
-isContractCreation := tx.ToAddress == ""
+// Handle contract creation (nil To address)
+var toPtr *common.Address
+isContractCreation := tx.ToAddress == ""
+if !isContractCreation {
+    addr := common.HexToAddress(tx.ToAddress)
+    toPtr = &addr
+}

Then update the transaction creation logic to use toPtr instead of to for non-legacy transactions.


288-296: Add nil checks for big.Int to uint256 conversions to prevent panics.

The uint256.FromBig() function will panic if passed a nil pointer. Fields like MaxPriorityFeePerGas, MaxFeePerBlobGas, Value, R, S, etc. can be NULL in the database.

+func safeUint256(b *big.Int) *uint256.Int {
+    if b == nil {
+        return new(uint256.Int)
+    }
+    out, _ := uint256.FromBig(b)
+    return out
+}

Then replace the direct calls:

-chainID, _ := uint256.FromBig(tx.ChainId)
-gasTipCap, _ := uint256.FromBig(tx.MaxPriorityFeePerGas)
-gasFeeCap, _ := uint256.FromBig(tx.MaxFeePerGas)
-value, _ := uint256.FromBig(tx.Value)
-blobFeeCap, _ := uint256.FromBig(tx.MaxFeePerBlobGas)
-v, _ := uint256.FromBig(tx.V)
-r, _ := uint256.FromBig(tx.R)
-s, _ := uint256.FromBig(tx.S)
+chainID := safeUint256(tx.ChainId)
+gasTipCap := safeUint256(tx.MaxPriorityFeePerGas)
+gasFeeCap := safeUint256(tx.MaxFeePerGas)
+value := safeUint256(tx.Value)
+blobFeeCap := safeUint256(tx.MaxFeePerBlobGas)
+v := safeUint256(tx.V)
+r := safeUint256(tx.R)
+s := safeUint256(tx.S)

334-340: Apply the same nil check fix for EIP-7702 transaction uint256 conversions.

Similar to the blob transaction case, these conversions also need protection against nil pointers.

-chainID, _ := uint256.FromBig(tx.ChainId)
-gasTipCap, _ := uint256.FromBig(tx.MaxPriorityFeePerGas)
-gasFeeCap, _ := uint256.FromBig(tx.MaxFeePerGas)
-value, _ := uint256.FromBig(tx.Value)
-v, _ := uint256.FromBig(tx.V)
-r, _ := uint256.FromBig(tx.R)
-s, _ := uint256.FromBig(tx.S)
+chainID := safeUint256(tx.ChainId)
+gasTipCap := safeUint256(tx.MaxPriorityFeePerGas)
+gasFeeCap := safeUint256(tx.MaxFeePerGas)
+value := safeUint256(tx.Value)
+v := safeUint256(tx.V)
+r := safeUint256(tx.R)
+s := safeUint256(tx.S)
🧹 Nitpick comments (2)
cmd/validate.go (1)

18-27: Add description for the validate command.

The command currently has placeholder "TBD" values for Short and Long descriptions. Consider adding meaningful descriptions to help users understand the command's purpose.

validateCmd = &cobra.Command{
    Use:   "validate",
-   Short: "TBD",
-   Long:  "TBD",
+   Short: "Validate and repair blockchain data integrity",
+   Long:  `Validate and repair blockchain data integrity by checking for duplicates, gaps, and block validation errors. Processes data in configurable batches.`,
    Run: func(cmd *cobra.Command, args []string) {
        RunValidate(cmd, args)
    },
}
internal/validation/validation.go (1)

97-103: Consider making unsupported transaction type handling configurable.

The current implementation skips transaction root validation entirely when encountering unsupported transaction types. This might mask validation issues in blocks with mixed transaction types.

Consider adding a flag to control this behavior or validate only the supported transactions:

// TODO: remove this once we know how to validate all tx types
+unsupportedTxCount := 0
for _, tx := range blockData.Transactions {
    if tx.TransactionType > 4 { // Currently supported types are 0-4
-       log.Warn().Msgf("Skipping transaction root validation for block %s due to unsupported transaction type %d", blockData.Block.Number, tx.TransactionType)
-       return nil
+       unsupportedTxCount++
    }
}
+if unsupportedTxCount > 0 {
+    log.Warn().Msgf("Block %s contains %d unsupported transaction types, skipping transaction root validation", blockData.Block.Number, unsupportedTxCount)
+    return nil
+}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 734272c and b5f0fec.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (13)
  • cmd/root.go (1 hunks)
  • cmd/validate.go (1 hunks)
  • go.mod (6 hunks)
  • internal/common/abi.go (2 hunks)
  • internal/common/trace.go (3 hunks)
  • internal/rpc/rpc.go (1 hunks)
  • internal/rpc/serializer.go (2 hunks)
  • internal/storage/clickhouse.go (4 hunks)
  • internal/validation/cursor.go (1 hunks)
  • internal/validation/db.go (1 hunks)
  • internal/validation/duplicates.go (1 hunks)
  • internal/validation/gaps.go (1 hunks)
  • internal/validation/validation.go (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • internal/common/abi.go
🚧 Files skipped from review as they are similar to previous changes (10)
  • internal/rpc/rpc.go
  • cmd/root.go
  • internal/validation/cursor.go
  • internal/rpc/serializer.go
  • internal/common/trace.go
  • internal/storage/clickhouse.go
  • go.mod
  • internal/validation/gaps.go
  • internal/validation/db.go
  • internal/validation/duplicates.go
🧰 Additional context used
🧠 Learnings (1)
internal/validation/validation.go (1)

Learnt from: iuwqyir
PR: #216
File: internal/orchestrator/chain_tracker.go:45-53
Timestamp: 2025-06-04T11:28:38.057Z
Learning: Go's math/big.Int type has a Float64() method that returns (float64, bool), where the boolean indicates if the conversion was exact.

🪛 ast-grep (0.38.6)
cmd/validate.go

[warning] 71-71: MinVersionis missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported an insecure browsers (such as Internet Explorer 10), it may be necessary to enable TLS 1.0 to provide support. AddMinVersion: tls.VersionTLS13' to the TLS configuration to bump the minimum version to TLS 1.3.
Context: tls.Config{}
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures

(missing-ssl-minversion-go)

@iuwqyir iuwqyir force-pushed the 05-16-chain_validation_and_fix_command branch from b5f0fec to ebe265c Compare July 22, 2025 15:59
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

♻️ Duplicate comments (4)
cmd/validate.go (1)

31-76: Consider improving error handling instead of using log.Fatal().

The function uses log.Fatal() which terminates the program immediately without cleanup. This issue was previously raised for similar code.

internal/validation/duplicates.go (3)

71-76: Consider using count() > 1 for more robust duplicate detection.

As previously noted, HAVING sum(sign) != 1 may miss duplicate scenarios where rows have signs like (+1, +1, -1) that sum to 1. Using count() > 1 would be more reliable for detecting any group with multiple live rows.

-HAVING sum(sign) != 1
+HAVING count() > 1

97-101: Apply consistent duplicate detection logic across all find functions.

The same duplicate detection issue applies to transactions and logs queries. Consider using count() > 1 consistently across all three functions.

Also applies to: 123-127


227-241: String-built IN clause still poses risks for large queries and bypasses sanitization.

The use of fmt.Sprintf to embed tuples directly into SQL remains problematic:

  1. Large duplicate sets can exceed server query length limits
  2. String concatenation bypasses ClickHouse's query parameter sanitization

Consider using parameterized queries with array inputs or batching with smaller, fully parameterized queries.

Does ClickHouse support array of tuples as query parameters for IN clauses?
🧹 Nitpick comments (5)
internal/storage/connector.go (1)

106-109: Enhance the method documentation for clarity.

The comment "Gets only the data required for validation" is vague. Consider specifying what subset of data is retrieved (e.g., blocks with transactions and logs, but without traces).

 /**
- * Gets only the data required for validation.
+ * Gets block data required for validation, including blocks, transactions, and logs
+ * for the specified block range. Traces are excluded as they are not needed for validation.
  */
internal/validation/validator.go (4)

29-34: Use idiomatic Go documentation format.

Replace the JSDoc-style comment with Go's standard documentation format.

-/**
- * Validate blocks in the range of startBlock to endBlock
- * @param startBlock - The start block number (inclusive)
- * @param endBlock - The end block number (inclusive)
- * @return error - An error if the validation fails
- */
+// ValidateBlockRange validates blocks in the range of startBlock to endBlock (inclusive).
+// It returns slices of valid and invalid blocks, and an error if the validation fails.

40-45: Remove redundant return statement.

The explicit return at line 44 is unnecessary since the same values are already being returned.

 validBlocks, invalidBlocks, err = v.ValidateBlocks(dbData)
 if err != nil {
     return nil, nil, err
 }
-return validBlocks, invalidBlocks, nil
+return validBlocks, invalidBlocks, err

48-49: Simplify slice initialization.

Use nil initialization for empty slices instead of make with 0 capacity.

-invalidBlocks = make([]common.BlockData, 0)
-validBlocks = make([]common.BlockData, 0)
+var invalidBlocks []common.BlockData
+var validBlocks []common.BlockData

78-84: Track the TODO for unsupported transaction types.

The TODO comment indicates that transaction root validation is skipped for certain transaction types. This should be tracked to ensure complete validation coverage.

Would you like me to create an issue to track the implementation of validation for all transaction types?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b5f0fec and ebe265c.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (16)
  • cmd/root.go (1 hunks)
  • cmd/validate.go (1 hunks)
  • cmd/validate_and_fix.go (1 hunks)
  • go.mod (6 hunks)
  • internal/common/abi.go (2 hunks)
  • internal/common/trace.go (3 hunks)
  • internal/rpc/rpc.go (1 hunks)
  • internal/rpc/serializer.go (2 hunks)
  • internal/storage/clickhouse.go (5 hunks)
  • internal/storage/connector.go (1 hunks)
  • internal/validation/cursor.go (1 hunks)
  • internal/validation/duplicates.go (1 hunks)
  • internal/validation/gaps.go (1 hunks)
  • internal/validation/root_calculator.go (1 hunks)
  • internal/validation/validator.go (1 hunks)
  • test/mocks/MockIMainStorage.go (1 hunks)
🧬 Code Graph Analysis (3)
internal/storage/connector.go (1)
internal/common/block.go (1)
  • BlockData (61-66)
internal/validation/root_calculator.go (2)
internal/common/transaction.go (1)
  • Transaction (16-51)
internal/common/log.go (1)
  • Log (15-31)
test/mocks/MockIMainStorage.go (1)
internal/common/block.go (1)
  • BlockData (61-66)
✅ Files skipped from review due to trivial changes (1)
  • internal/common/abi.go
🚧 Files skipped from review as they are similar to previous changes (8)
  • internal/rpc/rpc.go
  • cmd/root.go
  • internal/rpc/serializer.go
  • internal/common/trace.go
  • internal/validation/cursor.go
  • internal/validation/gaps.go
  • internal/storage/clickhouse.go
  • go.mod
🧰 Additional context used
🧬 Code Graph Analysis (3)
internal/storage/connector.go (1)
internal/common/block.go (1)
  • BlockData (61-66)
internal/validation/root_calculator.go (2)
internal/common/transaction.go (1)
  • Transaction (16-51)
internal/common/log.go (1)
  • Log (15-31)
test/mocks/MockIMainStorage.go (1)
internal/common/block.go (1)
  • BlockData (61-66)
🔇 Additional comments (11)
cmd/validate_and_fix.go (2)

72-74: Good implementation of TLS security configuration.

The TLS configuration properly sets MinVersion: tls.VersionTLS12, addressing the security concern from previous reviews.


117-150: Excellent error handling in validateAndFixRange.

The function properly returns errors with descriptive context using fmt.Errorf, allowing the caller to handle them gracefully. This is a significant improvement over using log.Fatal().

internal/validation/root_calculator.go (2)

18-253: Well-structured transaction root calculation with comprehensive type support.

The implementation correctly handles all transaction types including the latest EIP-7702, properly manages signature reconstruction for legacy and EIP-155 transactions, and includes thorough error handling. The sorting by transaction index ensures deterministic root calculation.


174-178: Fix contract creation handling for blob transactions.

The To field is being set even when it's a contract creation. For blob transactions, the To field should remain empty (zero address) for contract creation.

-var toAddr gethCommon.Address
-if !isContractCreation {
-    toAddr = to
-}
+toAddr := to

Note: The BlobTx struct expects a non-pointer Address field, so for contract creation it should use the zero address value, which is what to already contains when isContractCreation is true.

Likely an incorrect or invalid review comment.

test/mocks/MockIMainStorage.go (1)

630-688: Auto-generated mock code looks correct.

The mock implementation for GetValidationBlockData follows standard mockery patterns and correctly matches the interface signature.

internal/validation/validator.go (2)

1-19: LGTM!

The imports and struct definition are well-organized and appropriate for the validation functionality.


21-27: LGTM!

The constructor properly initializes all fields with appropriate dependencies.

internal/validation/duplicates.go (4)

1-23: LGTM!

The imports and struct definitions are well-organized with appropriate field tags for JSON and ClickHouse.


24-68: Error handling has been properly implemented.

The function now correctly handles all error returns from the removal functions, addressing the previous review comments.


148-168: LGTM!

The duplicate removal approach using negated sign values is appropriate for ClickHouse's merge tree engine.


170-208: LGTM!

The batched approach with proper parameterization is well-implemented for handling large numbers of duplicate transactions.

@iuwqyir iuwqyir force-pushed the 05-16-chain_validation_and_fix_command branch from ebe265c to 7bbd881 Compare July 22, 2025 16:57
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
cmd/validate_and_fix.go (2)

57-57: Fix typo in log message.

The log message contains a typo in the function name.

-	log.Info().Msgf("Running validationAndFix for chain %d", rpcClient.GetChainID())
+	log.Info().Msgf("Running validateAndFix for chain %d", rpcClient.GetChainID())

114-116: Use Go-style comments.

The function comment uses C-style comment syntax instead of Go conventions.

-/**
- * Validates a range of blocks (end and start are inclusive) for a given chain and fixes any problems it finds
- */
+// validateAndFixRange validates a range of blocks (end and start are inclusive) for a given chain and fixes any problems it finds
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ebe265c and 7bbd881.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (16)
  • cmd/root.go (1 hunks)
  • cmd/validate.go (1 hunks)
  • cmd/validate_and_fix.go (1 hunks)
  • go.mod (6 hunks)
  • internal/common/abi.go (2 hunks)
  • internal/common/trace.go (3 hunks)
  • internal/rpc/rpc.go (1 hunks)
  • internal/rpc/serializer.go (2 hunks)
  • internal/storage/clickhouse.go (5 hunks)
  • internal/storage/connector.go (1 hunks)
  • internal/validation/cursor.go (1 hunks)
  • internal/validation/duplicates.go (1 hunks)
  • internal/validation/gaps.go (1 hunks)
  • internal/validation/root_calculator.go (1 hunks)
  • internal/validation/validator.go (1 hunks)
  • test/mocks/MockIMainStorage.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (15)
  • internal/common/abi.go
  • internal/storage/connector.go
  • cmd/root.go
  • internal/rpc/rpc.go
  • internal/rpc/serializer.go
  • cmd/validate.go
  • internal/common/trace.go
  • internal/storage/clickhouse.go
  • internal/validation/cursor.go
  • internal/validation/gaps.go
  • internal/validation/root_calculator.go
  • go.mod
  • internal/validation/validator.go
  • test/mocks/MockIMainStorage.go
  • internal/validation/duplicates.go
🔇 Additional comments (3)
cmd/validate_and_fix.go (3)

1-16: LGTM! Clean import structure.

The package declaration and imports are well-organized and all dependencies are utilized appropriately.


18-27: LGTM! Command definition is well-structured.

The command descriptions are now meaningful and informative, addressing the previous placeholder issue. The cobra command structure follows best practices.


117-150: LGTM! Well-structured validation and fixing logic.

The function follows a clear sequence of operations with proper error handling and separation of concerns. The logic for collecting invalid blocks and fixing them in batches is well-implemented.

@iuwqyir iuwqyir force-pushed the 05-16-chain_validation_and_fix_command branch from 7bbd881 to 1d1f463 Compare July 22, 2025 17:44
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (4)
cmd/validate_and_fix.go (1)

70-94: Consider reusing storage abstraction for ClickHouse connection.

The ClickHouse connection is being established directly here with hardcoded configuration, which duplicates the connection logic that likely exists in the storage layer. This creates maintenance overhead and potential configuration drift.

Consider extracting a method from the storage layer to provide the ClickHouse connection, or add a method to the storage interface to get the underlying connection.

internal/storage/clickhouse.go (1)

914-920: getBlockNumbersStringArray generates invalid SQL with trailing comma.

The function unconditionally appends a comma after every element, resulting in invalid SQL syntax like block_number IN (123,456,). ClickHouse will reject the extra comma.

Apply the fix suggested in the previous review:

-func getBlockNumbersStringArray(blockNumbers []*big.Int) string {
-    blockNumbersString := ""
-    for _, blockNumber := range blockNumbers {
-        blockNumbersString += fmt.Sprintf("%s,", blockNumber.String())
-    }
-    return blockNumbersString
-}
+func getBlockNumbersStringArray(blockNumbers []*big.Int) string {
+    if len(blockNumbers) == 0 {
+        return ""
+    }
+    var b strings.Builder
+    for i, n := range blockNumbers {
+        if i > 0 {
+            b.WriteByte(',')
+        }
+        b.WriteString(n.String())
+    }
+    return b.String()
+}
internal/validation/duplicates.go (2)

71-76: HAVING sum(sign) != 1 may miss certain duplicate scenarios.

If rows exist with signs (+1, +1, -1) the sum equals +1, therefore the group is not flagged, yet you still have two live (+1) rows – a duplicate situation. Using count() > 1 (or abs(sum(sign)) != 1) is safer.

Consider using a more robust duplicate detection:

-HAVING sum(sign) != 1
+HAVING count(*) > 1 AND sum(sign) > 0

227-238: String-built IN clause risks gigantic queries & SQL injection.

fmt.Sprintf embeds a comma-separated list of tuples directly into the SQL string. This can create excessively large queries and bypasses ClickHouse query-parameter sanitization.

Consider using parameterized queries or batching with smaller, safer queries:

-WHERE chain_id = ? AND block_number IN (?) AND (transaction_hash, log_index) IN (%s)
+// Split into smaller batches and use individual parameter binding
+WHERE chain_id = ? AND block_number = ? AND transaction_hash = ? AND log_index = ?
🧹 Nitpick comments (1)
internal/orchestrator/validator.go (1)

125-127: Consider continuing with remaining batches on partial failures.

The current implementation stops processing on the first failed batch, potentially leaving many fixable blocks unprocessed. Consider collecting errors and continuing with remaining batches to maximize the number of fixed blocks.

+var allErrors []error
 for i := 0; i < len(invalidBlocks); i += fixBatchSize {
     // ... existing batch logic ...
     if len(failedBlocks) > 0 {
         log.Error().Msgf("Failed to poll %d blocks: %v", len(failedBlocks), failedBlocks)
-        return fmt.Errorf("failed to poll %d blocks: %v", len(failedBlocks), failedBlocks)
+        allErrors = append(allErrors, fmt.Errorf("failed to poll %d blocks: %v", len(failedBlocks), failedBlocks))
+        continue
     }
     // ... rest of processing ...
 }
+if len(allErrors) > 0 {
+    return fmt.Errorf("encountered %d errors during fixing: %v", len(allErrors), allErrors)
+}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7bbd881 and 1d1f463.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (15)
  • cmd/root.go (1 hunks)
  • cmd/validate.go (1 hunks)
  • cmd/validate_and_fix.go (1 hunks)
  • go.mod (6 hunks)
  • internal/common/abi.go (2 hunks)
  • internal/common/trace.go (3 hunks)
  • internal/orchestrator/validator.go (1 hunks)
  • internal/rpc/rpc.go (1 hunks)
  • internal/rpc/serializer.go (2 hunks)
  • internal/storage/clickhouse.go (5 hunks)
  • internal/storage/connector.go (1 hunks)
  • internal/validation/cursor.go (1 hunks)
  • internal/validation/duplicates.go (1 hunks)
  • internal/validation/root_calculator.go (1 hunks)
  • test/mocks/MockIMainStorage.go (2 hunks)
✅ Files skipped from review due to trivial changes (1)
  • internal/common/abi.go
🚧 Files skipped from review as they are similar to previous changes (10)
  • cmd/root.go
  • internal/rpc/rpc.go
  • cmd/validate.go
  • internal/common/trace.go
  • internal/validation/cursor.go
  • internal/rpc/serializer.go
  • internal/storage/connector.go
  • go.mod
  • internal/validation/root_calculator.go
  • test/mocks/MockIMainStorage.go
🔇 Additional comments (2)
internal/storage/clickhouse.go (2)

383-384: Gas and GasUsed Type Compatibility Confirmed

The Trace struct in internal/common/trace.go now defines both Gas and GasUsed as uint64 (lines 22–23), which maps directly to ClickHouse’s UInt64 type. Removing the prior .Uint64() conversions is therefore correct—no further changes are needed here.


1893-1893: Check block_number nullability in your ClickHouse schema

The FindMissingBlockNumbers query uses:

LEFT JOIN existing_blocks e ON s.expected_block_number = e.block_number
WHERE e.block_number = 0

With non-nullable numeric columns in ClickHouse, unmatched rows default to 0, so this correctly picks gaps. However, if your blocks.block_number column is defined as Nullable(UInt256) (or any nullable type), missing rows will be NULL and won’t satisfy = 0.

Please verify your table schema:

  • If block_number is non-nullable (e.g. UInt256), no change is needed.
  • If it’s nullable, change the filter to one of:
    WHERE isNull(e.block_number)
    or
    WHERE e.block_number IS NULL

@iuwqyir iuwqyir force-pushed the 05-16-chain_validation_and_fix_command branch from 1d1f463 to 0b45721 Compare July 22, 2025 18:34
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (6)
cmd/validate_and_fix.go (2)

70-94: Consider reusing storage abstraction for ClickHouse connection.

The ClickHouse connection is being established directly here with hardcoded configuration, which duplicates the connection logic that likely exists in the storage layer. This creates maintenance overhead and potential configuration drift.

Consider extracting a method from the storage layer to provide the ClickHouse connection, or add a method to the storage interface to get the underlying connection.


96-96: Add nil check for cursor fields before use.

The code directly accesses cursor.LastScannedBlockNumber and cursor.MaxBlockNumber without verifying they are initialized. If cursor initialization fails to set these values, this could cause a nil pointer dereference.

internal/storage/clickhouse.go (1)

483-486: createBlockNumbersClause may generate invalid SQL because of a trailing comma.

getBlockNumbersStringArray unconditionally appends a comma after every element. When the slice is non-empty, the resulting clause becomes e.g. block_number IN (123,456,) which ClickHouse will reject.

-func getBlockNumbersStringArray(blockNumbers []*big.Int) string {
-    blockNumbersString := ""
-    for _, blockNumber := range blockNumbers {
-        blockNumbersString += fmt.Sprintf("%s,", blockNumber.String())
-    }
-    return blockNumbersString
-}
+func getBlockNumbersStringArray(blockNumbers []*big.Int) string {
+    if len(blockNumbers) == 0 {
+        return ""
+    }
+    var b strings.Builder
+    for i, n := range blockNumbers {
+        if i > 0 {
+            b.WriteByte(',')
+        }
+        b.WriteString(n.String())
+    }
+    return b.String()
+}

Without this fix, every query that relies on block-number filtering (including the new validation CLI) will fail.

Also applies to: 914-920

internal/orchestrator/validator.go (1)

80-85: Consider returning a distinct validation status for unsupported transaction types.

Currently, blocks with unsupported transaction types (> 4) are marked as valid without actual validation. This could mask data integrity issues.

internal/validation/duplicates.go (2)

71-76: HAVING sum(sign) != 1 may miss certain duplicate scenarios.

If rows exist with signs (+1, +1, -1) the sum equals +1, therefore the group is not flagged, yet you still have two live (+1) rows – a duplicate situation. Using count() > 1 (or abs(sum(sign)) != 1) is safer.

Also applies to: 100-101, 126-127


227-241: String-built IN clause risks gigantic queries & SQL-injection.

fmt.Sprintf embeds a comma-separated list of tuples directly into the SQL string. This can create excessively large queries and bypasses ClickHouse query-parameter sanitization.

Prefer using parameterized queries or batching with smaller queries.

🧹 Nitpick comments (4)
cmd/validate_and_fix.go (1)

115-117: Use standard Go documentation format.

Use the standard Go documentation format with // instead of /** */.

-/**
- * Validates a range of blocks (end and start are inclusive) for a given chain and fixes any problems it finds
- */
+// validateAndFixRange validates a range of blocks (end and start are inclusive) for a given chain and fixes any problems it finds
internal/storage/clickhouse.go (1)

1867-1912: Consider clarifying the NULL check condition.

The query logic is clever, using ClickHouse's numbers() function to generate the expected sequence. However, the condition e.block_number = 0 might be confusing as it's actually checking for NULL values after the LEFT JOIN.

Consider adding a comment or using IS NULL for clarity:

WHERE 
-    e.block_number = 0
+    e.block_number IS NULL  -- No matching block found in existing_blocks
internal/orchestrator/validator.go (2)

29-34: Use standard Go documentation format.

-/**
- * Validate blocks in the range of startBlock to endBlock
- * @param startBlock - The start block number (inclusive)
- * @param endBlock - The end block number (inclusive)
- * @return error - An error if the validation fails
- */
+// ValidateBlockRange validates blocks in the range of startBlock to endBlock.
+// Both startBlock and endBlock are inclusive.

47-63: Consider collecting all validation errors instead of stopping on first error.

The current implementation stops validation on the first error, which might make debugging difficult when multiple blocks have issues. Consider validating all blocks and returning a summary of errors.

 func (v *Validator) ValidateBlocks(blocks []common.BlockData) (validBlocks []common.BlockData, invalidBlocks []common.BlockData, err error) {
     invalidBlocks = make([]common.BlockData, 0)
     validBlocks = make([]common.BlockData, 0)
+    var validationErrors []error
     for _, blockData := range blocks {
         valid, err := ValidateBlock(blockData)
         if err != nil {
             log.Error().Err(err).Msgf("Block verification failed for block %s", blockData.Block.Number)
-            return nil, nil, err
+            validationErrors = append(validationErrors, fmt.Errorf("block %s: %w", blockData.Block.Number, err))
+            continue
         }
         if valid {
             validBlocks = append(validBlocks, blockData)
         } else {
             invalidBlocks = append(invalidBlocks, blockData)
         }
     }
+    if len(validationErrors) > 0 {
+        return validBlocks, invalidBlocks, fmt.Errorf("validation errors: %v", validationErrors)
+    }
     return validBlocks, invalidBlocks, nil
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1d1f463 and 0b45721.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (15)
  • cmd/root.go (1 hunks)
  • cmd/validate.go (1 hunks)
  • cmd/validate_and_fix.go (1 hunks)
  • go.mod (6 hunks)
  • internal/common/abi.go (2 hunks)
  • internal/common/trace.go (3 hunks)
  • internal/orchestrator/validator.go (1 hunks)
  • internal/rpc/rpc.go (1 hunks)
  • internal/rpc/serializer.go (2 hunks)
  • internal/storage/clickhouse.go (5 hunks)
  • internal/storage/connector.go (1 hunks)
  • internal/validation/cursor.go (1 hunks)
  • internal/validation/duplicates.go (1 hunks)
  • internal/validation/root_calculator.go (1 hunks)
  • test/mocks/MockIMainStorage.go (2 hunks)
🧬 Code Graph Analysis (1)
internal/orchestrator/validator.go (5)
internal/storage/connector.go (1)
  • IStorage (67-71)
internal/rpc/rpc.go (1)
  • IRPCClient (42-54)
internal/orchestrator/poller.go (2)
  • Poller (22-35)
  • NewBoundlessPoller (50-72)
internal/common/block.go (2)
  • BlockData (61-66)
  • Block (8-33)
internal/validation/root_calculator.go (2)
  • CalculateLogsBloom (255-289)
  • CalculateTransactionsRoot (18-253)
✅ Files skipped from review due to trivial changes (1)
  • internal/storage/connector.go
🚧 Files skipped from review as they are similar to previous changes (10)
  • internal/common/abi.go
  • internal/rpc/rpc.go
  • cmd/root.go
  • internal/rpc/serializer.go
  • cmd/validate.go
  • internal/common/trace.go
  • internal/validation/cursor.go
  • go.mod
  • internal/validation/root_calculator.go
  • test/mocks/MockIMainStorage.go
🧰 Additional context used
🧬 Code Graph Analysis (1)
internal/orchestrator/validator.go (5)
internal/storage/connector.go (1)
  • IStorage (67-71)
internal/rpc/rpc.go (1)
  • IRPCClient (42-54)
internal/orchestrator/poller.go (2)
  • Poller (22-35)
  • NewBoundlessPoller (50-72)
internal/common/block.go (2)
  • BlockData (61-66)
  • Block (8-33)
internal/validation/root_calculator.go (2)
  • CalculateLogsBloom (255-289)
  • CalculateTransactionsRoot (18-253)
🔇 Additional comments (3)
internal/storage/clickhouse.go (3)

383-384: Type updates are consistent with trace data structure changes.

The removal of .Uint64() calls aligns with the updated trace data structures mentioned in the PR summary where gas values now use uint64 types directly.

Also applies to: 1306-1307


947-947: Good addition of FINAL keyword for data consistency.

Adding the FINAL keyword ensures consistent reads by forcing ClickHouse to read the final state of data parts, which is crucial for the deduplication logic that relies on sign-based operations.


1752-1865: Well-implemented concurrent data fetching for validation.

The implementation efficiently fetches blocks, logs, and transactions concurrently using goroutines and channels. Pre-organizing logs and transactions by block number is a smart optimization. The minimal field selection and proper error handling from all goroutines are excellent practices.

@iuwqyir iuwqyir force-pushed the 05-16-chain_validation_and_fix_command branch from 0b45721 to 7f6a1dc Compare July 22, 2025 19:15
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
internal/orchestrator/validator.go (1)

80-85: Consider returning a distinct validation status for unsupported transaction types.

Currently, blocks with unsupported transaction types (> 4) are marked as valid without actual validation. This could mask data integrity issues.

internal/validation/root_calculator.go (1)

221-224: Fix contract creation handling for EIP-7702 transactions.

Similar to blob transactions, the To field handling for contract creation is incorrect.

🧹 Nitpick comments (1)
internal/orchestrator/validator.go (1)

47-63: Consider preserving partial validation results on error.

When ValidateBlock returns an error, the function discards all previously validated blocks by returning nil slices. This could lead to loss of validation progress if an error occurs partway through a large batch.

Consider returning the partial results along with the error:

 func (v *Validator) ValidateBlocks(blocks []common.BlockData) (validBlocks []common.BlockData, invalidBlocks []common.BlockData, err error) {
 	invalidBlocks = make([]common.BlockData, 0)
 	validBlocks = make([]common.BlockData, 0)
 	for _, blockData := range blocks {
 		valid, err := ValidateBlock(blockData)
 		if err != nil {
 			log.Error().Err(err).Msgf("Block verification failed for block %s", blockData.Block.Number)
-			return nil, nil, err
+			return validBlocks, invalidBlocks, err
 		}
 		if valid {
 			validBlocks = append(validBlocks, blockData)
 		} else {
 			invalidBlocks = append(invalidBlocks, blockData)
 		}
 	}
 	return validBlocks, invalidBlocks, nil
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0b45721 and 7f6a1dc.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (15)
  • cmd/root.go (1 hunks)
  • cmd/validate.go (1 hunks)
  • cmd/validate_and_fix.go (1 hunks)
  • go.mod (6 hunks)
  • internal/common/abi.go (2 hunks)
  • internal/common/trace.go (3 hunks)
  • internal/orchestrator/validator.go (1 hunks)
  • internal/rpc/rpc.go (1 hunks)
  • internal/rpc/serializer.go (2 hunks)
  • internal/storage/clickhouse.go (5 hunks)
  • internal/storage/connector.go (1 hunks)
  • internal/validation/cursor.go (1 hunks)
  • internal/validation/duplicates.go (1 hunks)
  • internal/validation/root_calculator.go (1 hunks)
  • test/mocks/MockIMainStorage.go (2 hunks)
🧬 Code Graph Analysis (2)
internal/validation/root_calculator.go (2)
internal/common/transaction.go (1)
  • Transaction (16-51)
internal/common/log.go (1)
  • Log (15-31)
internal/orchestrator/validator.go (5)
internal/storage/connector.go (1)
  • IStorage (67-71)
internal/rpc/rpc.go (1)
  • IRPCClient (42-54)
internal/orchestrator/poller.go (2)
  • Poller (22-35)
  • NewBoundlessPoller (50-72)
internal/common/block.go (2)
  • BlockData (61-66)
  • Block (8-33)
internal/validation/root_calculator.go (2)
  • CalculateLogsBloom (255-289)
  • CalculateTransactionsRoot (18-253)
✅ Files skipped from review due to trivial changes (2)
  • internal/common/abi.go
  • cmd/validate.go
🚧 Files skipped from review as they are similar to previous changes (11)
  • cmd/root.go
  • internal/rpc/rpc.go
  • internal/storage/connector.go
  • internal/rpc/serializer.go
  • internal/common/trace.go
  • internal/validation/cursor.go
  • cmd/validate_and_fix.go
  • internal/storage/clickhouse.go
  • go.mod
  • internal/validation/duplicates.go
  • test/mocks/MockIMainStorage.go
🧰 Additional context used
🧬 Code Graph Analysis (2)
internal/validation/root_calculator.go (2)
internal/common/transaction.go (1)
  • Transaction (16-51)
internal/common/log.go (1)
  • Log (15-31)
internal/orchestrator/validator.go (5)
internal/storage/connector.go (1)
  • IStorage (67-71)
internal/rpc/rpc.go (1)
  • IRPCClient (42-54)
internal/orchestrator/poller.go (2)
  • Poller (22-35)
  • NewBoundlessPoller (50-72)
internal/common/block.go (2)
  • BlockData (61-66)
  • Block (8-33)
internal/validation/root_calculator.go (2)
  • CalculateLogsBloom (255-289)
  • CalculateTransactionsRoot (18-253)
🔇 Additional comments (2)
internal/orchestrator/validator.go (1)

102-137: Well-implemented batch processing for fixing blocks.

The method efficiently processes invalid blocks in configurable batches with proper error handling and logging.

internal/validation/root_calculator.go (1)

255-289: Correct implementation of logs bloom calculation.

The function properly converts logs to go-ethereum types and uses the standard bloom filter creation method.

@iuwqyir iuwqyir merged commit f54705e into main Jul 22, 2025
6 checks passed
@iuwqyir iuwqyir deleted the 05-16-chain_validation_and_fix_command branch July 22, 2025 19:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants