Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking: StateCommitment abstraction #11830

Open
6 tasks done
Tracked by #11241
frisitano opened this issue Oct 17, 2024 · 7 comments
Open
6 tasks done
Tracked by #11241

Tracking: StateCommitment abstraction #11830

frisitano opened this issue Oct 17, 2024 · 7 comments
Labels
A-sdk Related to reth's use as a library A-trie Related to Merkle Patricia Trie implementation C-enhancement New feature or request M-prevent-stale Prevents old inactive issues/PRs from being closed due to inactivity S-needs-triage This issue needs to be labelled

Comments

@frisitano
Copy link
Contributor

frisitano commented Oct 17, 2024

Describe the feature

Overview

This issue will be used to plan the implementation of the StateCommitment implementation. The StateCommitment will be used to provide an abstraction over the types associated with the state commitment. This will allow reth to be used with different state representations configured by the user. Initial POC implemented here.

Plan

We will implement the StateCommitment via the following units of work:

Refactor codebase to leverage StateCommitment types for all operations:

Additional context

#11786

Tasks

Preview Give feedback
  1. A-db A-trie
  2. S-blocked
  3. S-blocked
@Thegaram
Copy link
Contributor

This could be added to #11241. @mattsse

@Rjected Rjected mentioned this issue Oct 21, 2024
@Rjected
Copy link
Member

Rjected commented Oct 21, 2024

Just added to the tracking issue @Thegaram

@Rjected Rjected added A-trie Related to Merkle Patricia Trie implementation A-sdk Related to reth's use as a library C-tracking-issue An issue that collects information about a broad development initiative and removed S-needs-triage This issue needs to be labelled labels Oct 30, 2024
@Rjected Rjected changed the title StateCommitment abstraction Tracking: StateCommitment abstraction Oct 30, 2024
Copy link
Contributor

This issue is stale because it has been open for 21 days with no activity.

@github-actions github-actions bot added the S-stale This issue/PR is stale and will close with no further activity label Nov 22, 2024
@Rjected Rjected added M-prevent-stale Prevents old inactive issues/PRs from being closed due to inactivity and removed S-stale This issue/PR is stale and will close with no further activity labels Nov 27, 2024
@frisitano
Copy link
Contributor Author

We have had a change of strategy at Scroll and we will no longer be using the binary Merkle Patricia trie with Poseidon key hashing. Instead, we will be migrating to native MPT, however, we would still like to land this feature in a more conservative fashion. The primary source of the invasiveness of the proposed change was the modified key hashing scheme. If we instead continue to use the standard keccak key hashing scheme the solution becomes much simpler with no need for HashedPostStateProvider, HashedStorageProvider etc. I propose that the layer at which we introduce the abstraction is the HashBuilder layer, this means we will introduce a trait that looks something like:

pub trait HashBuilderT {
    pub fn with_updates(mut self, retain_updates: bool) -> Self;
    pub fn with_proof_retainer(mut self, retainer: ProofRetainer) -> Self;
    pub fn set_updates(&mut self, retain_updates: bool);
    pub fn split(mut self) -> (Self, HashMap<Nibbles, BranchNodeCompact>);
    pub fn take_proof_nodes(&mut self) -> ProofNodes;
    pub fn updates_len(&self) -> usize;
    pub fn add_leaf(&mut self, key: Nibbles, value: &[u8]);
    pub fn add_branch(&mut self, key: Nibbles, value: B256, stored_in_database: bool);
    pub fn root(&mut self) -> B256;
}

We can then implement this trait on the different hash builder implementations as seen in the mpt and bmpt examples below:

We would then make the consumers of the HashBuilder generic over the trait HashBuilderT. For example the StateRoot:

/// `StateRoot` is used to compute the root node of a state trie.
#[derive(Debug)]
pub struct StateRoot<T, H> {
/// The factory for trie cursors.
pub trie_cursor_factory: T,
/// The factory for hashed cursors.
pub hashed_cursor_factory: H,
/// A set of prefix sets that have changed.
pub prefix_sets: TriePrefixSets,
/// Previous intermediate state.
previous_state: Option<IntermediateStateRootState>,
/// The number of updates after which the intermediate progress should be returned.
threshold: u64,
#[cfg(feature = "metrics")]
/// State root metrics.
metrics: StateRootMetrics,
}

We would use the generic HashBuilderT type when calculating the root:

I think this yields a far less invasive change and still fulfills the function of allowing for generic state commitment. If we have an agreement on this new approach I will back out previous changes related to key hashing and then create a PR for the proposed solution.

@emhane
Copy link
Member

emhane commented Feb 12, 2025

thanks a lot for the update @frisitano

@Rjected @mattsse @klkvr @rkrasiuk

@Rjected
Copy link
Member

Rjected commented Feb 12, 2025

HashBuilder generic over the trait HashBuilderT

I don't agree with the approach to abstract the hash builder. It's abstracting over a concrete implementation detail (the hash builder), rather than providing a good abstraction and API for implementing the state root. For example, we recently reimplemented the state root computation with what we call the sparse trie. The calculation involves:

  • spawning a task
  • sending it state updates
  • waiting on its final result

And this results in a very different internal architecture, and a very different API, than the hash builder. If we were to introduce this abstraction, it would be difficult / impossible to implement the sparse trie. So I would prefer we go a different route, and think about what a state commitment abstraction would need to look like from first principles

@frisitano
Copy link
Contributor Author

We currently have the following abstraction in place:

/// The `StateCommitment` trait provides associated types for state commitment operations.
pub trait StateCommitment: std::fmt::Debug + Send + Sync + Unpin + 'static {
/// The state root type.
type StateRoot<'a, TX: DbTx + 'a>: DatabaseStateRoot<'a, TX>;
/// The storage root type.
type StorageRoot<'a, TX: DbTx + 'a>: DatabaseStorageRoot<'a, TX>;
/// The state proof type.
type StateProof<'a, TX: DbTx + 'a>: DatabaseProof<'a, TX>;
/// The state witness type.
type StateWitness<'a, TX: DbTx + 'a>: DatabaseTrieWitness<'a, TX>;
/// The key hasher type.
type KeyHasher: KeyHasher;
}

With a concrete implementation for ethereum mpt:

/// The state commitment type for Ethereum's Merkle Patricia Trie.
#[derive(Debug)]
#[non_exhaustive]
pub struct MerklePatriciaTrie;
impl StateCommitment for MerklePatriciaTrie {
type StateRoot<'a, TX: DbTx + 'a> =
StateRoot<DatabaseTrieCursorFactory<'a, TX>, DatabaseHashedCursorFactory<'a, TX>>;
type StorageRoot<'a, TX: DbTx + 'a> =
StorageRoot<DatabaseTrieCursorFactory<'a, TX>, DatabaseHashedCursorFactory<'a, TX>>;
type StateProof<'a, TX: DbTx + 'a> =
Proof<DatabaseTrieCursorFactory<'a, TX>, DatabaseHashedCursorFactory<'a, TX>>;
type StateWitness<'a, TX: DbTx + 'a> =
TrieWitness<DatabaseTrieCursorFactory<'a, TX>, DatabaseHashedCursorFactory<'a, TX>>;
type KeyHasher = KeccakKeyHasher;
}

I propose that we remove the KeyHasher trait as it will remove some complexity from the codebase. We can then look to introduce a trait for the sparse trees you describe, this would be added as a GAT in StateCommitment trait.

Let me know your thoughts. Would this introduce the correct API?

@jenpaff jenpaff added S-needs-triage This issue needs to be labelled and removed C-tracking-issue An issue that collects information about a broad development initiative labels Feb 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sdk Related to reth's use as a library A-trie Related to Merkle Patricia Trie implementation C-enhancement New feature or request M-prevent-stale Prevents old inactive issues/PRs from being closed due to inactivity S-needs-triage This issue needs to be labelled
Projects
Status: Todo
Development

No branches or pull requests

5 participants