Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scroll Primitives #7

Closed
Tracked by #16
frisitano opened this issue Oct 3, 2024 · 12 comments
Closed
Tracked by #16

Scroll Primitives #7

frisitano opened this issue Oct 3, 2024 · 12 comments
Assignees
Milestone

Comments

@frisitano
Copy link
Collaborator

frisitano commented Oct 3, 2024

Overview

Scroll introduces / modifies the transaction and transaction receipt primitives. We should implement these primitives in reth or a new primitives crate.

Transaction types

Scroll supports the following transaction types:

L1 Message Transaction (typed transaction)

L1 initiated transactions use the following transaction type associated with an EIP-2718 type of 0x7E. There are no fees payed for this transaction type on L2 as the fees are already payed for when the transaction is submitted on L1. The encoding follows the standard EIP-2718 encoding

struct L1MessageTX {
    queue_index: u64,
    gas: u64,
    to: Address,
    value: U256,
    data: Bytes,
    sender: Address,
}

Receipt

An additional field is added to the TransactionReceipt. This field includes the l1 fee. See reference here.

Account

The account field has two additional fields code_size and poseiden_code_hash. Also the account hashing scheme is different. In the first instance I propose we add these additional fields behind a feature flag but in the future we should provide an abstraction such as providing the Account object in NodePrimites.

Design

Currently the Transaction type (and associated types, e.g. TxType, PooledTransactionsElement) and Receipt are hardcoded. Modifications to these types are currently achieved via feature flags as seen by the optimism feature flag. In the first instance we should implement the scroll changes behind a feature flag but we should consult with the reth team to see if we can provide the necessary abstractions to allow for configurability of these types.

@frisitano frisitano added this to the Milestone 1 milestone Oct 3, 2024
@frisitano frisitano moved this to Todo in Reth Oct 3, 2024
@frisitano frisitano added this to Reth Oct 3, 2024
@Thegaram
Copy link

Thegaram commented Oct 3, 2024

Edited to add "Type 1" to the supported transaction types.

@frisitano frisitano mentioned this issue Oct 8, 2024
10 tasks
@Thegaram
Copy link

Relevant reth tracking issue: paradigmxyz#11239

@greged93
Copy link
Collaborator

greged93 commented Nov 4, 2024

Overview

The account structure used in Scroll contains two extra fields: code_size and poseidon_code_hash. code_size needs to be present for the execution and both need to be present for the computation of the state root, see following link for the hashing scheme.

Propositions

  1. Abstract the Account. Ongoing work is being done upstream for storage abstraction which would be required.
  2. Add a feature cfg(feature = "scroll") for now, modifying the Account from Reth. This currently limits the amount of change we need to get in the code, and helps us get a better feeling about how to better abstract it.

Details

  1. Account Abstraction:
  • After execution of the payload, the HashedPostState is populated with the accounts on which changes have occurred. These accounts would need to be abstracted and implement a trait which would allow defining the hashing of the account’s address and the encoding of the account.
  • If we can extract the storage root computation of accounts outside of the StateRoot, it becomes account agnostic. The passed HashedPostState could just contain HashMap<B256, Option<Vec<u8>>.
  1. Feature gating: modify the type itself and define the way to convert a BundleAccount from revm into an Account with the additional fields.

@frisitano
Copy link
Collaborator Author

Below I will specify the changes that could be implemented in reth and revm to support the different account structure of scroll.

Reth Account Abstraction

First we will introduce an Account trait with basic functionality and implement it on our concrete ScrollAccount type.

pub trait AccountT {
    const fn has_bytecode(&self) -> bool;
    fn is_empty(&self) -> bool;
    fn get_bytecode_hash(&self) -> B256;
    fn get_nonce(&self) -> u64;
    fn get_balance(&self) -> U256;
}

pub struct ScrollAccount {
    nonce: u64,
    balance: U256,
    bytecode_hash: Option<B256>,
    poseidon_bytecode_hash: Option<B256>,
    code_size: Option<usize>,
}

impl AccountT for ScrollAccount {
    const fn has_bytecode(&self) -> bool {
        self.bytecode_hash.is_some()
    }

    fn is_empty(&self) -> bool {
        self.nonce == 0 &&
            self.balance.is_zero() &&
            self.bytecode_hash.map_or(true, |hash| hash == KECCAK_EMPTY)
    }

    fn get_bytecode_hash(&self) -> B256 {
        self.bytecode_hash.unwrap_or(KECCAK_EMPTY)
    }

    fn get_nonce(&self) -> u64 {
        self.nonce
    }

    fn get_balance(&self) -> U256 {
        self.balance
    }
}

We would then provide the concrete ScrollAccount type in scroll NodePrimitives as follows:

pub trait NodePrimitives {
    type Account: AccountT;
}

impl NodePrimitives for ScrollPrimitives {
    type Account = ScrollAccount;
}

Revm AccountInfo abstraction

In revm we would need to introduce a trait for AccountInfo. This would look something like this:

pub trait AccountInfoT {
    fn new(balance: U256, nonce: u64, code_hash: B256, code: Bytecode) -> Self;
    fn balance(&self) -> U256;
    fn nonce(&self) -> u64;
    fn code(&self) -> Option<&Bytecode>;
    fn code_hash(&self) -> B256;
    fn set_code(&mut self, code: Bytecode);
    fn set_balance(&mut self, balance: U256);
    fn set_nonce(&mut self, nonce: u64);
    fn copy_without_code(&self) -> Self;
    fn is_empty(&self) -> bool;
    fn exists(&self) -> bool;
    fn has_no_code_and_nonce(&self) -> bool;
    fn is_empty_code_hash(&self) -> bool;
    fn take_bytecode(&mut self) -> Option<Bytecode>;
    fn from_balance(balance: U256) -> Self;
    fn from_bytecode(bytecode: Bytecode) -> Self;
}

pub struct ScrollAccountInfo {
    balance: U256,
    nonce: u64,
    code_hash: B256,
    code: Option<Bytecode>,
    poseidon_code_hash: B256,
    code_size: usize,
}

impl AccountInfoT for ScrollAccountInfo {
    ...
}

We would then provide the concrete Account type by adding a new associated type in EvmWirign and specifying the ScrollAccountInfo type when configuring the evm.

pub trait EvmWiring: Sized {
    /// External context type
    type ExternalContext: Sized;

    /// Chain context type.
    type ChainContext: Sized + Default + Debug;

    /// Database type.
    type Database: Database;

    /// Account type.
    type Account: AccountT;

    /// The type that contains all block information.
    type Block: Block;

    /// The type that contains all transaction information.
    type Transaction: Transaction;

    /// The type that enumerates the chain's hardforks.
    type Hardfork: HardforkTrait;

    /// Halt reason type.
    type HaltReason: HaltReasonTrait;
}

This is a very invasive change and impacts many different data structures within revm so I think this is a no-go.

ScrollStateProviderDatabase and BundleState context

This solution would mitigate any changes needed in revm and would keep the additional account fields encapsulated in reth.

The idea would be to introduce a new ScrollStateProviderDatabase which is a modified version of StateProviderDatabase. This stateful object would include a cache for scroll code fields which is code_hash -> (code_size, poseidon_code_size) as seen below:

pub struct ScrollStateProviderDatabase<DB>{
    scroll_code: HashMap<B256, (usize, B256)>,   
    pub db: DB
}

For the implementation of DatabaseRef on ScrollStateProviderDatabase we would cache the scroll code fields before returning the AccountInfo as follows:

impl<DB: EvmStateProvider> DatabaseRef for ScrollStateProviderDatabase<DB> {
    type Error = <Self as Database>::Error;

    /// Retrieves basic account information for a given address.
    ///
    /// Returns `Ok` with `Some(AccountInfo)` if the account exists,
    /// `None` if it doesn't, or an error if encountered.
    fn basic_ref(&self, address: Address) -> Result<Option<AccountInfo>, Self::Error> {
        let account = self.basic_account(address)?;
        let (account_info, scroll_code) = account.into_parts();
        self.scroll_code.insert(account.info.code_hash, scroll_code);
        Ok(account_info)
    }

We then look to introduce an additional generic context field on BundleState, hopefully upstream should support this as the surface area shouldn't be too large.

pub struct BundleState<C> {
    /// Account state.
    pub state: HashMap<Address, BundleAccount>,
    /// All created contracts in this block.
    pub contracts: HashMap<B256, Bytecode>,
    /// Additional context associated with the bundle state.
    pub context: C,
    /// Changes to revert.
    ///
    /// Note: Inside vector is *not* sorted by address.
    /// But it is unique by address.
    pub reverts: Reverts,
    /// The size of the plain state in the bundle state.
    pub state_size: usize,
    /// The size of reverts in the bundle state.
    pub reverts_size: usize,
}

Then on State we can implement a trait TakeScrollCode which will take the scroll code map from the database and store it in BundleState.

pub trait MergeScrollCode {
    fn merge_scroll_code(&mut self, code_hash: B256);
}

impl<DB> MergeScrollCode for State<DB> {
    fn merge_scroll_code(&mut self, code_hash: B256) {
        let scroll_code = self.db.scroll_code.take();
        self.bundle_state.context = scroll_code;
    }
}

This will mean that all code size and poseiden code hashes that were read from the database are available in BundleState. For newly deployed contract code we will have the bytecode available in the BundleState such that we can compute the poseiden hash and code size at runtime. Therefore when we convert from BundleState into HashedPostState which contains a mapping of ScrollAccount's we will have all of the code information available in the environment to populate those additional fields.

@greged93
Copy link
Collaborator

greged93 commented Nov 5, 2024

I agree with the second solution. Just two nits:

  1. I think we need Database and not DatabaseRef implemented, required by BlockExecutorProvider for live sync block execution:
    /// Creates a new executor for single block execution.
    ///
    /// This is used to execute a single block and get the changed state.
    fn executor<DB>(&self, db: DB) -> Self::Executor<DB>
    where
        DB: Database<Error: Into<ProviderError> + Display>;

and similar bound is required for execution during pipelining.

  1. While waiting for the update to the BundleState to be merged upstream, I propose introducing our own ScrollBundleState:
pub struct ScrollBundleState {
    context: HashMap<B256, (usize, B256)>,
    bundle: BundleState
}

which we can re-export via the reth_evm crate:

// Convenience re-exports.
pub use revm::{self, *};
#[cfg(feature = "scroll")]
pub use revm_scroll::ScrollBundleState as BundleState

@frisitano
Copy link
Collaborator Author

I agree with the second solution. Just two nits:

  1. I think we need Database and not DatabaseRef implemented, required by BlockExecutorProvider for live sync block execution.

Agreed, the DatabaseRef trait wouldn't work as it takes an immutable &self reference but we need to modify the code cache.

  1. While waiting for the update to the BundleState to be merged upstream, I propose introducing our own ScrollBundleState:
pub struct ScrollBundleState {
    context: HashMap<B256, (usize, B256)>,
    bundle: BundleState
}

Looks reasonable to me. The only question I have is do we want to unpack fields from BundleState such that they are at the root level of ScrollBundleState. I haven't looked at the code but I wonder if there are places in the code where public fields of BundleState are accessed directly.

I would propose that we open an issue in upstream revm to suggest the introduction of this new generic context field to get feedback. Would you be able to do that please @greged93?

@greged93
Copy link
Collaborator

greged93 commented Nov 6, 2024

After some research on the areas of the codebase affected by the above changes, I came across the below related to persistence:

  • During both live sync and pipelining, state is persisted into the database using the StateChangeWriter.
  • Methods write_state_reverts and write_state_changes from the trait receive PlainStateReverts and StateChangeset respectively. These structures don't have access to Account but rather AccountInfo from revm.
  • Method write_hashed_state receives a HashedPostStateSorted which does use the Account.
  • This means we wouldn't currently be able to maintain correct state in database for the code_size and the poseidon_code_hash fields when writing reverts or changes, while it is possible for the hashed state.

The below details a proposed approach to abstract the BundleState completely, in order to avoid issues with the StateChangeWriter:

Add a ExecutionPrimitives GAT into the NodePrimitives:

pub trait NodePrimitives {
    /// Block primitive.
    type Block;
    /// Node execution primitives
    type ExecutionPrimitives: NodeExecutionPrimitives;
}

We then bound the ExecutionPrimitives GAT with NodeExecutionPrimitives, which we use to define an additional execution context and a BundleState.

pub trait NodeExecutionPrimitives {
    /// Additional context extracted during execution
    type Context;
    /// Bundle state of values changed during execution
    type BundleState: From<(revm::db::BundleState, NodeExecutionPrimitives::Context)> + BundleStateT;
}

pub trait BundleStateT {
    type Reverts;
    type StateChangeset;
    ...
}

As an example, for Scroll, we would implement the following:

type KeccakHash = B256;
type PoseidonHash = B256;
type CodeSize = usize;
type ScrollExecutionContext = HashMap<KeccakHash, (CodeSize, PoseidonHash)>;

pub struct ScrollBundleState {
    ...,
}

impl From<(revm::db::BundleState, ScrollExecutionContext)> for ScrollBundleState {
    fn from(value: (BundleState, ScrollExecutionContext)) -> Self {
        /// Construct `ScrollBundleState` fields from the bundle and the additional context
        Self {
            ...,
        }
    }
}

pub struct ScrollExecutionPrimitives;

impl NodeExecutionPrimitives for ScrollExecutionPrimitives {
    type Context = ScrollExecutionContext;
    type BundleState = ScrollBundleState;
}

Using the ScrollStateProviderDatabase, we can then retrieve this context and convert the revm::BundleState to the NodeExecutionPrimitives::BundleState.

We will start implementing the above when time allows it and pushing it to upstream, but in the meantime will progress by defining our own ScrollBundleState and re-exporting it via a feature flag in reth_revm.
The ScrollBundleState would use a ScrollAccountInfo in place of the AccountInfo from revm, as well as our own ScrollPlainStateReverts and ScrollStateChangeset, which allows us to avoid the afore mentioned issue.

@frisitano
Copy link
Collaborator Author

This looks good to me, just a few points. Do we need to introduce NodeExecutionPrimitives in NodePrimitives or can the concrete BundleState type be inferred by what is returned from the Executor? Maybe we just need to add the BundleState generic on ExecutionOutcome?

For the ScrollBundleState do we need to store the context field or can we just parse it during the conversion From<(revm::db::BundleState, ScrollExecutionContext)> for ScrollBundleState such that we construct ScrollAccountInfo in the BundleAccount's within ScrollBundleState?

@greged93
Copy link
Collaborator

greged93 commented Nov 6, 2024

This looks good to me, just a few points. Do we need to introduce NodeExecutionPrimitives in NodePrimitives or can the concrete BundleState type be inferred by what is returned from the Executor? Maybe we just need to add the BundleState generic on ExecutionOutcome?

I see, I think that would probably work and avoid the NodeExecutionPrimitives.

For the ScrollBundleState do we need to store the context field or can we just parse it during the conversion From<(revm::db::BundleState, ScrollExecutionContext)> for ScrollBundleState such that we construct ScrollAccountInfo in the BundleAccount's within ScrollBundleState?

Yes this was a mistake in the definition indeed.

@greged93
Copy link
Collaborator

greged93 commented Nov 7, 2024

This looks good to me, just a few points. Do we need to introduce NodeExecutionPrimitives in NodePrimitives or can the concrete BundleState type be inferred by what is returned from the Executor? Maybe we just need to add the BundleState generic on ExecutionOutcome?

After some discussion, the following route has been chosen:

Take advantage of the Output GAT on Executor and BatchExecutor by extracting them one level higher in the BlockExecutorProvider:

pub trait BlockExecutorProvider: Send + Sync + Clone + Unpin + 'static {
    type ExecutorOutput;
    type BatchExecutorOutput;

    type Executor<DB: Database<Error: Into<ProviderError> + Display>>: for<'a> Executor<
        DB,
        Input<'a> = BlockExecutionInput<'a, BlockWithSenders>,
        Output = Self::ExecutorOutput,
        Error = BlockExecutionError,
    >;
    type BatchExecutor<DB: Database<Error: Into<ProviderError> + Display>>: for<'a> BatchExecutor<
        DB,
        Input<'a> = BlockExecutionInput<'a, BlockWithSenders>,
        Output = BatchExecutorOutput,
        Error = BlockExecutionError,
    >;
    ...

This removes the need for the NodeExecutionPrimitives and allows us to define in our BlockExecutorProvider the structure returned after execution.

We will need to bound the ExecutorOutput and the BatchExecutorOutput by an additional trait ExecutionOutcome which would provide some fetching functionality. A fitting candidate would be the ExecutionDataProvider, which would need to be slightly updated

@greged93
Copy link
Collaborator

greged93 commented Dec 6, 2024

@frisitano I think we can close this

@frisitano
Copy link
Collaborator Author

@frisitano I think we can close this

agreed.

closed by #39, #33, #41, #42

@github-project-automation github-project-automation bot moved this from In Progress to Done in Reth Dec 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

No branches or pull requests

3 participants