[DA node]: implement new DA light node #1117

Open
9 of 17 issues completed
@musitdev

Description

This is a first design of the new DA light node, based on our last discussions.

DA Memseq Node

Celestia back pressure consideration

Blocks are sent to the Celestia network for Tx ordering. This process takes time:

  • send a new blob: around 2s
  • await the blob confirmation and return: from 6s to 12s

So sending a new blob and receiving its confirmation can take 14s in the worst case (2s + 12s). Currently each blob contains one block.

The new way of using the DA described below tries to remove this latency from block processing. The idea is to order Tx in the light node so that the produced block can be sent back for execution immediately. In parallel, the block is sent to the Celestia DA.

Applied naively, this strategy only works if block production follows the rhythm of Celestia, which brings back the same 14s latency issue.

To solve it, there are 2 possibilities:

  • allow Celestia to hold a different block order than the one executed: block production does not wait for Celestia to process the block.

  • put several blocks in each Celestia blob so that block processing does not depend on the Celestia blob send.

Use DA to bootstrap fullnode.

The following design removes Celestia from the critical path and uses the node DB to bootstrap a fullnode, serving saved blocks until the fullnode reaches the tip.

Another possibility is to use Celestia to bootstrap the node and only save each block temporarily in the node until it is sent to Celestia.

This approach is compatible with a WAL mechanism that keeps the blocks until they are sent.

Actors:

Fullnode: a node that runs the RPC and the Tx executor. It gets Tx from users and executes the blocks generated by the DA node.
DA node: aggregates batches into blocks, orders Tx, and propagates each block to Celestia and to all connected fullnodes.

Difference from before: there is no Leader or Follower. All nodes are fullnodes. They can receive Tx and execute blocks from the DA node.

Needs:

  • FullNode: Get user Tx and aggregate in Batch
  • FullNode: Send batch to DA node
  • FullNode: Get block to execute from DA node
  • FullNode: Be able to bootstrap from DA Node and execute old block until the Tip
  • DA Node: get all Tx batch from all connected FullNode
  • DA Node: Create new block from Tx batch and order its Tx.
  • DA Node: send created block to Celestia network.
  • DA Node: Send created block to all Fullnode that request it.
  • DA Node: Persist all batch / block.
  • DA Node: provide any persisted block

Tx flow:

FullNode:

  • get user Tx from RPC entry point

  • Verify Tx

  • Validate Tx

  • Create batch of Tx using Aptos Mempool

  • Send Batch to DA Node

  • Get new block from DA

  • Execute Block

DA Node:

  • aggregate batch
  • create block and associate new height
  • Persist block + height
  • provide block for any height to fullnode
  • send new block to Celestia network.

```mermaid
flowchart TD
  subgraph Fullnode
    RCVTX("Get TX from RPC
    Verify Tx 
    Validate TX
    Add Tx to mempool")
    MEMP("Get Batch
    Sign Batch
    Send Batch to DA node")
    GETBL("Get block at Height
    Execute block")
  end
  subgraph DA Sequencer Node
    RCVBA("Receive Batch
    Verify Batch Sign
    Save batch as pending" )
    BLDBL("Build block:
    Get pending batch
    Aggregate in block
    Define new height
    Save block in DB
    Remove pending batch")
    SDNCE("Send new block digests to Celestia DA")
    MEM("Block storage:
    Get block
    Save block")
  end
  subgraph Celestia DA
    RCVBL("Receive blocks as blob")
  end

  RCVTX-- build batch -->MEMP
  MEMP-- Send batch -->RCVBA
  RCVBA-- Send block -->BLDBL
  MEM-- Send block digest -->SDNCE
  SDNCE-- Send blob-->RCVBL
  BLDBL-- Save block -->MEM
  MEM-- Get at height -->GETBL
  RCVBL-- Confirm DA / block height -->MEM
```

Functionality to develop:

1) Fullnode: Send Mempool signed batch to DA node.

Change the current Fullnode code to send each batch in a dedicated task as soon as it is retrieved from the Aptos CoreMempool. Remove the unnecessary loop in transaction_ingress.rs.
Sign batches with a new Ed25519 key. Add the key to the config under maptos_config/mempool.

2) DA Node: Submit Batch and verify

Use the existing gRPC submit_batch call. Remove all unnecessary calls.
Needed calls:

  • rpc StreamReadFromHeight (StreamReadFromHeightRequest) returns (stream StreamReadFromHeightResponse); returns one blob per height.
  • rpc ReadAtHeight (ReadAtHeightRequest) returns (ReadAtHeightResponse); change it to return one blob or none.
  • rpc BatchWrite (BatchWriteRequest) returns (BatchWriteResponse); a batch is a vec of Tx.

As the gRPC entry point is public, batches are signed by the fullnode and verified by the DA node.
Both sides use Ed25519 keys.
Fullnode public keys are stored in a separate file, as a list of 0x-prefixed keys.
The file is re-read every minute so that the keys can be updated without restarting the node.
The file path is defined in .movement/config.json under the celestia_da_light_node_config/da_light_node section.
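
As an illustration of the key file handling described above, here is a minimal sketch using only the standard library. The file layout (one `0x`-prefixed, 64-hex-digit key per line, `#` comments allowed), the `KeyWhitelist` type and its method names are assumptions made for the example, not the existing code. Keeping the previous key set when a reload fails is also an assumption about the desired behavior.

```rust
use std::collections::HashSet;
use std::fs;
use std::path::PathBuf;
use std::time::{Duration, Instant};

/// Raw bytes of an Ed25519 public key (32 bytes).
type PublicKeyBytes = [u8; 32];

/// Parses one `0x`-prefixed line of 64 hex digits into key bytes.
fn parse_key(line: &str) -> Result<PublicKeyBytes, String> {
    let hex = line
        .strip_prefix("0x")
        .ok_or_else(|| format!("missing 0x prefix: {line}"))?;
    if hex.len() != 64 || !hex.chars().all(|c| c.is_ascii_hexdigit()) {
        return Err(format!("expected 64 hex digits: {line}"));
    }
    let mut key = [0u8; 32];
    for (i, byte) in key.iter_mut().enumerate() {
        *byte = u8::from_str_radix(&hex[2 * i..2 * i + 2], 16)
            .map_err(|e| format!("invalid hex in {line}: {e}"))?;
    }
    Ok(key)
}

/// Parses the file content: one key per line; blank lines and `#` comments are skipped.
fn parse_key_list(content: &str) -> Result<HashSet<PublicKeyBytes>, String> {
    content
        .lines()
        .map(str::trim)
        .filter(|l| !l.is_empty() && !l.starts_with('#'))
        .map(parse_key)
        .collect()
}

/// Hypothetical holder for the fullnode keys, refreshed at most once per interval.
struct KeyWhitelist {
    path: PathBuf,
    keys: HashSet<PublicKeyBytes>,
    last_load: Option<Instant>,
    interval: Duration,
}

impl KeyWhitelist {
    fn new(path: PathBuf) -> Self {
        Self { path, keys: HashSet::new(), last_load: None, interval: Duration::from_secs(60) }
    }

    /// Re-reads the file if the interval has elapsed.
    /// On a read or parse error the previous key set is kept (assumption).
    fn refresh_if_due(&mut self, now: Instant) {
        let due = self
            .last_load
            .map_or(true, |t| now.duration_since(t) >= self.interval);
        if !due {
            return;
        }
        self.last_load = Some(now);
        let loaded = fs::read_to_string(&self.path)
            .map_err(|e| e.to_string())
            .and_then(|content| parse_key_list(&content));
        match loaded {
            Ok(keys) => self.keys = keys,
            Err(e) => eprintln!("keeping previous key list: {e}"),
        }
    }

    /// True if the batch signer's public key is in the current list.
    fn is_allowed(&self, key: &PublicKeyBytes) -> bool {
        self.keys.contains(key)
    }
}
```

The DA node would call `refresh_if_due` from its main loop or a timer task, then check `is_allowed` before verifying the batch signature itself.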

Validate if the current whitelist validation is still required.

Save the batch Tx in memseq to persist them. Each Tx is saved individually, as today.

3) Build block: Aggregate pending Tx in block

Update the code in da/movement/protocol/light_node/src/sequencer.rs::batch_write to build the new batch aggregation.

When a batch is received, all its Tx are saved in the pending batch table of memseq. When a new blob is needed, the pending Tx are aggregated into a new blob.
A new height is associated with the blob. Heights are incremented one by one.
Block aggregation takes as many pending Tx as possible, until the max block size in bytes is reached or there are no more pending Tx.
The remaining Tx will be aggregated into the next blob.

The new blob/height is saved in memseq.
The pending Tx that were integrated are then removed from memseq. This removal is done only after the block has been built and saved in the memseq block table.
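
The aggregation steps above can be sketched as follows. This is an in-memory illustration only: `Sequencer`, `build_block` and `max_block_bytes` are hypothetical names, a Tx is modelled as raw bytes, and the real code would read and write the memseq rocksdb tables instead of the collections used here.

```rust
use std::collections::{BTreeMap, VecDeque};

/// A transaction, modelled as opaque bytes for this sketch.
type Tx = Vec<u8>;

#[derive(Debug, Clone, PartialEq)]
struct Block {
    height: u64,
    txs: Vec<Tx>,
}

/// In-memory stand-in for the memseq pending and block tables.
#[derive(Default)]
struct Sequencer {
    pending: VecDeque<Tx>,
    blocks: BTreeMap<u64, Block>,
    current_height: u64,
}

impl Sequencer {
    /// Saves every Tx of a received batch as pending, in arrival order.
    fn batch_write(&mut self, batch: Vec<Tx>) {
        self.pending.extend(batch);
    }

    /// Builds the next block from pending Tx, up to `max_block_bytes`.
    /// Returns `None` when nothing can be included.
    /// Assumes every single Tx fits in a block; an oversized Tx would
    /// have to be rejected when the batch is received.
    fn build_block(&mut self, max_block_bytes: usize) -> Option<Block> {
        // Count how many pending Tx fit, without removing them yet.
        let mut size = 0usize;
        let mut count = 0usize;
        for tx in &self.pending {
            if size + tx.len() > max_block_bytes {
                break;
            }
            size += tx.len();
            count += 1;
        }
        if count == 0 {
            return None;
        }

        // Heights are incremented one by one.
        let height = self.current_height + 1;
        let block = Block {
            height,
            txs: self.pending.iter().take(count).cloned().collect(),
        };

        // Save the block first...
        self.blocks.insert(height, block.clone());
        self.current_height = height;

        // ...and only then remove the Tx it integrated.
        for _ in 0..count {
            self.pending.pop_front();
        }
        Some(block)
    }
}
```

Saving before removing means a crash between the two steps leaves Tx both in a block and in the pending table, so a persistent version would need to tolerate or deduplicate that case on restart.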

4) Memseq: Persist block

Pending Tx and blocks are persisted using the rocksdb-backed memseq, as today, but blocks are never removed. Validate why Liam wanted a WAL. We need to query the storage to get a blob by height.

Need tables / access:

  • current blob height: unique value, write/read
  • pending batches Tx: add new batch Tx, read all pending Tx, remove some pending Tx
  • (blob/height): write (block/height), get block by height.
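
One possible shape for these three accesses, written as a trait with an in-memory implementation so it can be run as-is. The trait name `BlockStore`, its methods and the `MemStore` type are assumptions for illustration; the actual memseq API over rocksdb may look different.

```rust
use std::collections::{BTreeMap, VecDeque};

type Tx = Vec<u8>;
type Blob = Vec<u8>;

/// Hypothetical storage interface covering the three tables listed above.
trait BlockStore {
    /// Current blob height (0 before the first block is written).
    fn current_height(&self) -> u64;
    /// Adds the Tx of a new batch to the pending table.
    fn add_pending(&mut self, batch: Vec<Tx>);
    /// Reads all pending Tx in arrival order.
    fn pending(&self) -> Vec<Tx>;
    /// Removes the `count` oldest pending Tx.
    fn remove_pending(&mut self, count: usize);
    /// Writes a blob at `height`, which must be `current_height() + 1`.
    fn write_block(&mut self, height: u64, blob: Blob) -> Result<(), String>;
    /// Returns the blob at `height`, or `None` if it does not exist.
    fn block_at(&self, height: u64) -> Option<Blob>;
}

/// In-memory implementation, standing in for rocksdb column families.
#[derive(Default)]
struct MemStore {
    height: u64,
    pending: VecDeque<Tx>,
    blocks: BTreeMap<u64, Blob>,
}

impl BlockStore for MemStore {
    fn current_height(&self) -> u64 {
        self.height
    }

    fn add_pending(&mut self, batch: Vec<Tx>) {
        self.pending.extend(batch);
    }

    fn pending(&self) -> Vec<Tx> {
        self.pending.iter().cloned().collect()
    }

    fn remove_pending(&mut self, count: usize) {
        for _ in 0..count.min(self.pending.len()) {
            self.pending.pop_front();
        }
    }

    fn write_block(&mut self, height: u64, blob: Blob) -> Result<(), String> {
        // Enforce that heights grow one by one, with no gap or rewrite.
        if height != self.height + 1 {
            return Err(format!("expected height {}, got {height}", self.height + 1));
        }
        self.blocks.insert(height, blob);
        self.height = height;
        Ok(())
    }

    fn block_at(&self, height: u64) -> Option<Blob> {
        self.blocks.get(&height).cloned()
    }
}
```

In a persistent version, updating the current height and writing the blob would ideally happen in one atomic write so the two tables cannot disagree after a crash.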

5) Send block to Celestia

As Celestia block propagation is a lot slower than normal block propagation, a Celestia blob will contain one or several blocks.
During a Celestia send, produced blocks are kept until the send returns. If some blocks are pending, a new blob is built as an array of blocks in the order they were produced, oldest first. A block is added to the array only if adding it keeps the blob size within the max Celestia block size.

To avoid losing a block if the node crashes, the Celestia module persists a block's height when it receives the block. Once the blocks are sent, their persisted heights are removed.

When the Celestia module starts, it reads the persisted block heights and resends those blocks in order, aggregated into blobs, before sending any new block.
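
A minimal sketch of this bookkeeping, assuming the unsent heights live in an ordered set (persisted in the real node) and block bytes are loaded by height from the block storage. `CelestiaSender`, `next_blob`, `on_blob_confirmed` and `max_blob_bytes` are hypothetical names, and the actual Celestia submission is left out.

```rust
use std::collections::BTreeSet;

/// Tracks which block heights still have to reach Celestia.
/// In the real node this set would be persisted so it survives a crash.
#[derive(Default)]
struct CelestiaSender {
    unsent: BTreeSet<u64>,
}

impl CelestiaSender {
    /// Records a newly produced block as not yet sent.
    fn on_new_block(&mut self, height: u64) {
        self.unsent.insert(height);
    }

    /// Picks the blocks of the next blob: oldest unsent height first,
    /// adding blocks while the total stays within `max_blob_bytes`.
    /// `load` fetches the block bytes for a height from block storage.
    fn next_blob(
        &self,
        max_blob_bytes: usize,
        load: impl Fn(u64) -> Option<Vec<u8>>,
    ) -> Vec<(u64, Vec<u8>)> {
        let mut blob = Vec::new();
        let mut size = 0usize;
        for &height in &self.unsent {
            // Stop at the first gap or oversized addition to preserve order.
            let Some(bytes) = load(height) else { break };
            if size + bytes.len() > max_blob_bytes {
                break;
            }
            size += bytes.len();
            blob.push((height, bytes));
        }
        blob
    }

    /// Forgets the heights once Celestia has confirmed the blob.
    fn on_blob_confirmed(&mut self, heights: &[u64]) {
        for height in heights {
            self.unsent.remove(height);
        }
    }
}
```

On restart, reloading `unsent` from disk and calling `next_blob` in a loop resends the leftover blocks in order before any new block, because the set always yields the lowest heights first.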

6) Query block from its height

Use the gRPC entry point and the memseq function to return the blob at a specified height. Return one blob, or none if not found.
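
On the fullnode side, a "one or none" read is enough to bootstrap up to the tip. The sketch below assumes a hypothetical `BlockSource` trait wrapping the ReadAtHeight call and an `execute` callback standing in for the executor; neither is the existing API.

```rust
use std::collections::BTreeMap;

/// Hypothetical client-side view of the ReadAtHeight call:
/// returns the blob at `height`, or `None` if the DA node has none.
trait BlockSource {
    fn read_at_height(&self, height: u64) -> Option<Vec<u8>>;
}

/// In-memory source used to run the example.
impl BlockSource for BTreeMap<u64, Vec<u8>> {
    fn read_at_height(&self, height: u64) -> Option<Vec<u8>> {
        self.get(&height).cloned()
    }
}

/// Executes every block after `last_executed` until the source has no
/// more blocks, and returns the height reached (the tip).
fn bootstrap<S: BlockSource>(
    source: &S,
    last_executed: u64,
    mut execute: impl FnMut(u64, &[u8]),
) -> u64 {
    let mut height = last_executed;
    while let Some(block) = source.read_at_height(height + 1) {
        height += 1;
        execute(height, &block);
    }
    height
}
```

Once `bootstrap` returns, the fullnode would switch to following new blocks, for example through StreamReadFromHeight starting at the returned height plus one.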

Clean up the existing code by removing unnecessary traits, functions, and processes.

Code integration in existing codebase:

  • Create a new crate in network called 'movement-da-memseq-node': it contains the startup part of the node and the main loop that processes the sub-tasks.

  • Create a new crate in protocol-units called 'da-memseq-node' and put each module there.
