Description
This is a first design of the new DA light node, based on our last discussions.
DA Memseq Node
Celestia back pressure consideration
Blocks are sent to the Celestia network for Tx ordering. This process takes time:
- sending a new blob: around 2s
- awaiting the blob confirmation and its return: from 6s to 12s
So sending a new blob and receiving it back can take 14s in the worst case. Currently, each blob contains one block.
The new way of using the DA described below tries to remove this latency from block processing. The idea is to order Tx in the light node so that the produced block can be sent back for execution immediately. In parallel, the block is sent to Celestia DA.
One condition for this strategy to work is that block production follows the rhythm of Celestia, which brings back the same 14s latency issue.
To solve it, there are 2 possibilities:
- allow Celestia to have a different block order than the one executed. Block production doesn't wait for the block to be processed by Celestia.
- put several blocks in the Celestia blob so that block processing doesn't depend on the Celestia blob send.
Use DA to bootstrap fullnode.
The following design removes Celestia from the critical path and uses the node DB to bootstrap the fullnode by providing saved blocks until the node reaches the tip.
Another possibility is to use Celestia to bootstrap the node and only save each block temporarily in the node until it is sent to Celestia.
This approach is compatible with a WAL mechanism that saves the blocks until they are sent.
Actors:
Fullnode: a node that runs the RPC and the Tx executor. It gets Tx from users and executes the blocks generated by the DA node.
DA node: aggregates batches into blocks, orders Tx, and propagates the blocks to Celestia and to all connected fullnodes.
Difference from before: there is no Leader or Follower. All nodes are fullnodes. They can receive Tx and execute blocks from the DA node.
Needs:
- FullNode: get user Tx and aggregate them in a batch
- FullNode: send batches to the DA node
- FullNode: get blocks to execute from the DA node
- FullNode: be able to bootstrap from the DA node and execute old blocks until the tip
- DA Node: get all Tx batches from all connected FullNodes
- DA Node: create a new block from Tx batches and order its Tx
- DA Node: send created blocks to the Celestia network
- DA Node: send created blocks to all FullNodes that request them
- DA Node: persist all batches / blocks
- DA Node: provide any persisted block
Tx flow:
FullNode:
- Get user Tx from RPC entry point
- Verify Tx
- Validate Tx
- Create batch of Tx using Aptos Mempool
- Send batch to DA Node
- Get new block from DA
- Execute block
DA Node:
- aggregate batch
- create block and associate new height
- Persist block + height
- provide block for any height to fullnode
- send new block to Celestia network.
flowchart TD
subgraph Fullnode
RCVTX("Get TX from RPC
Verify Tx
Validate TX
Add Tx to mempool")
MEMP("Get Batch
Sign Batch
Send Batch to DA node")
GETBL("Get block at Height
Execute block")
end
subgraph DA Sequencer Node
RCVBA("Receive Batch
Verify Batch Sign
Save batch as pending" )
BLDBL("Build block:
Get pending batch
Aggregate in block
Define new height
Save block in DB
Remove pending batch")
SDNCE("Send new block digests to Celestia DA")
MEM("Block storage:
Get block
Save block")
end
subgraph Celestia DA
RCVBL("Receive blocks as blob")
end
RCVTX-- build batch -->MEMP
MEMP-- Send batch -->RCVBA
RCVBA-- Send block -->BLDBL
MEM-- Send block digest -->SDNCE
SDNCE-- Send blob-->RCVBL
BLDBL-- Save block -->MEM
MEM-- Get at height -->GETBL
RCVBL-- Confirm DA / block height -->MEM
Functionality to develop:
1) Fullnode: Send Mempool signed batch to DA node.
Change the current Fullnode code to send the batch in a dedicated task once it is retrieved from the Aptos CoreMempool. Remove the unnecessary loop in transaction_ingress.rs.
Add batch signing using a new Ed25519 key. Add the key to the config under maptos_config/mempool.
2) DA Node: Submit Batch and verify
Use the existing gRPC submit_batch call. Remove all unnecessary calls.
Needed calls:
- rpc StreamReadFromHeight (StreamReadFromHeightRequest) returns (stream StreamReadFromHeightResponse);
  Returns one blob by height.
- rpc ReadAtHeight (ReadAtHeightRequest) returns (ReadAtHeightResponse);
  Change it to return one or none.
- rpc BatchWrite (BatchWriteRequest) returns (BatchWriteResponse);
  A batch is a vec of Tx.
As the gRPC entry point is public, batches are signed by the fullnode and verified by the DA node.
Use Ed25519 keys.
Fullnode public keys are stored in a separate file in a list of 0x
The file is read every minute so that the keys can be updated without restarting the node.
The file path is defined in .movement/config.json, under the celestia_da_light_node_config/da_light_node section.
Validate whether the current whitelist validation is still required.
Save the batch Tx in memseq to persist them. Each Tx is saved individually, as is done today.
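The key file handling could be sketched as follows. This is only an illustration under assumptions: the file format (one 0x-prefixed, 64-character hex key per line) is a guess, and the names `PublicKeyBytes`, `parse_key`, `parse_whitelist` and `load_whitelist` are hypothetical. Signature verification itself is left out.

```rust
use std::collections::HashSet;
use std::fs;
use std::io;
use std::path::Path;

/// Raw Ed25519 public key bytes (32 bytes). Hypothetical alias.
pub type PublicKeyBytes = [u8; 32];

/// Parses one line of the key file.
/// ASSUMPTION: a key is written as "0x" followed by 64 hex characters.
pub fn parse_key(line: &str) -> Option<PublicKeyBytes> {
    let hex = line.trim().strip_prefix("0x")?;
    // Reject anything that is not exactly 64 ASCII hex digits.
    if hex.len() != 64 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let mut key = [0u8; 32];
    for (i, byte) in key.iter_mut().enumerate() {
        *byte = u8::from_str_radix(&hex[2 * i..2 * i + 2], 16).ok()?;
    }
    Some(key)
}

/// Builds the set of accepted fullnode keys, skipping malformed lines.
pub fn parse_whitelist(contents: &str) -> HashSet<PublicKeyBytes> {
    contents.lines().filter_map(parse_key).collect()
}

/// Reads the key file from disk. A periodic task (every minute in the
/// description above) could call this and swap in the returned set.
pub fn load_whitelist(path: &Path) -> io::Result<HashSet<PublicKeyBytes>> {
    Ok(parse_whitelist(&fs::read_to_string(path)?))
}
```

Skipping malformed lines instead of failing keeps the node running if the file is edited by hand; whether that is acceptable or a parse error should be logged or rejected is a design choice to validate.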
3) Build block: Aggregate pending Tx in block
Update the code in da/movement/protocol/light_node/src/sequencer.rs::batch_write to build the new batch aggregation.
When a batch is received, all its Tx are saved in the pending batch table of memseq. When a new blob is needed, the pending Tx are aggregated into a new blob.
A new height is associated with the blob. Heights are incremented one by one.
Block aggregation takes as many pending Tx as possible, until the max block size in bytes is reached or there are no more pending Tx.
The remaining Tx will be aggregated into the next blob.
The new blob/height is saved in memseq.
The integrated pending Tx are removed from memseq. This removal is done only after the block is built and saved in the memseq block table.
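The aggregation rule above can be sketched as follows. This is a minimal illustration and not the existing sequencer code: the `Tx` and `Block` types, the `build_block` function and the `save` callback (standing in for the write to the memseq block table) are all hypothetical.

```rust
use std::collections::VecDeque;

/// A pending transaction as opaque bytes (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Tx(pub Vec<u8>);

/// An ordered list of Tx with its height (hypothetical layout).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Block {
    pub height: u64,
    pub txs: Vec<Tx>,
}

/// Takes Tx from the front of `pending`, in order, until the next one
/// would push the block over `max_block_bytes` or no Tx is left.
/// `save` stands in for persisting the block; pending Tx are removed
/// only after it succeeds. Returns `Ok(None)` when nothing was packed.
/// Note: a single Tx larger than the limit is never packed here;
/// handling that case is left out of this sketch.
pub fn build_block<F>(
    pending: &mut VecDeque<Tx>,
    last_height: u64,
    max_block_bytes: usize,
    save: F,
) -> Result<Option<Block>, String>
where
    F: FnOnce(&Block) -> Result<(), String>,
{
    let mut size = 0usize;
    let mut count = 0usize;
    for tx in pending.iter() {
        if size + tx.0.len() > max_block_bytes {
            break;
        }
        size += tx.0.len();
        count += 1;
    }
    if count == 0 {
        return Ok(None);
    }
    let block = Block {
        // Heights are incremented one by one.
        height: last_height + 1,
        txs: pending.iter().take(count).cloned().collect(),
    };
    // Persist first; on failure the pending Tx stay untouched.
    save(&block)?;
    pending.drain(..count);
    Ok(Some(block))
}
```

Saving before draining means a crash or a failed write can at worst produce the same block again, never lose a Tx.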
4) Memseq: Persist block
Pending Tx and blocks are persisted using the rocksdb memseq as they are today, but are never removed. Validate why Liam wanted a WAL. We need to query the storage to get a blob by height.
Tables / access needed:
- current blob height: unique value, write/read
- pending batches Tx: add new batch Tx, read all pending Tx, remove some pending Tx
- (blob/height): write (block/height), get block by height.
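The access patterns listed above can be modelled with an in-memory stand-in. This is only a sketch to make the three tables concrete: `MemStore` and its methods are hypothetical names, and the real storage would live in rocksdb rather than in `BTreeMap`s.

```rust
use std::collections::BTreeMap;

/// In-memory stand-in for the three tables (hypothetical).
#[derive(Default)]
pub struct MemStore {
    /// Current blob height: a single value.
    current_height: u64,
    /// Arrival counter used as key so pending Tx keep their order.
    next_pending_seq: u64,
    /// Pending Tx, keyed by arrival sequence number.
    pending: BTreeMap<u64, Vec<u8>>,
    /// Blocks, keyed by height.
    blocks: BTreeMap<u64, Vec<u8>>,
}

impl MemStore {
    pub fn current_height(&self) -> u64 {
        self.current_height
    }

    /// Adds every Tx of a batch individually to the pending table.
    pub fn add_pending_batch(&mut self, batch: Vec<Vec<u8>>) {
        for tx in batch {
            self.pending.insert(self.next_pending_seq, tx);
            self.next_pending_seq += 1;
        }
    }

    /// Reads all pending Tx in arrival order.
    pub fn read_pending(&self) -> Vec<(u64, Vec<u8>)> {
        self.pending.iter().map(|(seq, tx)| (*seq, tx.clone())).collect()
    }

    /// Removes the pending Tx that were integrated in a block.
    pub fn remove_pending(&mut self, seqs: &[u64]) {
        for seq in seqs {
            self.pending.remove(seq);
        }
    }

    /// Writes a block at the next height and returns that height.
    pub fn write_block(&mut self, block: Vec<u8>) -> u64 {
        let height = self.current_height + 1;
        self.blocks.insert(height, block);
        self.current_height = height;
        height
    }

    /// Returns the block at `height`, or `None` if it does not exist.
    pub fn block_at(&self, height: u64) -> Option<&[u8]> {
        self.blocks.get(&height).map(Vec::as_slice)
    }
}
```

In a persistent store, the block write and the height update would need to be atomic (for example in one write batch), otherwise a crash between the two could leave the height pointing at a missing block.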
5) Send block to Celestia
As Celestia block propagation is a lot slower than normal block propagation, a Celestia blob will contain one or several blocks.
During a Celestia send, produced blocks are kept until the Celestia send returns. If some blocks are pending, a new blob is constructed by creating an array of blocks in the order they were produced, first produced first. A new block is added to the array only if adding it doesn't make the blob exceed the max Celestia block size.
To avoid missing a block in case of a node crash, when the Celestia module receives a block, its height is persisted. When the blocks are sent, their persisted heights are removed.
When the Celestia module starts, it gets the persisted block heights and resends those blocks in order, aggregating them in blobs, before sending new blocks.
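The queueing and crash-recovery bookkeeping above can be sketched as follows. This is an illustration under assumptions: `ProducedBlock` and `CelestiaOutbox` are hypothetical names, the persisted heights are modelled with an in-memory set instead of a real table, and the actual Celestia submission is left out.

```rust
use std::collections::{BTreeSet, VecDeque};

/// A produced block waiting to be sent to Celestia (hypothetical type).
#[derive(Debug, Clone)]
pub struct ProducedBlock {
    pub height: u64,
    pub bytes: Vec<u8>,
}

/// Blocks waiting for Celestia plus the heights persisted for recovery.
#[derive(Default)]
pub struct CelestiaOutbox {
    queue: VecDeque<ProducedBlock>,
    /// Stand-in for the on-disk table of heights not yet sent.
    persisted_heights: BTreeSet<u64>,
}

impl CelestiaOutbox {
    /// Persists the height of a received block, then queues the block.
    pub fn receive(&mut self, block: ProducedBlock) {
        self.persisted_heights.insert(block.height);
        self.queue.push_back(block);
    }

    /// Selects blocks for the next blob, first produced first, stopping
    /// before the block that would exceed `max_blob_bytes`.
    /// Note: a single block larger than the limit yields an empty blob;
    /// handling that case is left out of this sketch.
    pub fn next_blob(&self, max_blob_bytes: usize) -> Vec<&ProducedBlock> {
        let mut size = 0usize;
        let mut blob = Vec::new();
        for block in &self.queue {
            if size + block.bytes.len() > max_blob_bytes {
                break;
            }
            size += block.bytes.len();
            blob.push(block);
        }
        blob
    }

    /// Called once the Celestia send returned for the first `count`
    /// queued blocks: drops them and their persisted heights.
    pub fn confirm_sent(&mut self, count: usize) {
        let count = count.min(self.queue.len());
        for block in self.queue.drain(..count) {
            self.persisted_heights.remove(&block.height);
        }
    }

    /// Heights to resend first after a restart, in order.
    pub fn pending_heights(&self) -> Vec<u64> {
        self.persisted_heights.iter().copied().collect()
    }
}
```

On restart, the module would read `pending_heights`, load those blocks by height from the block storage, and pass them through `receive` again before accepting new blocks.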
6) Query block from its height
Use the gRPC entry point and the memseq function to return a blob at the specified height. Return one, or none if not found.
Clean up the existing code by removing unnecessary traits, functions, and processes.
Code integration in existing codebase:
- Create a new crate in network called 'movement-da-memseq-node': it contains the startup part of the node and the main loop that processes sub tasks.
- Create a new crate in protocol-units called 'da-memseq-node' and put each module there.