Fix/gateway message processing #1991

Merged
merged 14 commits into main on Oct 9, 2024
Conversation

@cdamian (Contributor) commented Sep 19, 2024:

Description

Various fixes for LP gateway-related logic.

Changes

LP Gateway

  • Make MessageProcessor implementation transactional to ensure that storage changes are reverted on failure.
  • Move the iteration over inbound sub-messages into the execution logic.
  • Add session ID check for inbound message entries when checking for votes/proofs.
  • Use defensive weight when executing message recovery.
  • Add more tests that cover session ID changes and storage rollback.

LP Gateway Queue

  • Remove extra weight check.
  • Get MessageQueue keys, order them, and iterate over them when servicing the MessageQueue.

Checklist:

  • I have added Rust doc comments to structs, enums, traits and functions
  • I have made corresponding changes to the documentation
  • I have performed a self-review of my code
  • I have added tests that prove my fix is effective or that my feature works

@@ -202,9 +202,13 @@ pub mod pallet {
    fn service_message_queue(max_weight: Weight) -> Weight {
        let mut weight_used = Weight::zero();

        let mut processed_entries = Vec::new();
        let mut nonces = MessageQueue::<T>::iter_keys().collect::<Vec<_>>();
        nonces.sort();
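
For context, a hedged sketch of how the sorted nonces might then be consumed; process_message and the exact weight accounting below are stand-ins, not the pallet's actual code:

    for nonce in nonces {
        if weight_used.any_gt(max_weight) {
            // Out of block weight; leave the rest for a later block.
            break;
        }
        if let Some(message) = MessageQueue::<T>::get(nonce) {
            // Hypothetical helper standing in for the real processing path.
            let weight = Self::process_message(nonce, message);
            weight_used = weight_used.saturating_add(weight);
            processed_entries.push(nonce);
        }
    }
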
@cdamian (Contributor, Author):

Since we are collecting the keys here, it also made sense to me to sort them, just in case. Please let me know if there are any objections to this.

@lemunozm (Contributor):

Although this new solution is simpler, I think there is a problem here:

MessageQueue::<T>::iter_keys().collect::<Vec<_>>();

It can collect many keys, making the block impossible to build.

I think we need a more complex structure here that allows us to store the keys already organized.

@cdamian (Contributor, Author):

> It can collect many keys, making the block impossible to build.

Can you elaborate on this please?

@cdamian (Contributor, Author):

How about limiting the number of keys that we collect via:

let mut nonces = MessageQueue::<T>::iter_keys().take(MAX_MESSAGES_PER_BLOCK).collect::<Vec<_>>();

Contributor:

> Can you elaborate on this please?

When you collect, the iterator makes one read per item, and there could be enough items to exceed the block's weight capacity.

The take(MAX_MESSAGES_PER_BLOCK) still does not work, because a message that ideally should be processed first could be left in the queue: iter_keys yields keys in hashed-storage order, so taking the first N keys before sorting may skip the lowest nonces.

Contributor:

Not sure if some ordered structure exists in Substrate for this. If not, we would have to create some complex/annoying structure to organize the way the messages are stored.

But I'm not able to see a super simple way, TBH. We can leave that fix for another PR, to unblock this one, if we see it's not easy.

@cdamian (Contributor, Author):

Maybe there's a simpler solution that involves using the latest message nonce. I'll try something on a different branch.

@cdamian (Contributor, Author):

Something similar to - #1992
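
For illustration, one possible shape of that latest-nonce idea (entirely hypothetical; the storage item names are invented here, and #1992 may do this differently):

    // Walk the dense nonce range instead of collecting every key: nonces are
    // assigned sequentially, so (last_processed, latest] covers the queue in
    // order without reading all keys up front.
    let last_processed = LastProcessedNonce::<T>::get();
    let latest = MessageNonceStore::<T>::get();

    for nonce in (last_processed + 1)..=latest {
        if weight_used.any_gt(max_weight) {
            break;
        }
        // Entries may already have been removed; skip gaps.
        if let Some(message) = MessageQueue::<T>::get(nonce) {
            // ... process, then record this nonce as the last processed ...
        }
    }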

@cdamian marked this pull request as ready for review September 19, 2024 11:31
@cdamian self-assigned this Sep 19, 2024
@lemunozm (Contributor) left a comment:

Thanks for all the fixes!

BTW, not sure if we should move this to the internal repo, to avoid forking both main branches too much.


Comment on lines 668 to 676
if res.is_ok() {
    TransactionOutcome::Commit(Ok::<(DispatchResult, Weight), DispatchError>((
        res, weight,
    )))
} else {
    TransactionOutcome::Rollback(Ok::<(DispatchResult, Weight), DispatchError>((
        res, weight,
    )))
}
Contributor:

Super nit: I think adding the return type in the closure allows removing the Ok::<...> here:

with_transaction(|| -> DispatchResult {

@cdamian (Contributor, Author):

That won't work given:

pub fn with_transaction<T, E, F>(f: F) -> Result<T, E>
where
	E: From<DispatchError>,
	F: FnOnce() -> TransactionOutcome<Result<T, E>>, // this
{

@cdamian (Contributor, Author):

But given that we will always return LP_DEFENSIVE_WEIGHT for both inbound and outbound, we can change this a bit. I'll ping you when done.
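
For illustration, a sketch of that simplification: once both paths always return the constant LP_DEFENSIVE_WEIGHT, the closure only needs to carry a DispatchResult, and the Ok::<...> turbofish disappears (the processing call is a hypothetical stand-in):

    use frame_support::storage::{with_transaction, TransactionOutcome};

    let res = with_transaction(|| {
        // Hypothetical call standing in for the real processing logic.
        let res = Self::process_inbound_message(message);

        if res.is_ok() {
            TransactionOutcome::Commit(res)
        } else {
            TransactionOutcome::Rollback(res)
        }
    });

    (res, LP_DEFENSIVE_WEIGHT)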

pallets/liquidity-pools-gateway/src/message_processing.rs (outdated; resolved)
@@ -333,7 +334,11 @@ impl<T: Config> Pallet<T> {
            // we can return.
            None => return Ok(()),
            Some(stored_inbound_entry) => match stored_inbound_entry {
-               InboundEntry::Message(message_entry) => message = Some(message_entry.message),
+               InboundEntry::Message(message_entry)
+                   if message_entry.session_id == session_id =>
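
To make the guard's effect concrete, a sketch of the match shape it implies (the catch-all arm is an assumption; per the discussion below, old-session entries stay in storage and are not executed):

    match stored_inbound_entry {
        // Only entries created under the current session are executed.
        InboundEntry::Message(message_entry)
            if message_entry.session_id == session_id =>
        {
            message = Some(message_entry.message)
        }
        // Assumed fall-through: entries from an older session are kept
        // but skipped, so they cannot execute until recovered.
        _ => {}
    }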
Contributor:

Question: If session_id is different, should we remove the entry?

Contributor:

> Question: If session_id is different, should we remove the entry?

We should not, because then it would be impossible to replay the message and funds would be stuck on the EVM side. Entries kept under an old session id can be unstuck via execute_message_recovery.

@cdamian (Contributor, Author):

execute_message_recovery will only increase the proof count for a specific router ID; we would still hit this logic and be unable to execute a message from an older session. Maybe we should extend execute_message_recovery to either (see the sketch after this list):

  • set the session of a message entry to the current one; or
  • increase the proof count (the current behavior).
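
A hypothetical sketch of that extension (the enum, field names, and dispatch below are illustrative only, not the pallet's actual API):

    /// Sketch: let the caller of execute_message_recovery choose the
    /// recovery behavior.
    pub enum RecoveryAction {
        /// Option 1: re-home the stored inbound entry into the current
        /// session so it can be executed again.
        SetCurrentSession,
        /// Option 2 (current behavior): bump the proof count for the
        /// given router ID.
        IncreaseProofCount,
    }

    // Inside execute_message_recovery, dispatch on the chosen action:
    match action {
        RecoveryAction::SetCurrentSession => entry.session_id = current_session_id,
        RecoveryAction::IncreaseProofCount => entry.proof_count += 1,
    }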

@cdamian (Contributor, Author):

Bumping this again since I think we might need to add the 1st point that I mentioned above.

cc @hieronx

pallets/liquidity-pools-gateway/src/lib.rs (two review threads; outdated, resolved)

codecov bot commented Sep 20, 2024

Codecov Report

Attention: Patch coverage is 76.66667% with 14 lines in your changes missing coverage. Please review.

Project coverage is 48.41%. Comparing base (927af06) to head (731577f).
Report is 1 commit behind head on main.

Files with missing lines                                   Patch %   Lines
.../liquidity-pools-gateway/src/message_processing.rs      70.00%    6 Missing ⚠️
pallets/liquidity-pools-gateway-queue/src/lib.rs           82.14%    5 Missing ⚠️
pallets/liquidity-pools-gateway/src/lib.rs                 75.00%    3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1991      +/-   ##
==========================================
+ Coverage   48.28%   48.41%   +0.13%     
==========================================
  Files         183      183              
  Lines       13406    13412       +6     
==========================================
+ Hits         6473     6494      +21     
+ Misses       6933     6918      -15     


* lp-gateway-queue: Ensure messages are processed in order

* lp-gateway-queue: Ensure processed messages are skipped

* integration-tests: Remove receipt check

@wischli previously approved these changes Oct 8, 2024
@wischli (Contributor) left a comment:

LGTM, thanks for tackling all critical queue-related findings! Non-blocking nit and question below.

pallets/liquidity-pools-gateway-queue/src/lib.rs (outdated; resolved)
Comment on lines +627 to +628
// The #[transactional] macro only works for functions that return a
// `DispatchResult` therefore, we need to manually add this here.
@wischli (Contributor):

Q: Any reason for going with the overhead of with_transaction and branching over the result instead of just using #[transactional] and returning Ok(())?

@cdamian (Contributor, Author):

Not sure what you mean here - do you mean changing the return of process to DispatchResult?

@cdamian (Contributor, Author):

We can't use the #[transactional] macro in this function; that is not supported. We can either change the signature as I suggested in my previous message, or go with the current approach.
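
For illustration, a sketch of the constraint (simplified free-standing signatures; process stands in for the gateway's processor method):

    use frame_support::dispatch::DispatchResult;
    use frame_support::weights::Weight;

    // Fine: #[transactional] can wrap a function returning a Result.
    #[frame_support::transactional]
    fn process_as_result() -> DispatchResult {
        Ok(())
    }

    // Not supported: the processor returns (DispatchResult, Weight), which is
    // not a Result, so the macro cannot wrap it; hence the manual
    // with_transaction + TransactionOutcome branching in this PR.
    fn process() -> (DispatchResult, Weight) {
        todo!() // body elided
    }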

@mustermeiszer previously approved these changes Oct 9, 2024
@mustermeiszer (Collaborator) left a comment:

Thanks for tackling this!!

@wischli (Contributor) left a comment:

Re-approval

@cdamian merged commit 25e7ea5 into main on Oct 9, 2024 (13 of 14 checks passed).
@cdamian deleted the fix/gateway-message-processing branch October 9, 2024 13:20.