Fix/gateway message processing #1991

Merged
merged 14 commits into main on Oct 9, 2024
Conversation

@cdamian (Contributor) commented Sep 19, 2024:

Description

Various fixes for LP gateway-related logic.

Changes

LP Gateway

  • Make MessageProcessor implementation transactional to ensure that storage changes are reverted on failure.
  • Move the iteration over inbound sub-messages into the execution logic.
  • Add session ID check for inbound message entries when checking for votes/proofs.
  • Use defensive weight when executing message recovery.
  • Add more tests that cover session ID changes and storage rollback.

LP Gateway Queue

  • Remove extra weight check.
  • Get MessageQueue keys, order them, and iterate over them when servicing the MessageQueue.

Checklist:

  • I have added Rust doc comments to structs, enums, traits and functions
  • I have made corresponding changes to the documentation
  • I have performed a self-review of my code
  • I have added tests that prove my fix is effective or that my feature works

@@ -202,9 +202,13 @@ pub mod pallet {
    fn service_message_queue(max_weight: Weight) -> Weight {
        let mut weight_used = Weight::zero();

        let mut processed_entries = Vec::new();
        let mut nonces = MessageQueue::<T>::iter_keys().collect::<Vec<_>>();
        nonces.sort();
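
For context, a hedged sketch of how the sorted nonces might then be consumed; process_message and the exact weight accounting below are stand-ins, not the pallet's actual code:

    for nonce in nonces {
        if weight_used.any_gt(max_weight) {
            // Out of block weight; leave the rest for a later block.
            break;
        }
        if let Some(message) = MessageQueue::<T>::get(nonce) {
            // Hypothetical helper standing in for the real processing path.
            let weight = Self::process_message(nonce, message);
            weight_used = weight_used.saturating_add(weight);
            processed_entries.push(nonce);
        }
    }
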
@cdamian (Contributor, Author):

Since we are collecting the keys here, it also made sense to me to sort them, just in case. Please let me know if there are any objections to this.

@lemunozm (Contributor):

Although this new solution is simpler, I think there is a problem here:

MessageQueue::<T>::iter_keys().collect::<Vec<_>>();

It can collect many keys, making the block impossible to build.

I think we need a more complex structure here that allows us to store the keys already organized.

@cdamian (Contributor, Author):

> It can collect many keys, making the block impossible to build.

Can you elaborate on this please?

@cdamian (Contributor, Author):

How about limiting the number of keys that we collect via:

let mut nonces = MessageQueue::<T>::iter_keys().take(MAX_MESSAGES_PER_BLOCK).collect::<Vec<_>>();

Contributor:

> Can you elaborate on this please?

When you collect, the iterator makes one read per item, and there could be enough items to exceed the block's weight capacity.

The take(MAX_MESSAGES_PER_BLOCK) still does not work, because a message that ideally should be processed first could be left in the queue: iter_keys yields keys in hashed-storage order, so taking the first N keys before sorting may skip the lowest nonces.

Contributor:

Not sure if some ordered structure exists in Substrate for this. If not, we would have to create some complex/annoying structure to organize the way the messages are stored.

But I'm not able to see a super simple way, TBH. We can leave that fix for another PR, to unblock this one, if we see it's not easy.

@cdamian (Contributor, Author):

Maybe there's a simpler solution that involves using the latest message nonce. I'll try something on a different branch.

@cdamian (Contributor, Author):

Something similar to - #1992
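
For illustration, one possible shape of that latest-nonce idea (entirely hypothetical; the storage item names are invented here, and #1992 may do this differently):

    // Walk the dense nonce range instead of collecting every key: nonces are
    // assigned sequentially, so (last_processed, latest] covers the queue in
    // order without reading all keys up front.
    let last_processed = LastProcessedNonce::<T>::get();
    let latest = MessageNonceStore::<T>::get();

    for nonce in (last_processed + 1)..=latest {
        if weight_used.any_gt(max_weight) {
            break;
        }
        // Entries may already have been removed; skip gaps.
        if let Some(message) = MessageQueue::<T>::get(nonce) {
            // ... process, then record this nonce as the last processed ...
        }
    }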

@cdamian marked this pull request as ready for review September 19, 2024 11:31
@cdamian self-assigned this Sep 19, 2024
@lemunozm (Contributor) left a comment:

Thanks for all the fixes!

BTW, not sure if we should move this to the internal repo, to avoid forking both main branches too much.


Comment on lines 668 to 676
if res.is_ok() {
    TransactionOutcome::Commit(Ok::<(DispatchResult, Weight), DispatchError>((
        res, weight,
    )))
} else {
    TransactionOutcome::Rollback(Ok::<(DispatchResult, Weight), DispatchError>((
        res, weight,
    )))
}
Contributor:

Super nit: I think adding the return type in the closure allows removing the Ok::<...> here:

with_transaction(|| -> DispatchResult {

@cdamian (Contributor, Author):

That won't work given:

pub fn with_transaction<T, E, F>(f: F) -> Result<T, E>
where
	E: From<DispatchError>,
	F: FnOnce() -> TransactionOutcome<Result<T, E>>, // this
{

@cdamian (Contributor, Author):

But given that we will always return LP_DEFENSIVE_WEIGHT for both inbound and outbound, we can change this a bit. I'll ping you when done.
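
For illustration, a sketch of that simplification: once both paths always return the constant LP_DEFENSIVE_WEIGHT, the closure only needs to carry a DispatchResult, and the Ok::<...> turbofish disappears (the processing call is a hypothetical stand-in):

    use frame_support::storage::{with_transaction, TransactionOutcome};

    let res = with_transaction(|| {
        // Hypothetical call standing in for the real processing logic.
        let res = Self::process_inbound_message(message);

        if res.is_ok() {
            TransactionOutcome::Commit(res)
        } else {
            TransactionOutcome::Rollback(res)
        }
    });

    (res, LP_DEFENSIVE_WEIGHT)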

pallets/liquidity-pools-gateway/src/message_processing.rs (outdated; resolved)
@@ -333,7 +334,11 @@ impl<T: Config> Pallet<T> {
            // we can return.
            None => return Ok(()),
            Some(stored_inbound_entry) => match stored_inbound_entry {
-               InboundEntry::Message(message_entry) => message = Some(message_entry.message),
+               InboundEntry::Message(message_entry)
+                   if message_entry.session_id == session_id =>
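
To make the guard's effect concrete, a sketch of the match shape it implies (the catch-all arm is an assumption; per the discussion below, old-session entries stay in storage and are not executed):

    match stored_inbound_entry {
        // Only entries created under the current session are executed.
        InboundEntry::Message(message_entry)
            if message_entry.session_id == session_id =>
        {
            message = Some(message_entry.message)
        }
        // Assumed fall-through: entries from an older session are kept
        // but skipped, so they cannot execute until recovered.
        _ => {}
    }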
Contributor:

Question: If session_id is different, should we remove the entry?

Contributor:

> Question: If session_id is different, should we remove the entry?

We should not, because then it would be impossible to replay the message and funds would be stuck on the EVM side. Entries kept under an old session id can be unstuck via execute_message_recovery.

@cdamian (Contributor, Author):

execute_message_recovery will only increase the proof count for a specific router ID; we would still hit this logic and be unable to execute a message from an older session. Maybe we should extend execute_message_recovery to either (see the sketch after this list):

  • set the session of a message entry to the current one; or
  • increase the proof count (the current behavior).
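
A hypothetical sketch of that extension (the enum, field names, and dispatch below are illustrative only, not the pallet's actual API):

    /// Sketch: let the caller of execute_message_recovery choose the
    /// recovery behavior.
    pub enum RecoveryAction {
        /// Option 1: re-home the stored inbound entry into the current
        /// session so it can be executed again.
        SetCurrentSession,
        /// Option 2 (current behavior): bump the proof count for the
        /// given router ID.
        IncreaseProofCount,
    }

    // Inside execute_message_recovery, dispatch on the chosen action:
    match action {
        RecoveryAction::SetCurrentSession => entry.session_id = current_session_id,
        RecoveryAction::IncreaseProofCount => entry.proof_count += 1,
    }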

@cdamian (Contributor, Author):

Bumping this again since I think we might need to add the 1st point that I mentioned above.

cc @hieronx

pallets/liquidity-pools-gateway/src/lib.rs (two review threads; outdated, resolved)

codecov bot commented Sep 20, 2024

Codecov Report

Attention: Patch coverage is 76.66667% with 14 lines in your changes missing coverage. Please review.

Project coverage is 48.41%. Comparing base (927af06) to head (731577f).
Report is 1 commit behind head on main.

Files with missing lines                                   Patch %   Lines
.../liquidity-pools-gateway/src/message_processing.rs      70.00%    6 Missing ⚠️
pallets/liquidity-pools-gateway-queue/src/lib.rs           82.14%    5 Missing ⚠️
pallets/liquidity-pools-gateway/src/lib.rs                 75.00%    3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1991      +/-   ##
==========================================
+ Coverage   48.28%   48.41%   +0.13%     
==========================================
  Files         183      183              
  Lines       13406    13412       +6     
==========================================
+ Hits         6473     6494      +21     
+ Misses       6933     6918      -15     


* lp-gateway-queue: Ensure messages are processed in order

* lp-gateway-queue: Ensure processed messages are skipped

* integration-tests: Remove receipt check

@wischli previously approved these changes Oct 8, 2024
@wischli (Contributor) left a comment:

LGTM, thanks for tackling all critical queue-related findings! Non-blocking nit and question below.

pallets/liquidity-pools-gateway-queue/src/lib.rs (outdated; resolved)
Comment on lines +627 to +628
// The #[transactional] macro only works for functions that return a
// `DispatchResult` therefore, we need to manually add this here.
@wischli (Contributor):

Q: Any reason for going with the overhead of with_transaction and branching over the result instead of just using #[transactional] and returning Ok(())?

@cdamian (Contributor, Author):

Not sure what you mean here - do you mean changing the return of process to DispatchResult?

@cdamian (Contributor, Author):

We can't use the #[transactional] macro in this function; that is not supported. We can either change the signature as I suggested in my previous message, or go with the current approach.
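
For illustration, a sketch of the constraint (simplified free-standing signatures; process stands in for the gateway's processor method):

    use frame_support::dispatch::DispatchResult;
    use frame_support::weights::Weight;

    // Fine: #[transactional] can wrap a function returning a Result.
    #[frame_support::transactional]
    fn process_as_result() -> DispatchResult {
        Ok(())
    }

    // Not supported: the processor returns (DispatchResult, Weight), which is
    // not a Result, so the macro cannot wrap it; hence the manual
    // with_transaction + TransactionOutcome branching in this PR.
    fn process() -> (DispatchResult, Weight) {
        todo!() // body elided
    }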

@mustermeiszer previously approved these changes Oct 9, 2024
@mustermeiszer (Collaborator) left a comment:

Thanks for tackling this!!

@wischli (Contributor) left a comment:

Re-approval

@cdamian merged commit 25e7ea5 into main on Oct 9, 2024 (13 of 14 checks passed).
@cdamian deleted the fix/gateway-message-processing branch October 9, 2024 13:20.