Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[commonware-storage/mmr/mem] API extensions to support generic proof generation + forgetting tweaks #456

Merged
merged 2 commits into from
Feb 7, 2025

Conversation

roberto-bayardo
Copy link
Collaborator

@roberto-bayardo roberto-bayardo commented Feb 5, 2025

  • add ability to initialize an mmr from a vector of nodes
  • add ability to request individual nodes
  • made proof generation dependent only on a simple Mmr trait instead of a specific implementation
  • clarified some of the pruning documentation
  • fixed edge case bug in ensuring position validity for proof generation after pruning
  • removed ability to forget to arbitrary point in history since it doesn't properly detect how proving is affected by it. will re-work forgetting in a subsequent PR (tracking issue: [commonware-storage/mmr] Improve forgetting behavior in mem::Mmr #459)

@@ -338,6 +353,12 @@ mod tests {
"forget_max should forget to right-most leaf of leftmost peak"
);
assert_eq!(mmr.oldest_remembered_node_pos(), 11);

// After forgetting we shouldn't be able to prove elements at or before the oldest remaining.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding tests for this 🙏

@roberto-bayardo roberto-bayardo force-pushed the mmr-tweaks branch 6 times, most recently from 5414a21 to d01638e Compare February 6, 2025 19:38
@roberto-bayardo roberto-bayardo changed the title [commonware-storage/mmr/mem] minor API extensions + edge case bug fix [commonware-storage/mmr/mem] API extensions to support generic proof generation + forgetting tweaks Feb 6, 2025
@@ -1,5 +1,5 @@
use commonware_cryptography::{Hasher, Sha256};
use commonware_storage::mmr::mem::Mmr;
use commonware_storage::mmr::{mem::Mmr, verification::VerifiableMmr};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a non-verifiable MMR in verification?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I would call this verification::Storage

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed "Storage"

@@ -1,5 +1,5 @@
use commonware_cryptography::{Hasher, Sha256};
use commonware_storage::mmr::mem::Mmr;
use commonware_storage::mmr::{mem::Mmr, verification::VerifiableMmr};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I wonder if there is a clean way to put the in-memory MMR at mmr::Memory (mmr::mem::Mmr seems suboptimal)

pub fn oldest_required_element(&self) -> u64 {
match self.peak_iterator().next() {
None => {
// Degenerate case, only happens when MMR is empty.
0
}
Some((pos, height)) => {
// Rightmost leaf position in a tree is the position of its peak minus its height.
pos - height as u64
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this a bug?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not a bug technically in that nothing was broken, but it was just being more conservative than necessary.

- add ability to request individual nodes
- clarified some of the pruning documentation
- fixed edge case bug in ensuring position validity for proof generation after pruning
@roberto-bayardo roberto-bayardo force-pushed the mmr-tweaks branch 7 times, most recently from de81ef2 to 706e6de Compare February 7, 2025 01:07
@@ -1,10 +1,10 @@
//! A basic MMR where all nodes are stored in-memory.

use crate::mmr::hasher::Hasher;
use crate::mmr::iterator::{nodes_needing_parents, PathIterator, PeakIterator};
use crate::mmr::verification::Proof;
use crate::mmr::iterator::{nodes_needing_parents, PeakIterator};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you can do crate::mmr::{} to avoid these duplicate prefix imports

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

///
/// Assumes the caller has already verified that the MMR still stores all the nodes required to
/// generate a proof the given range.
pub fn range_proof<P: Provable<H>>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool to see this broken out!

@roberto-bayardo roberto-bayardo force-pushed the mmr-tweaks branch 2 times, most recently from efbf81d to f7e38ca Compare February 7, 2025 01:27
hash = hasher.node_hash(parent_pos, &self.nodes[sibling_index], &hash);
let parent_pos = self.index_to_pos(self.nodes.len());
let sibling_hash = &self.nodes[self.pos_to_index(sibling_pos)];
hash = Hasher::new(&mut self.hasher).node_hash(parent_pos, sibling_hash, &hash);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid creating a new Hasher on each iteration, you can do:

diff --git a/storage/src/mmr/mem.rs b/storage/src/mmr/mem.rs
index 78e7c494..245b389d 100644
--- a/storage/src/mmr/mem.rs
+++ b/storage/src/mmr/mem.rs
@@ -38,7 +38,8 @@ impl<H: CHasher> Storage<H> for Mmr<H> {
     }
 
     fn get_node(&self, position: u64) -> Option<&H::Digest> {
-        self.nodes.get(self.pos_to_index(position))
+        self.nodes
+            .get(Self::pos_to_index(self.oldest_remembered_pos, position))
     }
 }
 
@@ -77,29 +78,31 @@ impl<H: CHasher> Mmr<H> {
     }
 
     /// Return the position of the element given its index in the current nodes vector.
-    fn index_to_pos(&self, index: usize) -> u64 {
-        index as u64 + self.oldest_remembered_pos
+    fn index_to_pos(oldest_remembered_pos: u64, index: usize) -> u64 {
+        oldest_remembered_pos + index as u64
     }
 
     /// Return the index of the element in the current nodes vector given its position in the MMR.
-    fn pos_to_index(&self, pos: u64) -> usize {
-        (pos - self.oldest_remembered_pos) as usize
+    fn pos_to_index(oldest_remembered_pos: u64, pos: u64) -> usize {
+        (pos - oldest_remembered_pos) as usize
     }
 
     /// Add an element to the MMR and return its position in the MMR.
     pub fn add(&mut self, element: &H::Digest) -> u64 {
         let peaks = nodes_needing_parents(self.peak_iterator());
-        let element_pos = self.index_to_pos(self.nodes.len());
+        let element_pos = Self::index_to_pos(self.oldest_remembered_pos, self.nodes.len());
 
         // Insert the element into the MMR as a leaf.
-        let mut hash = Hasher::new(&mut self.hasher).leaf_hash(element_pos, element);
+        let mut hasher = Hasher::new(&mut self.hasher);
+        let mut hash = hasher.leaf_hash(element_pos, element);
         self.nodes.push(hash.clone());
 
         // Compute the new parent nodes if any, and insert them into the MMR.
         for sibling_pos in peaks.into_iter().rev() {
-            let parent_pos = self.index_to_pos(self.nodes.len());
-            let sibling_hash = &self.nodes[self.pos_to_index(sibling_pos)];
-            hash = Hasher::new(&mut self.hasher).node_hash(parent_pos, sibling_hash, &hash);
+            let parent_pos = Self::index_to_pos(self.oldest_remembered_pos, self.nodes.len());
+            let sibling_hash =
+                &self.nodes[Self::pos_to_index(self.oldest_remembered_pos, sibling_pos)];
+            hash = hasher.node_hash(parent_pos, sibling_hash, &hash);
             self.nodes.push(hash.clone());
         }
         element_pos
@@ -158,7 +161,7 @@ impl<H: CHasher> Mmr<H> {
     }
 
     fn forget_to_pos(&mut self, pos: u64) {
-        let nodes_to_remove = self.pos_to_index(pos);
+        let nodes_to_remove = Self::pos_to_index(self.oldest_remembered_pos, pos);
         self.oldest_remembered_pos = pos;
         self.nodes = self.nodes[nodes_to_remove..self.nodes.len()].to_vec();
     }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I "love" this per se (and think creating a new hasher at the start of any call is reasonable) but if you don't want to do that, this should be more efficient.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I think this dilutes the value of the "pos_to_index" abstraction. I suspect the compiler will optimize away the Hasher::new overhead so I think I prefer leaving it as is.

@roberto-bayardo roberto-bayardo force-pushed the mmr-tweaks branch 2 times, most recently from 825d5e1 to f716691 Compare February 7, 2025 18:17
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need to remove this?

@patrick-ogrady patrick-ogrady merged commit a6b5d40 into main Feb 7, 2025
7 checks passed
@patrick-ogrady patrick-ogrady deleted the mmr-tweaks branch February 7, 2025 18:59
Copy link

codecov bot commented Feb 7, 2025

Codecov Report

Attention: Patch coverage is 99.30556% with 1 line in your changes missing coverage. Please review.

Project coverage is 94.28%. Comparing base (3a65f7e) to head (e572183).
Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
storage/src/mmr/mem.rs 98.59% 1 Missing ⚠️
@@            Coverage Diff             @@
##             main     #456      +/-   ##
==========================================
+ Coverage   94.17%   94.28%   +0.10%     
==========================================
  Files          94       95       +1     
  Lines       27403    27750     +347     
==========================================
+ Hits        25808    26163     +355     
+ Misses       1595     1587       -8     
Files with missing lines Coverage Δ
storage/src/mmr/verification.rs 94.78% <100.00%> (+0.74%) ⬆️
storage/src/mmr/mem.rs 94.58% <98.59%> (+0.13%) ⬆️

... and 29 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3a65f7e...e572183. Read the comment docs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants