
[consensus/simplex] Strengthen twins framework#3281

Open
clabby wants to merge 5 commits into main from cl/twins
@clabby clabby commented Mar 2, 2026

Overview

Upgrades the simplex twins harness to follow the campaign model from Twins: BFT Systems Made Robust (https://arxiv.org/abs/2004.10617, especially Section 5.2), replacing the previous hand-picked strategy tests with a deterministic scenario generator and executor that systematically explores adversarial schedules.

At generation time, the framework now builds partition scenarios first, then leader-plus-partition round scenarios, then multi-round arrangements, and finally executes the Cartesian product of arrangements and compromised-node assignments. This preserves deterministic reproducibility while materially increasing adversarial coverage over the old fixed/alternating/random strategy families.
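The last stage of that pipeline, pairing every arrangement with every compromised-node assignment, can be sketched as follows. This is a minimal illustration, not the PR's code: `Case`, `campaign_cases`, and `max_cases` are hypothetical names, and arrangements are represented only by their index.

```rust
/// One campaign case: which arrangement to run and which nodes are twinned.
/// Hypothetical stand-in for the harness's own case type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Case {
    pub arrangement: usize,
    pub compromised: Vec<usize>,
}

/// Pairs every arrangement index with every compromised set in a fixed
/// order, stopping once `max_cases` cases have been produced.
pub fn campaign_cases(
    arrangements: usize,
    compromised_sets: &[Vec<usize>],
    max_cases: usize,
) -> Vec<Case> {
    (0..arrangements)
        .flat_map(|a| {
            compromised_sets.iter().map(move |c| Case {
                arrangement: a,
                compromised: c.clone(),
            })
        })
        .take(max_cases)
        .collect()
}
```

Because the iteration order is fixed, the same inputs always yield the same case list, which is what makes a failing case replayable by index.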

For partition generation, the harness now supports configurable partition cardinality via `max_partitions` (bounded by participant count) instead of hard-coding only canonical two-way splits. Round scenarios are derived from these partitions and combine a designated leader with explicit twin routing masks, allowing each round to model richer communication asymmetries. Routing now explicitly returns `SplitTarget::None` when a sender is outside both selected recipient groups, making sparse partition scenarios representable without panicking.
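A rough sketch of both ideas, under stated assumptions: partitions are enumerated as label vectors in first-use order (restricted growth strings), and routing is a four-way match. The functions `partitions` and `route` are hypothetical, and the `SplitTarget` enum below is a local stand-in; only the `None` variant is named in this PR, the other variant names are assumed.

```rust
/// Enumerates set partitions of `n` participants into between 2 and
/// `max_partitions` non-empty groups. `labels[i]` is the group of
/// participant `i`; labels appear in first-use order, so no partition
/// is emitted twice under a different labeling.
pub fn partitions(n: usize, max_partitions: usize) -> Vec<Vec<usize>> {
    fn rec(i: usize, used: usize, max: usize, cur: &mut Vec<usize>, out: &mut Vec<Vec<usize>>) {
        if i == cur.len() {
            if used >= 2 {
                out.push(cur.clone());
            }
            return;
        }
        // Participant `i` joins an existing group or opens one new group,
        // as long as the group count stays within `max`.
        for label in 0..=used.min(max - 1) {
            cur[i] = label;
            rec(i + 1, used.max(label + 1), max, cur, out);
        }
    }
    let mut out = Vec::new();
    if n == 0 || max_partitions == 0 {
        return out;
    }
    let mut cur = vec![0usize; n];
    // Participant 0 is pinned to group 0, so recursion starts at index 1.
    rec(1, 1, max_partitions.min(n), &mut cur, &mut out);
    out
}

/// Local stand-in for the routing result (variant names other than
/// `None` are assumptions).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SplitTarget {
    None,
    Primary,
    Secondary,
    Both,
}

/// Decides which twin receives traffic from `sender`. A sender in neither
/// recipient group maps to `SplitTarget::None` instead of panicking.
pub fn route(sender: usize, primary: &[usize], secondary: &[usize]) -> SplitTarget {
    match (primary.contains(&sender), secondary.contains(&sender)) {
        (true, true) => SplitTarget::Both,
        (true, false) => SplitTarget::Primary,
        (false, true) => SplitTarget::Secondary,
        (false, false) => SplitTarget::None,
    }
}
```

With four participants, `partitions(4, 2)` yields the 7 two-way splits and `partitions(4, 3)` yields 13 partitions, since the three-way splits are added on top.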

For arrangement generation (Step 3 in the paper), the harness now enumerates the full round-wise arrangement space when feasible and applies deterministic pruning by sampling unique arrangements when bounded by `max_scenarios`. This removes the previous lexicographic-prefix bias where early rounds could remain effectively fixed under small caps, and ensures pruned campaigns still vary across round positions.
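One way to get this behavior, sketched under assumptions: enumerate every arrangement when the space fits under the cap, otherwise draw each round's scenario index from a seeded generator and keep drawing until enough unique arrangements exist. The function names, the `seed` parameter, and the use of SplitMix64 are illustrative choices, not the PR's implementation; the modulo draw has a small bias that is ignored here.

```rust
use std::collections::BTreeSet;

/// SplitMix64 step: a small deterministic PRNG used as a stand-in for
/// whatever seeded RNG the harness actually uses.
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Decodes `rank` as a base-`choices` number with `rounds` digits;
/// digit `r` is the round-scenario index used in round `r`.
fn decode(mut rank: u64, choices: u64, rounds: usize) -> Vec<usize> {
    let mut arrangement = vec![0usize; rounds];
    for slot in arrangement.iter_mut() {
        *slot = (rank % choices) as usize;
        rank /= choices;
    }
    arrangement
}

/// Returns every arrangement when `choices^rounds <= max_scenarios`,
/// otherwise `max_scenarios` unique arrangements sampled from `seed`.
pub fn arrangements(
    choices: u64,
    rounds: usize,
    max_scenarios: usize,
    seed: u64,
) -> Vec<Vec<usize>> {
    if choices == 0 || rounds == 0 || max_scenarios == 0 {
        return Vec::new();
    }
    // `None` means the space overflows u64 and is certainly above the cap.
    let total = if rounds > u32::MAX as usize {
        None
    } else {
        choices.checked_pow(rounds as u32)
    };
    if let Some(t) = total {
        if t <= max_scenarios as u64 {
            return (0..t).map(|rank| decode(rank, choices, rounds)).collect();
        }
    }
    let mut state = seed;
    let mut picked = BTreeSet::new();
    // The space holds more than `max_scenarios` arrangements, so this ends.
    while picked.len() < max_scenarios {
        let candidate: Vec<usize> = (0..rounds)
            .map(|_| (splitmix64(&mut state) % choices) as usize)
            .collect();
        picked.insert(candidate);
    }
    picked.into_iter().collect()
}
```

Sampling every round position independently is what avoids the prefix bias: truncating a lexicographic enumeration would instead leave the leading rounds nearly constant.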

Compromised-node selection remains deterministic and now composes with the richer scenario space: campaigns execute all generated scenario/compromised-set combinations (subject to caps), and seeds are derived per case for reproducible replay.
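Per-case seed derivation can be as simple as mixing a campaign seed with the case index. The sketch below is an assumption about the shape of such a function, not the derivation this PR uses; `case_seed` is a hypothetical name and the mixer is the SplitMix64 finalizer.

```rust
/// Derives a reproducible seed for one case from the campaign seed and the
/// case index. Every step is a bijection on u64, so for a fixed
/// `base_seed` two different indices never collide.
pub fn case_seed(base_seed: u64, case_index: u64) -> u64 {
    // Spread the index with an odd multiplier, then combine with the base.
    let mut z = base_seed ^ case_index.wrapping_mul(0x9E37_79B9_7F4A_7C15);
    // SplitMix64 finalizer: xor-shifts and odd multiplications.
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}
```

A failing case can then be replayed from just the campaign seed and its index, without rerunning the cases before it.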

Execution semantics were also tightened. Twins use scenario-scripted leaders during adversarial rounds, then transition into a synchronous suffix with honest-only leader rotation to model post-attack liveness more faithfully. Resolver traffic remains delivered to both twins to avoid eliminating side-channel recovery paths entirely during adversarial scheduling.
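The two-phase leader schedule can be pictured with a small function. This is a sketch with assumed names and a zero-based view index; the real harness may number views differently and elect leaders through other machinery.

```rust
/// Picks the leader for a zero-based `view` index.
///
/// * While `view` falls inside `scripted`, the scenario dictates the leader
///   (possibly a twin).
/// * Afterwards, leadership rotates round-robin over `honest` only, which
///   models the synchronous suffix.
///
/// Returns `None` if the suffix is reached and there are no honest nodes.
pub fn leader_for_view(view: usize, scripted: &[usize], honest: &[usize]) -> Option<usize> {
    if let Some(&leader) = scripted.get(view) {
        return Some(leader);
    }
    if honest.is_empty() {
        return None;
    }
    Some(honest[(view - scripted.len()) % honest.len()])
}
```

For example, with `scripted = [3, 3, 0]` and `honest = [1, 2]`, views 0 through 2 follow the script and views 3, 4, 5 are led by nodes 1, 2, 1.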

Validation checks were strengthened around honest replicas: liveness waits only on honest reporters, safety asserts no conflicting finalizations among honest replicas, invalid-signature counters must remain zero, and detected faults must attribute to twin identities. This makes campaign outcomes easier to interpret and aligns checks with adversarial accountability goals.
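The safety assertion reduces to checking that no two honest replicas finalized different payloads at the same view. A minimal sketch, assuming each honest replica reports a list of `(view, digest)` pairs; `first_conflict` and the log shape are hypothetical, not the harness's reporter API.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Scans the finalization logs of honest replicas and returns the first
/// view at which two different digests were finalized, if any.
pub fn first_conflict<D: Clone + PartialEq>(honest_logs: &[Vec<(u64, D)>]) -> Option<u64> {
    let mut seen: HashMap<u64, D> = HashMap::new();
    for log in honest_logs {
        for (view, digest) in log {
            match seen.entry(*view) {
                // A digest was already recorded for this view: it must match.
                Entry::Occupied(existing) => {
                    if existing.get() != digest {
                        return Some(*view);
                    }
                }
                // First finalization observed for this view.
                Entry::Vacant(slot) => {
                    slot.insert(digest.clone());
                }
            }
        }
    }
    None
}
```

Logs from twin identities are deliberately left out of the input, since twins are expected to equivocate.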


cloudflare-workers-and-pages bot commented Mar 2, 2026

Deploying with Cloudflare Workers

Name: commonware-mcp
Latest commit: 1b394ff
Status: ✅ Deployment successful!
Updated (UTC): Mar 06 2026, 07:04 PM


cloudflare-workers-and-pages bot commented Mar 2, 2026

Deploying monorepo with Cloudflare Pages

Latest commit: 1b394ff
Status: ✅  Deploy successful!
Preview URL: https://9bbb527c.monorepo-eu0.pages.dev
Branch Preview URL: https://cl-twins.monorepo-eu0.pages.dev


@clabby clabby marked this pull request as ready for review March 6, 2026 16:08
@clabby clabby added this to Tracker Mar 6, 2026
@clabby clabby moved this to Ready for Review in Tracker Mar 6, 2026
@clabby clabby self-assigned this Mar 6, 2026
clabby and others added 5 commits March 6, 2026 11:16
## Overview

The test suite now includes targeted regressions for deterministic case generation, partition-space expansion with higher partition counts, full-permutation coverage in unbounded mode, round-position diversity under bounded pruning, and explicit `SplitTarget::None` routing behavior for sparse partition selections.

Finally, campaign plumbing in simplex tests was updated to pass `max_partitions`, and large twins campaigns now use the same `required_containers` target (`View::new(100)`) as the core campaign tests, improving consistency while preserving deterministic execution.

Co-authored-by: Codex <codex@openai.com>

@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 2 potential issues.


let mut current = vec![0usize; n];
generate(1, n, 1, max_partitions, &mut current, &mut out);
out
}

Unbounded partition enumeration causes OOM for moderate n

Medium Severity

partition_scenarios enumerates all ways to assign n elements into up to max_partitions groups — this count is the sum of Stirling numbers S(n, 2..max_partitions), which grows exponentially with n. For example, S(30, 2) alone is ~537 million. round_scenarios and generate_scenarios call this unconditionally before any max_scenarios cap is applied, so the Framework parameters give a false sense of bounded memory usage. Even moderate n values (20–30) can OOM despite small max_scenarios.
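One possible guard, sketched here rather than taken from the PR, is to count the partition space before enumerating it and bail out or switch to sampling when the count is too large. The helpers `stirling2` and `partition_count` are hypothetical; they use the standard recurrence S(n, k) = k·S(n-1, k) + S(n-1, k-1) with overflow-checked arithmetic.

```rust
/// Stirling number of the second kind S(n, k): the number of ways to split
/// `n` labeled items into exactly `k` non-empty groups.
/// Returns `None` if the value overflows `u128`.
pub fn stirling2(n: usize, k: usize) -> Option<u128> {
    if k > n {
        return Some(0);
    }
    // `row[j]` holds S(i, j) for the current `i`; S(0, 0) = 1.
    let mut row = vec![0u128; k + 1];
    row[0] = 1;
    for _ in 1..=n {
        // Update right to left so `row[j - 1]` is still the previous row.
        for j in (1..=k).rev() {
            row[j] = (j as u128).checked_mul(row[j])?.checked_add(row[j - 1])?;
        }
        // S(i, 0) = 0 for every i >= 1.
        row[0] = 0;
    }
    Some(row[k])
}

/// Number of partitions of `n` items into between 2 and `max_partitions`
/// non-empty groups, or `None` on overflow.
pub fn partition_count(n: usize, max_partitions: usize) -> Option<u128> {
    let mut total = 0u128;
    for k in 2..=max_partitions.min(n) {
        total = total.checked_add(stirling2(n, k)?)?;
    }
    Some(total)
}
```

`stirling2(30, 2)` evaluates to 536,870,911 (that is, 2^29 - 1), matching the ~537 million figure above, so a check such as `partition_count(n, max_partitions)` against a budget could reject the configuration before any vector is allocated.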

Additional Locations (2)


.into_iter()
.map(|idx| combination_from_rank(n, faults, idx))
.collect()
}

Compromised sets eagerly materialize all combinations ignoring cap

Medium Severity

compromised_sets eagerly enumerates all C(n, faults) combinations via choose when total <= max_sets, ignoring max_sets as an effective memory bound. For configurations like n=30, faults=10, this materializes ~30 million vectors even if only a few are needed. The max_compromised_sets parameter in Framework creates a misleading safety guarantee since it only activates the sampling path when the combination count exceeds it, rather than always capping the output.
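To illustrate the alternative the report points at, the sketch below counts C(n, k) without materializing anything and produces combinations lazily, so a caller can stop after `max_sets` items. Both functions are hypothetical and independent of the PR's `combination_from_rank`; note that taking a prefix of this iterator yields the lexicographically first sets, which is a different selection from rank-based sampling.

```rust
/// Binomial coefficient C(n, k) with overflow-checked multiplication.
/// Returns `None` if an intermediate product overflows `u128`.
pub fn binomial(n: u64, k: u64) -> Option<u128> {
    if k > n {
        return Some(0);
    }
    let k = k.min(n - k);
    let mut acc: u128 = 1;
    for i in 0..k {
        // After this step `acc` equals C(n, i + 1); the division is exact.
        acc = acc.checked_mul((n - i) as u128)? / (i + 1) as u128;
    }
    Some(acc)
}

/// Lazily yields the k-element subsets of `0..n` in lexicographic order.
/// Only one combination is computed per call to `next`.
pub fn combinations(n: usize, k: usize) -> impl Iterator<Item = Vec<usize>> {
    let first = if k <= n {
        Some((0..k).collect::<Vec<usize>>())
    } else {
        None
    };
    std::iter::successors(first, move |cur| {
        let mut next = cur.clone();
        // Rightmost position that can still be incremented.
        let i = (0..k).rev().find(|&i| next[i] < n - k + i)?;
        next[i] += 1;
        // Reset everything to the right to the smallest valid values.
        for j in i + 1..k {
            next[j] = next[j - 1] + 1;
        }
        Some(next)
    })
}
```

`binomial(30, 10)` is 30,045,015, in line with the ~30 million figure above, while `combinations(30, 10).take(max_sets)` allocates only `max_sets` vectors.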



codecov bot commented Mar 6, 2026

Codecov Report

❌ Patch coverage is 93.86401% with 37 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.02%. Comparing base (f06069a) to head (1b394ff).
⚠️ Report is 2 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| consensus/src/simplex/mocks/twins.rs | 93.38% | 22 Missing and 6 partials ⚠️ |
| consensus/src/simplex/mod.rs | 95.00% | 8 Missing and 1 partial ⚠️ |

@@            Coverage Diff             @@
##             main    #3281      +/-   ##
==========================================
- Coverage   93.02%   93.02%   -0.01%     
==========================================
  Files         418      418              
  Lines      142894   143289     +395     
  Branches     3416     3432      +16     
==========================================
+ Hits       132930   133295     +365     
- Misses       8875     8900      +25     
- Partials     1089     1094       +5     

| Files with missing lines | Coverage Δ |
|---|---|
| consensus/src/simplex/mod.rs | 98.16% <95.00%> (-0.18%) ⬇️ |
| consensus/src/simplex/mocks/twins.rs | 93.45% <93.38%> (-3.77%) ⬇️ |

... and 2 files with indirect coverage changes


