fix: exclude first and last window from timing stats, not the first and last element in each window by mattyg · Pull Request #531 · holochain/wind-tunnel

mattyg · 2026-02-10T21:51:42Z

Summary

Previously in the standard timing stats, the first and last element within a window was dropped. Now the first and last window is dropped. This is because we assume they have data from a incomplete range within the window, so will add noise to metrics.

TODO:

All code changes are reflected in docs, including module-level docs
All new/edited/removed scenarios are reflected in summary visualiser tool (see checklist)
I ran the Nomad CI workflow successfully on my branch

Note that all commits in a PR must follow Conventional Commits before it can be merged, as these are used to generate the changelog

Summary by CodeRabbit

Refactor
- Simplified windowed mean computation, producing updated trend values and occasional trend-length changes in performance summaries.
Bug Fixes
- Previously empty trend series filled and many timing trend points renormalized or slightly adjusted.
Notes
- Most public interfaces unchanged, though several summary JSON metric groupings were reorganized, altering some exported metric shapes.

coderabbitai · 2026-02-10T21:51:58Z

Walkthrough

Small change in windowed mean aggregation in summariser/src/analyze.rs (compute mean directly). No public API changes. Many JSON test outputs under summariser/test_data/3_summary_outputs/ were updated — numeric trend values, some trend lengths, and in several files the holochain_p2p_metrics JSON shape was reorganized.

Changes

Cohort / File(s)	Summary
Core logic `summariser/src/analyze.rs`	Replaced prior slice/reverse/slice partial-window handling with a direct window mean: `col(column).mean().alias("mean")`. No exported signatures changed.
Numerical trend updates `summariser/test_data/3_summary_outputs/app_install-52bd66...ec.json`, `.../app_install-8468d6...55a.json`, `.../dht_sync_lag-3a1e33...e51.json`, `.../first_call-608026...30.json`, `.../local_signals-1fd6a6...a7060.json`, `.../mixed_arc_get_agent_activity-...1727.json`, `.../remote_call_rate-...cd74.json`, `.../remote_signals-...06a.json`, `.../single_write_many_read-...397.json`, `.../two_party_countersigning-...68c.json`, `.../validation_receipts-...70.json`, `.../write_get_agent_activity-...ee7.json`, `.../write_query-...c7233.json`, `.../write_read-...ef1c.json`, `.../write_validated-...2b7.json`, `.../write_validated_must_get_agent_activity-...792c.json`, `.../zome_call_single_value-...e99b.json`	Pointwise numeric changes across many `trend` arrays: small value adjustments, occasional single-value corrections, some trends extended by one element. Window metadata (e.g., `window_duration`, mean/std) generally preserved.
Files with JSON schema/structure changes `summariser/test_data/3_summary_outputs/full_arc_create_validated_zero_arc_read-...c83.json`, `.../mixed_arc_must_get_agent_activity-...911c.json`, `.../zero_arc_create_and_read-...88.json`, `.../zero_arc_create_data-...37d.json`, `.../zero_arc_create_data_validated-...9bc.json`, `.../mixed_arc_get_agent_activity-...1727.json`	Reorganization of `holochain_p2p_metrics` (introduced `request_roundtrip_duration`, `handle_incoming_request_duration`, reorganized/expanded `request_count` and `handle_incoming_request_count` fields). Multiple trend arrays expanded to multi-line formatting and several previously-empty trends populated. These changes alter the public JSON shape in affected files.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

ThetaSinner
jost-s
veeso

🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The pull request title clearly and specifically describes the main change: excluding the first and last window instead of the first and last element within windows.
Description check	✅ Passed	The description provides a clear summary of the behavior change and rationale, and includes all required TODO items from the template.
Merge Conflict Detection	✅ Passed	✅ No merge conflicts detected when merging into `main`

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch fix/exclude-trailing-windows

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

github-actions · 2026-02-10T21:52:25Z

The following will be added to the changelog

[0.6.1] - 2026-02-13

Features

Do not exclude first and last element from each window in standard_timing_stats

ThetaSinner

I agree that the existing code was a strange heuristic and should probably have been commented to explain it but there was reasoning behind it.

The first version of this code was more like this change. However, the reason I changed away from that was throwing away whole windows sets a lower bound on how long scenarios must run for. With a 10s window, you need to run for at least 40s to see a trend. Sure that only affects the summariser testing really. For a real run though, with a 60s window - discarding 2 minutes of a 15 minute run is quite significant.

What I was trying to achieve was improved accuracy without discarding too much data. Perhaps the best option here is to just let the first and last windows be less accurate? So removing the slice(1, u32::MAX) calls as you have here, but also not discarding whole windows?

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

summariser/src/analyze.rs (1)
69-95: ⚠️ Potential issue | 🟠 Major

Timing trend does not exclude first and last windows, contrary to PR objective.

The PR description states: "The change makes the code drop the first and last entire window instead." However, the new code computes the trend over all windows without any exclusion. Compare with standard_rate (lines 242–251), which explicitly slices off the first and last windows with slice(1, rate.height() - 2) and a comment explaining the rationale.

The test data supports this concern—in the zero_arc fixture, the first trend value (30.14) is a clear outlier compared to the rest (~40–43), which is consistent with an incomplete first window that should be dropped.
Suggested fix: apply the same first/last window exclusion as `standard_rate`
         .agg([col(column).mean().alias("mean")])
         .collect()
         .context("Windowed mean")?
-        .column("mean")?
-        .f64()?
-        .iter()
-        .filter_map(|v| v.map(|v| round_to_n_dp(v, 6)))
-        .collect_vec();
+        ;
+
+    // Slice to drop the first and last because they're likely to be partially filled windows.
+    let trend: Vec<f64> = if trend.height() <= 2 {
+        Vec::with_capacity(0)
+    } else {
+        trend
+            .column("mean")?
+            .slice(1, trend.height() - 2)
+            .f64()?
+            .iter()
+            .filter_map(|v| v.map(|v| round_to_n_dp(v, 6)))
+            .collect_vec()
+    };

🤖 Fix all issues with AI agents

In
`@summariser/test_data/3_summary_outputs/two_party_countersigning-3cdc5a29d42fbe93e971508e7cd8856367465e047c870b60283ed86a5bf5687c.json`:
- Around line 55-65: The timing trend fixtures in this JSON (accepted_timing and
initiated_timing arrays) are inconsistent with the updated timing trend logic in
analyze.rs that excludes the first and last windows; update the fixtures by
re-running the timing analysis to regenerate both accepted_timing and
initiated_timing trend arrays so they match the new behavior in analyze.rs, then
replace the old arrays in this file with the newly generated values.

In
`@summariser/test_data/3_summary_outputs/zero_arc_create_and_read-94d299900c09e3ae81800bfce0f497a9d12f98a7a4fd6fd4083793335bc18c88.json`:
- Around line 79-86: The JSON test fixture's "trend" array contains a low first
value (30.14) indicating a partial initial window; since analyze.rs was changed
to exclude first/last partial windows from timing trends, regenerate this
fixture so the "trend" array is computed after dropping partial windows (i.e.,
omit the first and/or last window entries as analyze.rs now does) and replace
the values in
zero_arc_create_and_read-94d299900c09e3ae81800bfce0f497a9d12f98a7a4fd6fd4083793335bc18c88.json
accordingly so the trend reflects only fully-filled windows.

...o_party_countersigning-3cdc5a29d42fbe93e971508e7cd8856367465e047c870b60283ed86a5bf5687c.json

coderabbitai · 2026-02-13T17:24:08Z

...ro_arc_create_and_read-94d299900c09e3ae81800bfce0f497a9d12f98a7a4fd6fd4083793335bc18c88.json

              "trend": [
-                43.874391,
-                43.586477,
-                40.590589,
-                40.197726,
-                41.167061
+                30.140288,
+                43.715731,
+                43.443675,
+                40.569253,
+                40.128692,
+                41.117841
              ],


⚠️ Potential issue | 🟠 Major

Test data includes suspected partial windows.

The first trend value (30.14) is significantly lower than the remaining values (~40–43), which is characteristic of a partially-filled first window. If the timing trend is updated to exclude first/last windows (as flagged in analyze.rs), this fixture will need to be regenerated.

🤖 Prompt for AI Agents

In `@summariser/test_data/3_summary_outputs/zero_arc_create_and_read-94d299900c09e3ae81800bfce0f497a9d12f98a7a4fd6fd4083793335bc18c88.json` around lines 79 - 86, The JSON test fixture's "trend" array contains a low first value (30.14) indicating a partial initial window; since analyze.rs was changed to exclude first/last partial windows from timing trends, regenerate this fixture so the "trend" array is computed after dropping partial windows (i.e., omit the first and/or last window entries as analyze.rs now does) and replace the values in zero_arc_create_and_read-94d299900c09e3ae81800bfce0f497a9d12f98a7a4fd6fd4083793335bc18c88.json accordingly so the trend reflects only fully-filled windows.

…ard_timing_stats

cocogitto-bot · 2026-02-13T17:24:44Z

✔️ 81a07aa - Conventional commits check succeeded.

mattyg · 2026-02-13T17:26:36Z

What I was trying to achieve was improved accuracy without discarding too much data. Perhaps the best option here is to just let the first and last windows be less accurate? So removing the slice(1, u32::MAX) calls as you have here, but also not discarding whole windows?

Updated to keep all results and not discard windows. We're also discarding the first and last windows in standard rate stats -- should that be changed as well? @ThetaSinner

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

summariser/src/analyze.rs (1)
88-95: ⚠️ Potential issue | 🟠 Major

Missing first/last window exclusion from the timing trend.

The PR objective states the behavior should change from dropping the first and last element within each window to dropping the first and last entire window. The old per-element slicing was removed (good), but no window-level exclusion was added to the trend.

Compare with standard_rate (lines 242–261), which explicitly slices off the first and last windows:
// standard_rate, lines 242-251
let trend: Vec<u32> = if rate.height() <= 2 {
    Vec::with_capacity(0)
} else {
    rate.column("count")?
        .slice(1, rate.height() - 2)
        ...
};
The timing trend should likely apply the same pattern. Without it, the trend still includes data from potentially incomplete first/last windows.
Suggested fix (mirroring the `standard_rate` approach)
         .agg([col(column).mean().alias("mean")])
         .collect()
         .context("Windowed mean")?
-        .column("mean")?
+        ;
+
+    let trend_frame = trend_frame.column("mean")?;
+
+    // Slice to drop the first and last because they're likely to be partially filled windows.
+    let trend_col = if trend_frame.len() <= 2 {
+        // Not enough windows to drop first and last
+        &trend_frame.slice(0, 0)
+    } else {
+        &trend_frame.slice(1, trend_frame.len() - 2)
+    };
+
+    let trend = trend_col
         .f64()?
         .iter()
         .filter_map(|v| v.map(|v| round_to_n_dp(v, 6)))
         .collect_vec();

🧹 Nitpick comments (3)

summariser/test_data/3_summary_outputs/zero_arc_create_data_validated-52a0ce7a16521345d301ed0750d079b8513ad48d3f49524ceb725ae0770cd9bc.json (1)

319-343: Unrelated holochain_p2p_metrics restructuring bundled with the timing-window fix.

The structural changes here (adding empty request_roundtrip_duration/handle_incoming_request_duration objects, new fields like remote_signal, publish_counter_sign, countersigning_session_negotiation) appear unrelated to the PR's stated goal of excluding first/last windows from timing stats. Bundling unrelated changes in the same PR makes it harder to bisect regressions. Consider whether these belong in a separate commit or PR, or at minimum document them in the PR description.

summariser/test_data/3_summary_outputs/mixed_arc_must_get_agent_activity-c965934f5262aa6a6442a7d3a23306c8497ee785d416d8269650be33db67911c.json (1)

378-401: holochain_p2p_metrics schema restructuring appears unrelated to the windowing fix.

The addition of request_roundtrip_duration, handle_incoming_request_duration, new fields in request_count (e.g., count_links, send_validation_receipts), and the new handle_incoming_request_count block look like a separate schema change. If this came from regenerating test fixtures against a newer version of the input data or dependencies, it's fine — but it might be worth noting in the PR description to avoid confusion for reviewers.

summariser/test_data/3_summary_outputs/mixed_arc_get_agent_activity-d874c654d75fdc83f5f7c0d7eb2d6be8cfcaf2cc0f5abbc281ad939397351727.json (1)

592-617: holochain_p2p_metrics schema restructuring appears unrelated to the PR's stated objective.

The PR title and description focus on excluding first/last windows from timing stats, but this section introduces structural changes to the holochain_p2p_metrics shape: new top-level keys (request_roundtrip_duration, handle_incoming_request_duration), new fields in handle_incoming_request_count (response, remote_signal, publish_counter_sign, countersigning_session_negotiation), and field reordering. If this is an intentional bundled change (e.g., from an upstream Holochain update), it may be worth noting in the PR description for reviewer clarity.

mattyg requested a review from a team February 10, 2026 21:52

jost-s previously approved these changes Feb 11, 2026

View reviewed changes

ThetaSinner requested changes Feb 13, 2026

View reviewed changes

mattyg dismissed jost-s’s stale review via c4dedac February 13, 2026 17:18

mattyg force-pushed the fix/exclude-trailing-windows branch from 3cfe187 to c4dedac Compare February 13, 2026 17:18

coderabbitai bot reviewed Feb 13, 2026

View reviewed changes

feat: do not exclude first and last element from each window in stand…

81a07aa

…ard_timing_stats

mattyg force-pushed the fix/exclude-trailing-windows branch from c4dedac to 81a07aa Compare February 13, 2026 17:24

mattyg requested a review from ThetaSinner February 13, 2026 17:26

coderabbitai bot reviewed Feb 13, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: exclude first and last window from timing stats, not the first and last element in each window#531

fix: exclude first and last window from timing stats, not the first and last element in each window#531
mattyg wants to merge 1 commit intomainfrom
fix/exclude-trailing-windows

mattyg commented Feb 10, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Feb 10, 2026 •

edited

Loading

Uh oh!

github-actions bot commented Feb 10, 2026 •

edited

Loading

Uh oh!

ThetaSinner left a comment

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

coderabbitai bot Feb 13, 2026

Uh oh!

cocogitto-bot bot commented Feb 13, 2026

Uh oh!

mattyg commented Feb 13, 2026

Uh oh!

coderabbitai bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

mattyg commented Feb 10, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

TODO:

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Feb 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Suggested reviewers

Uh oh!

github-actions bot commented Feb 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

[0.6.1] - 2026-02-13

Features

Uh oh!

ThetaSinner left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Feb 13, 2026

Choose a reason for hiding this comment

Uh oh!

cocogitto-bot bot commented Feb 13, 2026

Uh oh!

mattyg commented Feb 13, 2026

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

mattyg commented Feb 10, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Feb 10, 2026 •

edited

Loading

github-actions bot commented Feb 10, 2026 •

edited

Loading