Feat/dht sync with node volitility by mattyg · Pull Request #535 · holochain/wind-tunnel

mattyg · 2026-02-14T00:46:50Z

Summary

Resolves #443

This copies the write_get_agent_activity scenario, but adds conductor volatility to the get_agent_activity behavior.

TODO:

All code changes are reflected in docs, including module-level docs
All new/edited/removed scenarios are reflected in summary visualiser tool (see checklist)
I ran the Nomad CI workflow successfully on my branch

Note that all commits in a PR must follow Conventional Commits before it can be merged, as these are used to generate the changelog

Summary by CodeRabbit

New Features
- Added new write_get_agent_activity_volatile scenario for load testing with randomized conductor restart cycles
- Enhanced conductor lifecycle management with start, stop, and restart capabilities
- Added comprehensive metrics tracking including zome call timings, conductor restart counts, and activity sequence monitoring
Tests
- Added test case for new scenario with test data fixtures

coderabbitai · 2026-02-14T00:46:59Z

Walkthrough

This PR adds a new write_get_agent_activity_volatile scenario that tests agent activity behavior with conductor volatility. It includes refactored conductor lifecycle APIs (start/stop/restart) in the Holochain bindings, a new scenario crate with volatile behavior, metrics summarisation, and supporting configuration across Nomad workflows and test infrastructure.

Changes

Cohort / File(s)	Summary
Workflow & Configuration `.github/workflows/nomad-common.yaml`, `.github/workflows/test.yaml`, `Cargo.toml`	Added new Nomad job variant `write_get_agent_activity_volatile` to matrix, new smoke test step with 2 agents and volatile behavior, and registered scenario as workspace member.
Holochain Runner Lifecycle Refactoring `bindings/runner/src/common.rs`, `bindings/runner/src/context.rs`, `bindings/runner/src/holochain_runner.rs`	Major refactoring: introduced `create_and_start_holochain_conductor()`, `stop_holochain_conductor()`, and `start_holochain_conductor()` helpers; added `cell_role_name` field to `HolochainAgentContext`; restructured `HolochainRunner` to use explicit config object with `metrics_path` field and separated creation from execution via `create(config)` and `run(&mut self)` methods.
Nomad Scenario Configuration `nomad/job-variants/canonical/vars/write_get_agent_activity_volatile.json`, `nomad/job-variants/demo/vars/write_get_agent_activity_volatile.json`	Added scenario configuration files defining 2 write agents and 8 volatile get_agent_activity agents with 900-second duration for both canonical and demo variants.
New Scenario Implementation `scenarios/write_get_agent_activity_volatile/Cargo.toml`, `scenarios/write_get_agent_activity_volatile/README.md`, `scenarios/write_get_agent_activity_volatile/src/main.rs`	New scenario crate with `ScenarioValues` tracking state, `agent_behaviour_write` for entry creation, and `agent_behaviour_get_agent_activity_volatile` for volatile conductor on/off cycling with random timings (10–30s online, 2–10s offline).
Summariser Module & Metrics `summariser/src/lib.rs`, `summariser/src/scenario.rs`, `summariser/src/scenario/write_get_agent_activity_volatile.rs`	Added new scenario summariser pipeline collecting `highest_observed_action_seq`, zome call timings, startup/shutdown counts, error counts, and Holochain P2P metrics; integrated into scenario reporter dispatcher.
Test Data & Fixtures `summariser/test_data/1_run_summaries/...`, `summariser/test_data/2_query_results/...`, `summariser/test_data/3_summary_outputs/...`, `summariser/tests/snapshot.rs`	Added comprehensive test fixtures: run summary JSON, 30+ query result JSON files with network/CPU/memory/metrics data, summary output snapshot, and new snapshot test function.
Summary Visualiser Template `summary-visualiser/templates/scenarios/write_get_agent_activity_volatile.html.tmpl`	New HTML template for scenario visualization displaying metrics (highest action_seq, zome call timings, startup/shutdown counts, errors, P2P metrics) with conditional rendering blocks.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

PR #333: Modifies bindings/runner startup and Holochain config plumbing for metrics path and conductor lifecycle management, sharing code-level changes with conductor/runner refactoring.
PR #499: Adjusts Holochain runner metrics path propagation and related conductor configuration APIs, directly overlapping with metrics plumbing changes.
PR #395: Adds scenario-specific summariser and template infrastructure for write/get-agent-activity variants, extending the same patterns applied here for the volatile variant.

Suggested reviewers

ThetaSinner
jost-s

🚥 Pre-merge checks | ✅ 6

✅ Passed checks (6 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'Feat/dht sync with node volitility' describes the main feature but contains a misspelling ('volitility' should be 'volatility') and uses a generic format that could be more descriptive of the actual change.
Description check	✅ Passed	The PR description includes a summary, references issue `#443`, explains the changes (copying write_get_agent_activity with conductor volatility), and includes the required TODO checklist with all expected sections.
Linked Issues check	✅ Passed	The PR substantially addresses issue `#443` by adding the write_get_agent_activity_volatile scenario with conductor restart/stop functionality and summariser support, though the volatile behavior implementation appears simplified compared to the detailed requirements in the issue.
Out of Scope Changes check	✅ Passed	All changes align with the scope of adding conductor volatility support: new scenario crate, holochain binding updates, workflow configuration, summariser module, and visualiser template. No extraneous changes detected.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Merge Conflict Detection	✅ Passed	✅ No merge conflicts detected when merging into `main`

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch feat/dht-sync-with-node-volitility

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

github-actions · 2026-02-14T00:47:09Z

The following will be added to the changelog

[0.6.1] - 2026-02-14

Features

Basic summary visualiser for write_get_agent_activity_volatile scenario
Summariser for write_get_agent_activity_volatile scenario
Write_get_agent_activity_volatile scenario
Shutdown and restart conductor functions for holochain binding

CI

Run write_get_agent_activity_volatile scenario in CI

mattyg · 2026-02-14T00:49:05Z

bindings/runner/src/holochain_runner.rs

Split up creating and starting the HolochainRunner into separate functions. Add a function to shutdown the conductor without deleting its root directory.

mattyg · 2026-02-14T00:58:11Z

bindings/runner/src/holochain_runner.rs

 }
+
+#[cfg(test)]
+mod tests {


Unit test that conductor data is deleted on drop, not shutdown.

mattyg · 2026-02-14T00:59:50Z

bindings/runner/src/common.rs

+///     Ok(())
+/// }
+/// ```
+pub fn stop_holochain_conductor<SV: UserValuesConstraint>(


Function to stop holochain conductor, to use from scenarios.

…scenario

cocogitto-bot · 2026-02-14T01:03:27Z

✔️ 77fda38...f3ad2a5 - Conventional commits check succeeded.

mattyg · 2026-02-14T01:03:57Z

bindings/runner/src/common.rs

 /// If starting the conductor fails, attempts to clean up the conductor root directory.
 /// Returns [`HolochainRunner`] on success.
-async fn start_holochain_conductor(
+async fn create_and_start_holochain_conductor(


Renamed for clarity.

mattyg · 2026-02-14T01:04:20Z

bindings/runner/src/common.rs

+/// If using an external conductor via connection-string, this function does nothing.
+pub fn start_holochain_conductor<SV: UserValuesConstraint>(
+    ctx: &mut AgentContext<HolochainRunnerContext, HolochainAgentContext<SV>>,
+) -> WindTunnelResult<()> {


Function to start a previously created conductor, to use from scenarios.

mattyg · 2026-02-14T01:05:25Z

bindings/runner/src/context.rs

 pub struct HolochainAgentContext<T: UserValuesConstraint = DefaultScenarioValues> {
    pub(crate) installed_app_id: Option<String>,
+    /// The RoleName for the cell in the `cell_id` field
+    pub(crate) cell_role_name: Option<RoleName>,


Add the "primary" cell's role name to context, so that we can access to call use_installed_app.

This will need to be if we want to support multiple cells, but I think its fine for now.

mattyg · 2026-02-14T01:06:21Z

bindings/runner/src/holochain_runner.rs

+
+    /// The path where influxive metrics will be written, by setting the
+    /// env variable HOLOCHAIN_INFLUXIVE_FILE for the holochain process.
+    metrics_path: Option<PathBuf>,


Move HOLOCHAIN_INFLUXIVE_FILE config into HolochainRunner config, so it can be applied when the conductor is started.

mattyg · 2026-02-14T01:07:21Z

scenarios/write_get_agent_activity_volatile/src/main.rs

+
+        ctx.get_mut().scenario_values.shutdown_conductor_at = Some(Timestamp::from_micros(
+            Timestamp::now().as_micros() + conductor_on_duration().as_micros() as i64,
+        ));


When the conductor is started by a get_agent_activity_volatile agent, schedule its next shutdown time.

mattyg · 2026-02-14T01:08:24Z

scenarios/write_get_agent_activity_volatile/src/main.rs

+    let reporter = ctx.runner_context().reporter();
+
+    if let Some(shutdown_at) = ctx.get().scenario_values.shutdown_conductor_at
+        && Timestamp::now() >= shutdown_at


If the scheduled shutdown time has passed, shutdown the conductor and sleep for a randomly selected time. Then start the conductor and schedule its next shutdown time.

mattyg · 2026-02-14T01:10:55Z

summary-visualiser/templates/scenarios/write_get_agent_activity_volatile.html.tmpl

+        {{ template "scenario_metric" (dict
+            "name" "Startups"
+            "description" "The number of conductor startups by <code>get_agent_activity_volatile</code> peers."
+            "content" (tmpl.Exec "scalar" (dict "value" .startups.end "label" "Total" "unit" "startups"))


This is not very meaningful information. It just shows the total number of startups (and shutdown below). Is there some preferences on how this should be summarised and displayed?

Do we want to graph when individual agents are running? Or the total number of running agents in a single graph?

coderabbitai

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

bindings/runner/src/common.rs (1)
327-329: ⚠️ Potential issue | 🟠 Major

use_installed_app doesn't set cell_role_name, unlike install_app.

install_app (line 239) sets ctx.get_mut().cell_role_name = Some(role_name.clone()), but use_installed_app does not, despite receiving role_name as a parameter. This means calling use_installed_app standalone (outside of start_holochain_conductor) leaves cell_role_name as None if install_app was never called, causing a panic on cell_role_name().
Proposed fix
     ctx.get_mut().installed_app_id = Some(installed_app_id);
+    ctx.get_mut().cell_role_name = Some(role_name.clone());
     ctx.get_mut().cell_id = Some(cell_id);
     ctx.get_mut().app_client = Some(app_client);

🤖 Fix all issues with AI agents

In @.github/workflows/test.yaml:
- Around line 238-244: The workflow step named "Smoke test -
write_get_agent_activity_volatile" uses the wrong behaviour flag; replace the
argument "--behaviour get_agent_activity:1" with "--behaviour
get_agent_activity_volatile:1" so the smoke test runs the volatile behaviour
registered by the write_get_agent_activity_volatile scenario (refer to the
scenario name write_get_agent_activity_volatile and the behaviour identifier
get_agent_activity_volatile in the scenario code).

In `@bindings/runner/src/common.rs`:
- Around line 889-920: The doc comment for stop_holochain_conductor is
inaccurate: it does not drop the HolochainRunner or clean up directories but
instead takes the runner, calls shutdown() (which only drops the Child handle)
and then reassigns the runner back into ctx.get_mut().holochain_runner; update
the documentation on stop_holochain_conductor to state that it invokes
HolochainRunner::shutdown(), leaves the HolochainRunner instance in the context,
and does not remove on-disk directories (only Drop for HolochainRunner would do
that), or alternatively, if the intended behavior is to fully stop and remove
the runner, change the implementation to not reinsert the runner into
ctx.get_mut().holochain_runner so the HolochainRunner is dropped and its Drop
runs. Ensure references to HolochainRunner, shutdown(), and
ctx.get_mut().holochain_runner are clear in the updated comment or code change.

In `@bindings/runner/src/holochain_runner.rs`:
- Around line 174-198: In the create function, fix the typo in the error message
used in with_context for fs::create_dir_all: replace the string "Failed to
create cChonductor root directory '{}'" with the correct "Failed to create
conductor root directory '{}'" (reference symbols: create,
config.conductor_root_path).

In `@nomad/job-variants/canonical/vars/write_get_agent_activity_volatile.json`:
- Around line 1-14: The workflow currently reads .behaviours which doesn't exist
in these vars files, causing REQUIRED_NODES to default to 1; change the jq
expression in nomad-common.yaml (where it uses ".behaviours // 1 | length") to
instead read .assignments and sum each assignment's nodes (e.g., map(.nodes) |
add or equivalent) so REQUIRED_NODES reflects the total nodes (2 + 8 = 10 for
this file); update the expression that computes REQUIRED_NODES to reference
.assignments and aggregate the nodes field rather than using length on
behaviours.

In `@scenarios/write_get_agent_activity_volatile/README.md`:
- Line 5: Update the README sentence in the write_get_agent_activity_volatile
scenario to correct the typos: change "restsart" to "restart" and "semi random"
to "semi-random" so the line reads something like "A scenario where 'write'
peers create entries, while 'get_agent_activity_volatile' peers each query a
single 'write' agent's activity with `get_agent_activity` but shutdown and
restart their conductors at semi-random intervals." Reference the scenario
identifier write_get_agent_activity_volatile and the sentence containing
"get_agent_activity" to locate the text to edit.

In `@summariser/src/scenario/write_get_agent_activity_volatile.rs`:
- Line 81: The context string for the shutdowns metric is incorrect: replace the
misleading "Startup stats" message used on the shutdowns field so the error
context matches the metric; locate the shutdowns entry where
counter_stats(shutdowns, "value").context("Startup stats")? is called and change
the context to a correct message like "Shutdown stats" (or similarly
descriptive) so counter_stats and the shutdowns field report the proper context
on error.

🧹 Nitpick comments (9)

summariser/test_data/2_query_results/1ef6939655a0e1f28aaf75210edc117329ea642d5a3c63d01cd920209e15bbf8.json (1)

1-1: LGTM! Test data looks valid.

The JSON fixture correctly represents time-series network interface statistics with consistent timestamps and monotonically increasing packet counts. The single-line format likely matches actual query results, which is appropriate for test data.

Minor note: Consider adding a trailing newline at the end of the file per text file conventions, though this doesn't affect functionality.
summariser/test_data/2_query_results/b69f220c005ebe7196845718f6f8ad954d1c42d9c8202fc831af4e680e153c5d.json (1)
1-1: Consider adding a trailing newline and pretty-printing the JSON.

The JSON is valid and the data looks reasonable for testing the volatile scenario. Two optional improvements for maintainability:

Missing trailing newline: Adding a newline at the end of the file follows Unix conventions and prevents git warnings.

Single-line formatting: Pretty-printing the JSON would make diffs more readable when this test data changes in the future.

Since this is test data, these are minor quality-of-life improvements rather than functional issues.
📝 Example with formatting improvements
-[{"time":"2026-02-13 22:40:30","available_percent":24.682858245561675},{"time":"2026-02-13 22:40:40","available_percent":24.18100374370668},{"time":"2026-02-13 22:40:50","available_percent":25.30368196018231},{"time":"2026-02-13 22:41:00","available_percent":25.372369050207016},{"time":"2026-02-13 22:41:10","available_percent":25.222390256199624},{"time":"2026-02-13 22:41:20","available_percent":24.729261072168047}]
+[
+  {"time":"2026-02-13 22:40:30","available_percent":24.682858245561675},
+  {"time":"2026-02-13 22:40:40","available_percent":24.18100374370668},
+  {"time":"2026-02-13 22:40:50","available_percent":25.30368196018231},
+  {"time":"2026-02-13 22:41:00","available_percent":25.372369050207016},
+  {"time":"2026-02-13 22:41:10","available_percent":25.222390256199624},
+  {"time":"2026-02-13 22:41:20","available_percent":24.729261072168047}
+]
scenarios/write_get_agent_activity_volatile/src/main.rs (3)
106-114: Silently discarding the result of execute_in_place.

If the executor fails, the error is swallowed by let _ = .... Consider at least logging the error so conductor lifecycle issues during the off-period are diagnosable.
♻️ Proposed fix
-        let _ = ctx
+        if let Err(e) = ctx
             .runner_context()
             .executor()
             .execute_in_place(async move {
                 tokio::time::sleep(off_duration).await;
                 Ok(())
-            });
+            }) {
+            log::warn!("Error during conductor off-period sleep: {e:?}");
+        }
159-161: Unnecessary .clone() on an already owned value.

write_peer is owned via the if let Some(write_peer) binding; the clone is redundant.
♻️ Proposed fix
-            if let Some(write_peer) = maybe_write_peer {
-                ctx.get_mut().scenario_values.write_peer = Some(write_peer.clone());
+            if let Some(write_peer) = maybe_write_peer {
+                ctx.get_mut().scenario_values.write_peer = Some(write_peer);
             }
14-19: Simplify random selection: use random_range() instead of collecting into a Vec.

In rand 0.9+, rng.random_range(range) samples directly from a range without materializing it. The current implementation unnecessarily collects the entire range into a vector.
♻️ Suggested change
fn choose_random_duration_in_range(range: Range<u64>) -> Duration {
    let mut rng = rand::rng();
-   let options: Vec<u64> = range.collect();
-   let choice = options.choose(&mut rng).unwrap();
-   Duration::from_secs(*choice)
+   Duration::from_secs(rng.random_range(range))
}
summary-visualiser/templates/scenarios/write_get_agent_activity_volatile.html.tmpl (1)

13-46: Minor: the <div class="scenario-metrics"> wrapper renders even when .scenario_metrics is absent.

The {{ with }} guard is inside the div, so an empty <div class="scenario-metrics"></div> is emitted when there are no metrics. Consider moving the div inside the {{ with }} block for cleanliness, or leave as-is if this is the established pattern.
summariser/src/scenario/write_get_agent_activity_volatile.rs (1)
55-62: Unnecessary .clone() before .lazy().

The DataFrame returned from the await? is already owned. The .clone() on Line 59 is redundant.
♻️ Proposed fix
     let get_agent_activity_full_zome_calls =
         query::query_zome_call_instrument_data(client.clone(), &summary)
             .await
             .context("Load get_agent_activity_full zome call data")?
-            .clone()
             .lazy()
             .filter(col("fn_name").eq(lit("get_agent_activity_full")))
             .collect()?;
bindings/runner/src/common.rs (1)

926-961: Log messages say "Restarting" but the function is start_holochain_conductor.

Lines 930 and 934 log "Restarting Holochain conductor" but the public function is named start_holochain_conductor. Consider aligning the log messages with the function name for clarity when debugging, or renaming the function to restart_holochain_conductor since it also reconfigures URLs and reconnects the app (lines 951–953).

bindings/runner/src/holochain_runner.rs (1)

270-275: shutdown() relies on kill_on_drop(true) — the conductor is killed, not gracefully shut down.

shutdown() sets holochain_handle = None, dropping the Child process. Since the process is spawned with .kill_on_drop(true) (line 215), this sends a SIGKILL (on Unix). The doc comment says "gracefully shut down" in stop_holochain_conductor, but SIGKILL is not graceful — the conductor gets no chance to flush state.

For a volatility scenario simulating node crashes, this may be intentional. But if you intend a graceful shutdown in normal teardown, consider sending SIGTERM or writing a shutdown command to stdin before dropping.

coderabbitai · 2026-02-14T01:20:19Z

.github/workflows/test.yaml

+      - name: Smoke test - write_get_agent_activity_volatile
+        run: |
+          MIN_AGENTS=2 nix run .#rust-smoke-test -- --package write_get_agent_activity_volatile -- --agents 2 --behaviour write:1 --behaviour get_agent_activity:1 --duration 5 --no-progress
+
+          echo "==> Available space after step"
+          echo
+          df -h


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Find the scenario directories echo "=== Looking for write_get_agent_activity_volatile scenario ===" fd --type d "write_get_agent_activity_volatile" --max-depth 3 echo "" echo "=== Looking for write_get_agent_activity scenario (non-volatile) ===" fd --type d "write_get_agent_activity" --max-depth 3 | grep -v volatile

Repository: holochain/wind-tunnel

Length of output: 278

🏁 Script executed:

#!/bin/bash # Check the main.rs files in both scenarios echo "=== Behaviours in write_get_agent_activity_volatile/main.rs ===" if [ -f "scenarios/write_get_agent_activity_volatile/main.rs" ]; then cat scenarios/write_get_agent_activity_volatile/main.rs else echo "File not found, searching..." find . -path "*/write_get_agent_activity_volatile/main.rs" -exec cat {} \; fi

Repository: holochain/wind-tunnel

Length of output: 157

🏁 Script executed:

#!/bin/bash # Check the non-volatile version for comparison echo "=== Behaviours in write_get_agent_activity/main.rs (non-volatile) ===" if [ -f "scenarios/write_get_agent_activity/main.rs" ]; then cat scenarios/write_get_agent_activity/main.rs else echo "File not found, searching..." find . -path "*/write_get_agent_activity/main.rs" -exec cat {} \; fi

Repository: holochain/wind-tunnel

Length of output: 163

🏁 Script executed:

#!/bin/bash # Also check the workflow file context around those lines echo "=== Workflow file lines 230-250 ===" sed -n '230,250p' .github/workflows/test.yaml

Repository: holochain/wind-tunnel

Length of output: 1090

🏁 Script executed:

#!/bin/bash # List the structure of both scenarios echo "=== write_get_agent_activity_volatile directory structure ===" find scenarios/write_get_agent_activity_volatile -type f | head -20 echo "" echo "=== write_get_agent_activity directory structure ===" find scenarios/write_get_agent_activity -type f | head -20

Repository: holochain/wind-tunnel

Length of output: 483

🏁 Script executed:

#!/bin/bash # Search for where behaviour names are defined/registered echo "=== Search for 'get_agent_activity_volatile' string in codebase ===" rg "get_agent_activity_volatile" --type rust --type yaml echo "" echo "=== Search for behaviour registration patterns in scenarios ===" rg "behaviour.*write" scenarios/ --type rust -A 2 -B 2 | head -50

Repository: holochain/wind-tunnel

Length of output: 7724

🏁 Script executed:

#!/bin/bash # Look for the actual behaviour implementation names in rust files echo "=== Looking for behaviour struct/impl definitions ===" fd --type f "\.(rs|yaml)$" scenarios/ --exec grep -l "get_agent_activity" {} \;

Repository: holochain/wind-tunnel

Length of output: 390

🏁 Script executed:

#!/bin/bash # View the volatile scenario main.rs to see behaviour registration echo "=== write_get_agent_activity_volatile/src/main.rs (behaviour registrations) ===" grep -n "use_named_agent_behaviour\|assigned_behaviour" scenarios/write_get_agent_activity_volatile/src/main.rs echo "" echo "=== write_get_agent_activity/src/main.rs (behaviour registrations) ===" grep -n "use_named_agent_behaviour\|assigned_behaviour" scenarios/write_get_agent_activity/src/main.rs

Repository: holochain/wind-tunnel

Length of output: 643

Behaviour name incorrect — should be get_agent_activity_volatile:1.

The smoke test uses --behaviour get_agent_activity:1, but the write_get_agent_activity_volatile scenario registers the behaviour as get_agent_activity_volatile (see scenarios/write_get_agent_activity_volatile/src/main.rs line 177). This is a copy-paste from the write_get_agent_activity smoke test (line 232) without updating the behaviour name. The test will fail to run the intended volatile behaviour.

Proposed fix

- name: Smoke test - write_get_agent_activity_volatile run: | - MIN_AGENTS=2 nix run .#rust-smoke-test -- --package write_get_agent_activity_volatile -- --agents 2 --behaviour write:1 --behaviour get_agent_activity:1 --duration 5 --no-progress + MIN_AGENTS=2 nix run .#rust-smoke-test -- --package write_get_agent_activity_volatile -- --agents 2 --behaviour write:1 --behaviour get_agent_activity_volatile:1 --duration 5 --no-progress

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

- name: Smoke test - write_get_agent_activity_volatile

run: |

MIN_AGENTS=2 nix run .#rust-smoke-test -- --package write_get_agent_activity_volatile -- --agents 2 --behaviour write:1 --behaviour get_agent_activity:1 --duration 5 --no-progress

echo "==> Available space after step"

echo

df -h

- name: Smoke test - write_get_agent_activity_volatile

run: |

MIN_AGENTS=2 nix run .#rust-smoke-test -- --package write_get_agent_activity_volatile -- --agents 2 --behaviour write:1 --behaviour get_agent_activity_volatile:1 --duration 5 --no-progress

echo "==> Available space after step"

echo

df -h

🤖 Prompt for AI Agents

In @.github/workflows/test.yaml around lines 238 - 244, The workflow step named "Smoke test - write_get_agent_activity_volatile" uses the wrong behaviour flag; replace the argument "--behaviour get_agent_activity:1" with "--behaviour get_agent_activity_volatile:1" so the smoke test runs the volatile behaviour registered by the write_get_agent_activity_volatile scenario (refer to the scenario name write_get_agent_activity_volatile and the behaviour identifier get_agent_activity_volatile in the scenario code).

coderabbitai · 2026-02-14T01:20:20Z

bindings/runner/src/common.rs

+/// Stops the Holochain conductor if one is running.
+///
+/// This function will take the `holochain_runner` from the context and drop it, which will
+/// gracefully shut down the conductor process and clean up its directories.
+///
+/// If no conductor is running this function does nothing.
+/// If using an external conductor via connection-string, this function does nothing.
+///
+/// Call this function as follows:
+/// ```rust
+/// use holochain_wind_tunnel_runner::prelude::{HolochainAgentContext, HolochainRunnerContext, stop_holochain_conductor, AgentContext, HookResult};
+///
+/// fn agent_behaviour(ctx: &mut AgentContext<HolochainRunnerContext, HolochainAgentContext>) -> HookResult {
+///     stop_holochain_conductor(ctx)?;
+///     Ok(())
+/// }
+/// ```
+pub fn stop_holochain_conductor<SV: UserValuesConstraint>(
+    ctx: &mut AgentContext<HolochainRunnerContext, HolochainAgentContext<SV>>,
+) -> WindTunnelResult<()> {
+    if let Some(mut runner) = ctx.get_mut().holochain_runner.take() {
+        log::info!("Stopping Holochain conductor");
+        runner.shutdown();
+        log::info!("Holochain conductor stopped");
+
+        ctx.get_mut().holochain_runner = Some(runner);
+    } else {
+        log::debug!("No Holochain conductor is running, so nothing to stop");
+    }
+
+    Ok(())
+}


⚠️ Potential issue | 🟡 Minor

Documentation is misleading: runner is not dropped, and directories are not cleaned up on shutdown.

The doc comment says the function will "take the holochain_runner from the context and drop it, which will gracefully shut down the conductor process and clean up its directories." In practice:

The runner is taken, shutdown() is called (which only drops the Child process handle), and then the runner is put back into the context (line 914).

shutdown() does not clean up directories — only Drop::drop on HolochainRunner does.

This mismatch could confuse callers about post-call state (is the runner gone? are directories deleted?).

Suggested doc fix

-/// This function will take the `holochain_runner` from the context and drop it, which will -/// gracefully shut down the conductor process and clean up its directories. +/// This function will shut down the running conductor process while retaining the +/// [`HolochainRunner`] in the context so it can be restarted later with +/// [`start_holochain_conductor`]. Conductor directories are preserved until the runner is dropped. /// -/// If no conductor is running this function does nothing. -/// If using an external conductor via connection-string, this function does nothing. +/// If no conductor is running, this function does nothing. +/// If using an external conductor via connection-string, this function does nothing.

🤖 Prompt for AI Agents

In `@bindings/runner/src/common.rs` around lines 889 - 920, The doc comment for stop_holochain_conductor is inaccurate: it does not drop the HolochainRunner or clean up directories but instead takes the runner, calls shutdown() (which only drops the Child handle) and then reassigns the runner back into ctx.get_mut().holochain_runner; update the documentation on stop_holochain_conductor to state that it invokes HolochainRunner::shutdown(), leaves the HolochainRunner instance in the context, and does not remove on-disk directories (only Drop for HolochainRunner would do that), or alternatively, if the intended behavior is to fully stop and remove the runner, change the implementation to not reinsert the runner into ctx.get_mut().holochain_runner so the HolochainRunner is dropped and its Drop runs. Ensure references to HolochainRunner, shutdown(), and ctx.get_mut().holochain_runner are clear in the updated comment or code change.

coderabbitai · 2026-02-14T01:20:20Z

bindings/runner/src/holochain_runner.rs

+    /// Create a holochain runner with the provided config, but do not start it.
+    pub fn create(config: &HolochainConfig) -> WindTunnelResult<Self> {
+        fs::create_dir_all(&config.conductor_root_path).with_context(|| {
            format!(
-                "Failed to create conductor root directory '{}'",
-                conductor_root_path.display()
+                "Failed to create cChonductor root directory '{}'",
+                config.conductor_root_path.display()
            )
        })?;
-        let conductor_root_path = conductor_root_path.canonicalize()?;
+        let conductor_root_path = config.conductor_root_path.canonicalize()?;

        log::trace!(
            "Writing conductor config file to '{}'",
            conductor_root_path.display()
        );
-        let conductor_config_path = write_config(
-            config.conductor_config_root_path.clone(),
+        write_config(
+            config.conductor_root_path.clone().into(),
            &config.conductor_config,
        )
-        .context("Failed to write conductor config file")?
-        .to_path_buf();
+        .context("Failed to write conductor config file")?;
+
+        Ok(Self {
+            config: config.clone(),
+            holochain_handle: None,
+        })
+    }


⚠️ Potential issue | 🟡 Minor

Typo in error message: "cChonductor" → "conductor".

Line 178: "Failed to create cChonductor root directory".

Fix

- "Failed to create cChonductor root directory '{}'", + "Failed to create conductor root directory '{}'",

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

/// Create a holochain runner with the provided config, but do not start it.

pub fn create(config: &HolochainConfig) -> WindTunnelResult<Self> {

fs::create_dir_all(&config.conductor_root_path).with_context(|| {

format!(

"Failed to create conductor root directory '{}'",

conductor_root_path.display()

"Failed to create cChonductor root directory '{}'",

config.conductor_root_path.display()

)

})?;

let conductor_root_path = conductor_root_path.canonicalize()?;

let conductor_root_path = config.conductor_root_path.canonicalize()?;

log::trace!(

"Writing conductor config file to '{}'",

conductor_root_path.display()

);

let conductor_config_path = write_config(

config.conductor_config_root_path.clone(),

write_config(

config.conductor_root_path.clone().into(),

&config.conductor_config,

)

.context("Failed to write conductor config file")?

.to_path_buf();

.context("Failed to write conductor config file")?;

Ok(Self {

config: config.clone(),

holochain_handle: None,

})

}

/// Create a holochain runner with the provided config, but do not start it.

pub fn create(config: &HolochainConfig) -> WindTunnelResult<Self> {

fs::create_dir_all(&config.conductor_root_path).with_context(|| {

format!(

"Failed to create conductor root directory '{}'",

config.conductor_root_path.display()

)

})?;

let conductor_root_path = config.conductor_root_path.canonicalize()?;

log::trace!(

"Writing conductor config file to '{}'",

conductor_root_path.display()

);

write_config(

config.conductor_root_path.clone().into(),

&config.conductor_config,

)

.context("Failed to write conductor config file")?;

Ok(Self {

config: config.clone(),

holochain_handle: None,

})

}

🤖 Prompt for AI Agents

In `@bindings/runner/src/holochain_runner.rs` around lines 174 - 198, In the create function, fix the typo in the error message used in with_context for fs::create_dir_all: replace the string "Failed to create cChonductor root directory '{}'" with the correct "Failed to create conductor root directory '{}'" (reference symbols: create, config.conductor_root_path).

coderabbitai · 2026-02-14T01:20:20Z

nomad/job-variants/canonical/vars/write_get_agent_activity_volatile.json

+{
+  "scenario_name": "write_get_agent_activity_volatile",
+  "duration": 900,
+  "assignments": [
+    {
+      "behaviour": "write",
+      "nodes": 2
+    },
+    {
+      "behaviour": "get_agent_activity_volatile",
+      "nodes": 8
+    }
+  ]
+}


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Check field usage in vars files and workflow parsing logic echo "=== Checking JSON structure in canonical vars files ===" fd -e json . nomad/job-variants/canonical/vars/ 2>/dev/null | head -20 | while read f; do echo "File: $f" jq 'keys' "$f" 2>/dev/null | head -3 echo "" done echo "=== Searching for 'behaviours' field in all vars files ===" rg '"behaviours"' --glob 'nomad/job-variants/*/vars/*.json' 2>/dev/null | head -10 echo "=== Searching for 'assignments' field in all vars files ===" rg '"assignments"' --glob 'nomad/job-variants/*/vars/*.json' 2>/dev/null | head -10 echo "=== Examining nomad-common.yaml around line 86 ===" if [ -f nomad-common.yaml ]; then sed -n '80,95p' nomad-common.yaml elif [ -f .github/workflows/nomad-common.yaml ]; then sed -n '80,95p' .github/workflows/nomad-common.yaml fi

Repository: holochain/wind-tunnel

Length of output: 3927

Fix the workflow to read assignments instead of behaviours field.

The workflow in nomad-common.yaml (line 86) reads .behaviours // 1 | length, but this file uses assignments (along with all other vars files using node assignments). Since .behaviours doesn't exist in any vars file, it defaults to 1, making 1 | length evaluate to 1. This causes REQUIRED_NODES to be 1 instead of the correct sum (e.g., 10 for 2 + 8 nodes in this file), potentially allowing the job to submit before sufficient nodes are available.

Update the jq expression to read .assignments and sum the nodes values appropriately, or align the field naming across all configs.

🤖 Prompt for AI Agents

In `@nomad/job-variants/canonical/vars/write_get_agent_activity_volatile.json` around lines 1 - 14, The workflow currently reads .behaviours which doesn't exist in these vars files, causing REQUIRED_NODES to default to 1; change the jq expression in nomad-common.yaml (where it uses ".behaviours // 1 | length") to instead read .assignments and sum each assignment's nodes (e.g., map(.nodes) | add or equivalent) so REQUIRED_NODES reflects the total nodes (2 + 8 = 10 for this file); update the expression that computes REQUIRED_NODES to reference .assignments and aggregate the nodes field rather than using length on behaviours.

coderabbitai · 2026-02-14T01:20:20Z

scenarios/write_get_agent_activity_volatile/README.md

+
+### Description
+
+A scenario where 'write' peers create entries, while 'get_agent_activity_volatile' peers each query a single 'write' agent's activity with `get_agent_activity` but shutdown and restsart their conductors at semi random intervals.


⚠️ Potential issue | 🟡 Minor

Fix typos: "restsart" → "restart" and "semi random" → "semi-random".

📝 Proposed fix

-A scenario where 'write' peers create entries, while 'get_agent_activity_volatile' peers each query a single 'write' agent's activity with `get_agent_activity` but shutdown and restsart their conductors at semi random intervals. +A scenario where 'write' peers create entries, while 'get_agent_activity_volatile' peers each query a single 'write' agent's activity with `get_agent_activity` but shutdown and restart their conductors at semi-random intervals.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

A scenario where 'write' peers create entries, while 'get_agent_activity_volatile' peers each query a single 'write' agent's activity with `get_agent_activity` but shutdown and restsart their conductors at semi random intervals.

A scenario where 'write' peers create entries, while 'get_agent_activity_volatile' peers each query a single 'write' agent's activity with `get_agent_activity` but shutdown and restart their conductors at semi-random intervals.

🧰 Tools

🪛 LanguageTool

[grammar] ~5-~5: Ensure spelling is correct
Context: ...h get_agent_activity but shutdown and restsart their conductors at semi random interva...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

[grammar] ~5-~5: Use a hyphen to join words.
Context: ...wn and restsart their conductors at semi random intervals. Before a target 'writ...

(QB_NEW_EN_HYPHEN)

🤖 Prompt for AI Agents

In `@scenarios/write_get_agent_activity_volatile/README.md` at line 5, Update the README sentence in the write_get_agent_activity_volatile scenario to correct the typos: change "restsart" to "restart" and "semi random" to "semi-random" so the line reads something like "A scenario where 'write' peers create entries, while 'get_agent_activity_volatile' peers each query a single 'write' agent's activity with `get_agent_activity` but shutdown and restart their conductors at semi-random intervals." Reference the scenario identifier write_get_agent_activity_volatile and the sentence containing "get_agent_activity" to locate the text to edit.

coderabbitai · 2026-02-14T01:20:20Z

summariser/src/scenario/write_get_agent_activity_volatile.rs

+            )
+            .context("Timing stats for zome call get_agent_activity_full")?,
+            startups: counter_stats(startups, "value").context("Startup stats")?,
+            shutdowns: counter_stats(shutdowns, "value").context("Startup stats")?,


⚠️ Potential issue | 🟡 Minor

Copy-paste bug: error context says "Startup stats" but this is for shutdowns.

🐛 Proposed fix

- shutdowns: counter_stats(shutdowns, "value").context("Startup stats")?, + shutdowns: counter_stats(shutdowns, "value").context("Shutdown stats")?,

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

shutdowns: counter_stats(shutdowns, "value").context("Startup stats")?,

shutdowns: counter_stats(shutdowns, "value").context("Shutdown stats")?,

🤖 Prompt for AI Agents

In `@summariser/src/scenario/write_get_agent_activity_volatile.rs` at line 81, The context string for the shutdowns metric is incorrect: replace the misleading "Startup stats" message used on the shutdowns field so the error context matches the metric; locate the shutdowns entry where counter_stats(shutdowns, "value").context("Startup stats")? is called and change the context to a correct message like "Shutdown stats" (or similarly descriptive) so counter_stats and the shutdowns field report the proper context on error.

mattyg commented Feb 14, 2026

View reviewed changes

mattyg force-pushed the feat/dht-sync-with-node-volitility branch from 0a0d767 to 75fed31 Compare February 14, 2026 00:57

mattyg commented Feb 14, 2026

View reviewed changes

mattyg added 5 commits February 13, 2026 18:03

feat: shutdown and restart conductor functions for holochain binding

77fda38

feat: write_get_agent_activity_volatile scenario

17cc9b4

feat: summariser for write_get_agent_activity_volatile scenario

b8c42a5

feat: basic summary visualiser for write_get_agent_activity_volatile …

e10d8fc

…scenario

ci: run write_get_agent_activity_volatile scenario in CI

f3ad2a5

mattyg force-pushed the feat/dht-sync-with-node-volitility branch from 75fed31 to f3ad2a5 Compare February 14, 2026 01:03

mattyg commented Feb 14, 2026

View reviewed changes

mattyg marked this pull request as ready for review February 14, 2026 01:11

mattyg requested a review from a team February 14, 2026 01:11

coderabbitai bot reviewed Feb 14, 2026

View reviewed changes


		### Description

		A scenario where 'write' peers create entries, while 'get_agent_activity_volatile' peers each query a single 'write' agent's activity with `get_agent_activity` but shutdown and restsart their conductors at semi random intervals.

	shutdowns: counter_stats(shutdowns, "value").context("Startup stats")?,
	shutdowns: counter_stats(shutdowns, "value").context("Shutdown stats")?,

Conversation

mattyg commented Feb 14, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

TODO:

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Feb 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Uh oh!

github-actions bot commented Feb 14, 2026

[0.6.1] - 2026-02-14

Features

CI

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cocogitto-bot bot commented Feb 14, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 14, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 14, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 14, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 14, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 14, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 14, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

mattyg commented Feb 14, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Feb 14, 2026 •

edited

Loading