Redesign local gateway setup as out-of-process SetupEngine #529
ranjeshj wants to merge 26 commits into
Codex review: needs real behavior proof before merge. Reviewed May 24, 2026, 8:16 PM ET / 00:16 UTC.

Summary
Reproducibility: yes, for the review findings by source inspection: PR head still uses

Merge readiness
Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Review details
Best possible solution: Keep the PR open, fix the line-level blockers, then require maintainer architecture review plus redacted fresh-install, reconnect, and uninstall proof before merge.
Do we have a high-confidence way to reproduce the issue? Yes, for the review findings by source inspection: PR head still uses
Is this the best way to solve the issue? No. The out-of-process SetupEngine direction may be viable, but this branch needs the supported tray route, localization parity, shell hardening, and real upgrade/fresh-install proof before it is the maintainable replacement.
Overall correctness: patch is incorrect.
Codex review notes: model gpt-5.5, reasoning high; reviewed against ef6ac8acbab2.

What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

New headless setup engine that installs WSL, configures OpenClaw gateway, pairs operator/node connections, and verifies end-to-end connectivity.
- Transactional pipeline with retry, rollback, and crash-recovery journal
- Structured JSONL logging with secret redaction
- v2 signature fix for local gateways in GatewayConnectionManager
- All 15 steps working E2E in headless mode

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

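The "transactional pipeline with rollback and journal" idea from this commit can be illustrated with a minimal Python sketch. It is not the engine's C# code: `Step`, `Pipeline`, and the in-memory `journal` list are hypothetical names chosen for illustration, and the real journal is presumably persisted to disk.

```python
from dataclasses import dataclass, field
from typing import Callable, List


def _noop() -> None:
    """Default rollback for steps that have nothing to undo."""


@dataclass
class Step:
    # Hypothetical step shape: a name, an action, and an optional undo.
    name: str
    run: Callable[[], None]
    rollback: Callable[[], None] = _noop


@dataclass
class Pipeline:
    steps: List[Step]
    # Stands in for the crash-recovery journal (assumed to be on disk in practice).
    journal: List[str] = field(default_factory=list)

    def execute(self) -> bool:
        completed: List[Step] = []
        for step in self.steps:
            self.journal.append(f"begin {step.name}")
            try:
                step.run()
            except Exception as exc:
                self.journal.append(f"fail {step.name}: {exc}")
                # Undo the steps that already succeeded, newest first.
                for done in reversed(completed):
                    done.rollback()
                    self.journal.append(f"rollback {done.name}")
                return False
            completed.append(step)
            self.journal.append(f"done {step.name}")
        return True
```

Rolling back in reverse order matters because later steps typically depend on what earlier steps created, so they have to be undone first.
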
- Configure gateway with bind=lan (0.0.0.0) instead of loopback to avoid unreliable WSL2 localhost port forwarding
- Add StartKeepaliveStep (step 16) to keep distro alive after setup completes, ensuring tray connects instantly on launch
- Add default-config.json and design doc

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…eway config
- Expand SetupConfig with nested WslConfig, GatewayConfig, CapabilitiesConfig, TraySettingsConfig, and PairingConfig (zero hardcoded values)
- Register node capabilities (stub INodeCapability) before ConnectAsync so gateway stores caps/commands from hello message
- Write settings.json after node pairing (EnableNodeMode=true + cap toggles) using merge logic that preserves existing user settings
- Make WSL wsl.conf generation config-driven (user, systemd, interop, etc.)
- Make gateway config-driven (bind mode, auth, health timeout, extra config)
- Write keepalive marker file to prevent tray duplicate keepalive
- Add fully commented default-config.json with all configurable properties

Verified: clean build (0 errors/0 warnings), full E2E run in 118s, tray auto-connects with capabilities registered.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

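The "merge logic that preserves existing user settings" mentioned for settings.json could look roughly like the following Python sketch. The function names and every key other than `EnableNodeMode` are assumptions made for illustration; the actual merge rules in the PR may differ.

```python
import json
from pathlib import Path
from typing import Any, Dict


def merge_settings(existing: Dict[str, Any], updates: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `existing` with `updates` applied; untouched keys survive."""
    merged = dict(existing)
    for key, value in updates.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Recurse so sibling keys inside nested objects are kept.
            merged[key] = merge_settings(merged[key], value)
        else:
            merged[key] = value
    return merged


def write_settings(path: Path, updates: Dict[str, Any]) -> Dict[str, Any]:
    """Load settings if present, merge in `updates`, and write the result back."""
    existing = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    merged = merge_settings(existing, updates)
    path.write_text(json.dumps(merged, indent=2), encoding="utf-8")
    return merged
```

A recursive merge is one reasonable choice here; a shallow merge would overwrite whole nested objects and could drop capability toggles the user had already set.
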
Adds DrainPendingApprovalsAsync to VerifyEndToEndStep that iteratively approves any remaining pending device or node pairing requests. This ensures the tray launches with zero 'Pairing approval pending' badges.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Standalone unpackaged WinUI app (no FunctionalUI dependency) with:
- Welcome page with lobster icon, info card, V2 text strings
- Capabilities page with 2-column grid, icons, descriptions, toggles
- Progress page with step groups, badges (spinner/check/error), log viewer
- Complete page with success/error state, launch tray button

Features:
- Mica backdrop + extended title bar
- DPI-aware window sizing (720x700 logical)
- UAC manifest (asInvoker) to avoid elevation prompt
- --headless bypass for automation
- Config-driven defaults from SetupConfig

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Welcome: flex layout, full V2 text, 'Install new WSL Gateway' button, confirmation dialog, 'Advanced setup' link (opens tray connection page)
- Title bar: lobster icon + 'OpenClaw Setup' text, 36px height
- Permissions page: 5 rows (notifications, camera, mic, location, screen), live status checks, 'Open Settings' buttons, 'Refresh status'
- Complete page: party popper, amber Node Mode banner, startup toggle, 'Finish' button with tray launch + optional registry startup entry
- Window height increased to 820px for better step row spacing

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Remove all hardcoded defaults from SetupConfig (DistroName, GatewayPort, BaseDistro, etc.)
- Both UI and headless exe now require a config file to run
- UI auto-loads bundled default-config.json from AppContext.BaseDirectory
- Headless Program.cs exits with error if no config found
- Added Content Include in csproj to bundle default-config.json with exe

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- CleanupStaleDistroStep: wsl --shutdown + delete orphaned VHD directory
- CompletePage: show error message + 'View full log' link on failure
- CompletePage: kill old tray, launch via openclaw://chat protocol
- Update SETUP_ENGINE_REDESIGN.md to reflect current implementation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- add interactive gateway wizard rendering for SetupEngine UI
- make wizard messages render links and device codes inline
- refine progress/log layout and setup failure visuals
- fix wizard retry/skip behavior and credential precedence
- harden WSL cleanup, base distro reuse, and missing WSL handling

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Add RollbackAsync to 6 SetupEngine steps for --uninstall support
- Add UninstallAsync to SetupPipeline (reverse rollback execution)
- Add --uninstall, --confirm-destructive, --json-output CLI flags
- Replace ShowOnboardingAsync with out-of-process SetupEngine.UI launch
- Rewrite CliUninstallHandler and SettingsPage uninstall to use SetupEngine
- Delete LocalGatewaySetup (4000+ lines), Onboarding, OnboardingV2 projects
- Extract GatewayConnectorInterfaces and WslCommandRunner from deleted code
- Fix StartupSetupState to scan per-gateway dirs for device tokens
- Simplify WSL keepalive to direct wsl process spawn
- Add SetupEngine to build.ps1 with post-build copy to WinUI output
- Set production defaults: DistroName=OpenClawGateway, GatewayPort=18789

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Create tests/OpenClaw.E2ETests/ project with end-to-end tests that exercise the full setup pipeline headless via SetupEngine CLI, spawn the tray app, and verify operator+node connectivity through MCP app.status/app.nodes calls.
- E2ESetupFixture: runs Program.Main() headless, patches settings for MCP, spawns tray process, polls connection status, cleans up via uninstall
- SetupAndConnectTests: verifies connected state and node capabilities
- McpClient: JSON-RPC client for MCP HTTP server verification
- CI workflow: parallel e2e job with test artifact upload for debugging
- SetupEngine.csproj: add RuntimeIdentifiers for RID-specific test builds

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Hanselman adversarial review (Opus + Codex) identified 18 issues. Fixed 12 reliability bugs across the setup engine:

HIGH consensus (both models):
- RetryExecutor: wrap action() in try/catch to prevent pipeline crashes
- CommandRunner: catch Win32Exception on process.Start()
- ConfigureGatewayStep: shell-escape ExtraConfig values

Verified LOW consensus:
- TryResetReloadModeAsync: use CancellationToken.None in finally
- TransactionJournal: catch IOException on writes
- CleanupStaleGatewayStep: delete setup-state.json from both AppData and LocalAppData
- AutoApprove: fall back to BootstrapToken when SharedGatewayToken is null
- TrayArtifactCleanup: protect DeleteFileIfExists with try/catch
- StartKeepaliveStep: remove unused stdout/stderr redirect
- PairOperatorStep: unsubscribe DeviceTokenReceived handler
- Program: run TrayArtifactCleanup on Cancelled outcome
- CleanupStaleDistroStep: retry VHD directory deletion with backoff

Added 66 unit tests in new OpenClaw.SetupEngine.Tests project:
- RetryExecutorTests (11), SetupPipelineTests (14), TransactionJournalTests (9), SetupLoggerTests (7), SetupConfigTests (18), SetupContextTests (7)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

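The RetryExecutor fix (wrapping the action in try/catch so an exception becomes a failed attempt instead of crashing the pipeline) can be sketched in Python as follows. `RetryResult`, `run_with_retry`, and the exponential backoff schedule are illustrative assumptions, not the C# implementation from this PR.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RetryResult:
    succeeded: bool
    attempts: int
    error: Optional[str] = None


def run_with_retry(action: Callable[[], None], max_attempts: int = 3,
                   base_delay: float = 0.0) -> RetryResult:
    """Run `action`, converting exceptions into retries rather than letting them escape."""
    last_error: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        try:
            action()
            return RetryResult(True, attempt)
        except Exception as exc:
            last_error = f"{type(exc).__name__}: {exc}"
            if attempt < max_attempts:
                # Assumed backoff: base_delay, 2*base_delay, 4*base_delay, ...
                time.sleep(base_delay * 2 ** (attempt - 1))
    return RetryResult(False, max_attempts, last_error)
```

Returning a result object keeps the decision about rollback with the caller, which is where a pipeline would normally make it.
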
Force-pushed: ceace09 to 328419f
- Replace OnboardingV2.Tests build/run steps with SetupEngine.Tests (66 unit tests)
- Remove empty OnClawTray.OnboardingV2.Tests project (superseded by SetupEngine)
- Drop --no-restore from E2E build step to fix RID-mismatch NETSDK1004

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Add -r win-x64 to SetupEngine.Tests build (allows implicit restore for RID-specific deps)
- Add -r win-x64 to SetupEngine.Tests run step
- Rename e2e job to e2etests for clarity

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…n, release packaging
- CleanupStaleGatewayStep: preserve SSH-tunneled and non-local gateway records instead of deleting by URL match alone
- InstallCliStep: validate HTTPS scheme, shell-quote URL, add --proto '=https' --tlsv1.2 to curl
- ci.yml: publish SetupEngine.UI into release package
- Add 8 unit tests covering gateway preservation and URL validation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

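The InstallCliStep hardening (validate the HTTPS scheme, shell-quote the URL, restrict curl to HTTPS and TLS 1.2+) might be approximated by the Python sketch below. `build_install_command` is a hypothetical helper, and the `-fsSL` flags are an assumption about how the installer is fetched; only `--proto '=https'` and `--tlsv1.2` come from the commit message.

```python
import shlex
from urllib.parse import urlsplit


def build_install_command(url: str) -> str:
    """Build a curl command line for a download URL, rejecting anything that is not https."""
    parts = urlsplit(url)
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError(f"installer URL must be an absolute https:// URL: {url!r}")
    # shlex.quote turns the URL into a single shell word, so characters such as
    # ';' or spaces cannot terminate the curl command and start another one.
    return f"curl --proto '=https' --tlsv1.2 -fsSL {shlex.quote(url)}"
```

The scheme check and the quoting cover different risks: the first blocks downgrade to plain HTTP, the second blocks command injection through a crafted URL.
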
…iagnostics
- Default gateway bind from 'lan' (0.0.0.0) to 'loopback' (127.0.0.1)
- Add ValidateWslLockdownStep: verify user, dirs, ownership after configure
- Replace --token argv with OPENCLAW_GATEWAY_TOKEN env var (9 call sites)
- Add ExistingConfigDetector for dynamic replacement dialog on WelcomePage
- Remove unimplemented OperatorScopes/NodeScopes/CliScopes from PairingConfig
- Add port conflict detection (ss -tlnp) and improved failure diagnostics
- Add RedactTokens helper for log sanitization
- Default SkipPermissions to false, add fallback tray exe path
- Fix docs drift: step count, default claims
- Add UI step group mappings for validate-wsl-lockdown and run-wizard
- 279 new lines of unit tests (lockdown, bind validation, token redaction, etc.)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

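Two of the changes above, passing the token through OPENCLAW_GATEWAY_TOKEN instead of argv and redacting tokens from logs, can be sketched together in Python. The regular expressions, the `[REDACTED]` placeholder, and the function names are assumptions for illustration; the PR's RedactTokens helper may match different patterns.

```python
import os
import re
from typing import Dict, Optional

# Assumed token shapes: a CLI flag, an env assignment, and a JSON field.
_TOKEN_PATTERNS = [
    re.compile(r"(--token[= ])(\S+)"),
    re.compile(r"(OPENCLAW_GATEWAY_TOKEN=)(\S+)"),
    re.compile(r'("token"\s*:\s*")([^"]+)'),
]


def redact_tokens(text: str) -> str:
    """Replace token values with a placeholder, keeping the surrounding text."""
    for pattern in _TOKEN_PATTERNS:
        text = pattern.sub(r"\1[REDACTED]", text)
    return text


def gateway_env(token: str, base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a child-process environment carrying the token, so it never appears in argv."""
    env = dict(os.environ if base is None else base)
    env["OPENCLAW_GATEWAY_TOKEN"] = token
    return env
```

Command-line arguments are commonly visible to other processes through process listings, which is the usual motivation for moving secrets into the environment; the returned dictionary could be passed as the `env` argument of `subprocess.run`.
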
Add setup run locking, append-only recovery journals, atomic persistence, bounded command output and rollback handling, UI cancellation/error guards, wizard loop bounds, isolated local data support, and expanded token redaction.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

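"Atomic persistence" and "append-only recovery journals" are commonly implemented with a write-to-temp-then-rename pattern and an append-mode log. The Python sketch below shows that general technique under the assumption that state is stored as JSON; the function names are hypothetical and this is not the PR's code.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any


def write_json_atomic(path: Path, data: Any) -> None:
    """Write JSON to a temp file in the same directory, then rename it over `path`."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(data, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_name, path)  # readers see the old file or the new one, never a partial write
    except BaseException:
        Path(tmp_name).unlink(missing_ok=True)
        raise


def append_journal(path: Path, record: dict) -> None:
    """Append one JSON record per line; existing entries are never rewritten."""
    with open(path, "a", encoding="utf-8") as journal:
        journal.write(json.dumps(record) + "\n")
```

Keeping the temp file in the target directory matters because a rename is only atomic within a single filesystem.
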
Ensure setup runs privileged WSL configuration as root so imported base distros with non-root defaults still configure correctly. Align local AppData override handling across SetupEngine, tray setup detection, keepalive, and e2e isolation, and accept numeric setup-state phases written by the new engine.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Pushed a follow-up fix commit: What changed:
Local validation on ARM64:
Serialize SetupEngine tests that mutate process-wide environment variables and make the E2E fixture wait until app.nodes reports an online node with capabilities before allowing tests to proceed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

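The fixture change described here, waiting until app.nodes reports an online node with capabilities, is an instance of polling with a deadline. A minimal Python sketch follows; the `online` and `capabilities` field names are assumptions about the response shape, and `wait_until` is a hypothetical helper rather than the fixture's actual method.

```python
import time
from typing import Any, Callable, Dict, List


def node_is_ready(nodes: List[Dict[str, Any]]) -> bool:
    """True once at least one node is online and reports a non-empty capability list."""
    return any(n.get("online") and n.get("capabilities") for n in nodes)


def wait_until(predicate: Callable[[], bool], timeout: float, interval: float = 0.5,
               clock: Callable[[], float] = time.monotonic,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll `predicate` until it returns True or `timeout` seconds have elapsed."""
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A fixture could call `wait_until(lambda: node_is_ready(fetch_nodes()), timeout=60)`, where `fetch_nodes` is a placeholder for whatever queries app.nodes. Checking for capabilities as well as the online flag avoids racing tests against a node that has connected but not yet registered.
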
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Focus SetupEngine.UI when launched from the tray and briefly make the setup window topmost so user-initiated setup actions are visible. Reuse the tray icon for the setup app so both companion surfaces share branding.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Prefer OPENCLAW_TRAY_DATA_DIR for tray identity checks in isolated runs so startup setup detection sees the per-gateway device tokens written by SetupEngine and does not relaunch setup after a successful install.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Mark cron loading complete when the gateway returns an empty jobs list, and place the Event Stream empty state in the main content row so the page does not look blank when no agent events have arrived yet.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Remove an invalid x:Uid from the Event Stream clear-button TextBlock. The matching resource includes a Content property, which TextBlock does not support, causing AgentEventsPage to throw during XAML load and making navigation appear to do nothing.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Redesign local gateway setup as out-of-process SetupEngine
Problem
The previous onboarding/setup flow was tightly coupled to the tray app, making it hard to test, debug, and maintain. The in-process LocalGatewaySetup class was ~4000 lines with mixed concerns (WSL management, gateway config, UI state, approval flows).

Solution
Replace the monolithic in-process setup with a standalone SetupEngine: a config-driven, transactional pipeline that runs as a separate process (SetupEngine.UI). The tray app launches it and receives results via deep-link protocol.

Architecture
- OpenClaw.SetupEngine: CLI/library with 17 pipeline steps, transaction journal for rollback, retry executor, and structured logging
- OpenClaw.SetupEngine.UI: WinUI3 app with fluent wizard flow (Welcome → Capabilities → Permissions → Progress → Complete)
- default-config.json defines capabilities, WSL distro settings, gateway config; no hardcoded defaults

Key changes
New projects:
- src/OpenClaw.SetupEngine/: pipeline engine (SetupPipeline, SetupSteps, CommandRunner, RetryExecutor, TransactionJournal)
- src/OpenClaw.SetupEngine.UI/: WinUI wizard (6 pages, launched by tray)
- tests/OpenClaw.SetupEngine.Tests/: 66 unit tests
- tests/OpenClaw.E2ETests/: end-to-end setup test via MCP

Removed (~10,400 lines):
- src/OpenClawTray.OnboardingV2/: old V2 onboarding app
- src/OpenClaw.Tray.WinUI/Onboarding/: old in-process wizard, services, flow controller
- src/OpenClaw.Tray.WinUI/Services/LocalGatewaySetup/: monolithic setup (~5,800 lines)
- tests/OpenClawTray.OnboardingV2.Tests/: empty placeholder project

Modified:
- src/OpenClaw.Tray.WinUI/App.xaml.cs: launch SetupEngine.UI instead of in-process onboarding
- src/OpenClaw.Tray.WinUI/Services/StartupSetupState.cs: detect setup-state from SetupEngine
- .github/workflows/ci.yml: add SetupEngine tests, E2E job, replace OnboardingV2

Reliability hardening (Hanselman dual-model review)
Adversarial review with Claude Opus + GPT Codex identified 18 issues; 12 fixed:
Test coverage
E2E validated 5 consecutive runs, avg 153s, no flakiness.