Safari Performance Test integration + Cleanup #2202
base: main
Conversation
Pull Request Overview
Integrates Safari performance testing into the internal PerformanceTest tool, adds adaptive iterations, richer error handling, and UI to compare DuckDuckGo vs Safari results.
- Adds a Node/WebDriver-based Safari runner with adaptive iteration logic and temp runtime setup
- Updates SwiftUI views and view models to display results, comparison metrics, and export to JSON
- Refactors error handling and parsing, and extends tests for error cases
Reviewed Changes
Copilot reviewed 22 out of 23 changed files in this pull request and generated 7 comments.
File | Description |
---|---|
macOS/LocalPackages/PerformanceTest/Tests/PerformanceTestTests/SafariPerformanceTestViewModelTests.swift | Updates tests to new properties and adds error case assertions for Safari runner |
macOS/LocalPackages/PerformanceTest/Tests/PerformanceTestTests/Mocks/MockSafariTestExecutor.swift | Mock executor now writes a mock JSON result to disk for parsing |
macOS/LocalPackages/PerformanceTest/Sources/PerformanceTest/UI/SafariPerformanceTestWindowView.swift | Replaces completion path with a detailed results view using parsed metrics |
macOS/LocalPackages/PerformanceTest/Sources/PerformanceTest/UI/SafariPerformanceTestViewModel.swift | Refactors test execution to parse JSON results, unifies error handling, and adds stat view selection |
macOS/LocalPackages/PerformanceTest/Sources/PerformanceTest/UI/PerformanceTestWindowView.swift | Adds comparison UI (DDG vs Safari), error banners, and export action |
macOS/LocalPackages/PerformanceTest/Sources/PerformanceTest/UI/PerformanceTestWindowController.swift | Injects tab lifecycle callbacks to create/close clean tabs during tests |
macOS/LocalPackages/PerformanceTest/Sources/PerformanceTest/UI/PerformanceTestViewModel.swift | Orchestrates end-to-end comparison run, including overlays, tab lifecycle, logging, and JSON export |
macOS/LocalPackages/PerformanceTest/Sources/PerformanceTest/UI/PerformanceTestConstants.swift | Updates defaults (iterations, timeout) and adds new stat view keys (IQR) |
macOS/LocalPackages/PerformanceTest/Sources/PerformanceTest/UI/ComparisonMetricBox.swift | New UI component showing side-by-side metrics and significance indication |
macOS/LocalPackages/PerformanceTest/Sources/PerformanceTest/SafariTestRunner/src/services/WebDriverService.js | Improves Safari WebDriver setup and error guidance |
macOS/LocalPackages/PerformanceTest/Sources/PerformanceTest/SafariTestRunner/src/core/TestExecutor.js | Adds warmup navigation and waiting for readiness |
macOS/LocalPackages/PerformanceTest/Sources/PerformanceTest/SafariTestRunner/src/core/PerformanceTestRunner.js | Implements adaptive iterations with consistency checks (IQR/median and P95/P50) |
macOS/LocalPackages/PerformanceTest/Sources/PerformanceTest/SafariTestRunner/src/core/Configuration.js | Adds maxIterations and extends default timeout |
macOS/LocalPackages/PerformanceTest/Sources/PerformanceTest/Models/TestResult.swift | Removes unused display/computed helpers |
macOS/LocalPackages/PerformanceTest/Sources/PerformanceTest/Models/PerformanceMetrics.swift | Removes computed scoring/formatting helpers |
macOS/LocalPackages/PerformanceTest/Sources/PerformanceTest/Models/DetailedPerformanceMetrics.swift | Removes ancillary computed helpers; retains performance scoring |
macOS/LocalPackages/PerformanceTest/Sources/PerformanceTest/Models/BrowserComparisonResults.swift | New model for comparing browsers, computing differences, significance, and JSON export |
macOS/LocalPackages/PerformanceTest/Sources/PerformanceTest/Core/SitePerformanceTester.swift | Runs DDG tests with cache clearing, warmups, adaptive iterations, and outlier filtering |
macOS/LocalPackages/PerformanceTest/Sources/PerformanceTest/Core/SafariTestRunner.swift | Runs Safari Node runner from a temp env, finds node/npm, parses progress, and captures error output |
macOS/LocalPackages/PerformanceTest/Sources/PerformanceTest/Core/SafariResultsParser.swift | New parser for Safari JSON results into PerformanceTestResults |
macOS/DuckDuckGo/Menus/MainMenu.swift | Updates menu title to reflect DDG vs Safari test; removes separate Safari-only option |
macOS/DuckDuckGo/MainWindow/MainViewController.swift | Wires performance test window with tab creation/closure for clean runs |
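The adaptive-iteration logic in `PerformanceTestRunner.js` (stop early once results look consistent, judged by IQR/median and P95/P50 ratios) can be sketched roughly as follows. The thresholds, minimum sample size, and helper names here are illustrative assumptions, not values taken from the PR:

```javascript
// Nearest-rank percentile on a pre-sorted array (illustrative helper).
function percentile(sorted, p) {
  const idx = Math.min(sorted.length - 1, Math.floor(p * sorted.length));
  return sorted[idx];
}

// Consistency check: samples are "stable enough" to stop iterating early
// when both spread ratios fall under the (assumed) thresholds.
function isConsistent(samples, maxIqrRatio = 0.2, maxP95Ratio = 1.5) {
  if (samples.length < 5) return false; // need a minimum sample size first
  const sorted = [...samples].sort((a, b) => a - b);
  const median = percentile(sorted, 0.5);
  const iqr = percentile(sorted, 0.75) - percentile(sorted, 0.25);
  const p95 = percentile(sorted, 0.95);
  return iqr / median <= maxIqrRatio && p95 / median <= maxP95Ratio;
}
```

A runner built this way keeps iterating up to `maxIterations` and breaks out of the loop as soon as `isConsistent(loadTimes)` returns true.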
```swift
func testErrorDetails_npmInstallFailed_returnsCorrectMessage() async {
    let url = URL(string: "https://example.com")!
    let viewModel = SafariPerformanceTestViewModel(url: url, runnerFactory: { url, iterations in
        let mock = MockSafariTestExecutor(url: url, iterations: iterations)
        mock.shouldThrowError = SafariTestRunner.RunnerError.npmInstallFailed
        return mock
    })

    await viewModel.runTest()

    XCTAssertEqual(viewModel.errorMessage, "Failed to install npm dependencies. Check Console.app for details.")
    XCTAssertEqual(viewModel.statusText, "Error: npm install failed")
}
```
Copilot AI · Oct 15, 2025
RunnerError.npmInstallFailed no longer exists; the new cases are npmNotFound and dependenciesInstallFailed(String). Update the test to use .dependenciesInstallFailed with a sample error string, and assert against the new message/status returned by errorDetails(for:). For example, expect errorMessage 'Failed to install test dependencies:\n\n' and statusText 'Error: Dependency installation failed'.
Suggested change:

```swift
func testErrorDetails_dependenciesInstallFailed_returnsCorrectMessage() async {
    let url = URL(string: "https://example.com")!
    let sampleErrorOutput = "npm ERR! some error output"
    let viewModel = SafariPerformanceTestViewModel(url: url, runnerFactory: { url, iterations in
        let mock = MockSafariTestExecutor(url: url, iterations: iterations)
        mock.shouldThrowError = SafariTestRunner.RunnerError.dependenciesInstallFailed(sampleErrorOutput)
        return mock
    })

    await viewModel.runTest()

    XCTAssertEqual(viewModel.errorMessage, "Failed to install test dependencies:\n\n\(sampleErrorOutput)")
    XCTAssertEqual(viewModel.statusText, "Error: Dependency installation failed")
}
```
```swift
await viewModel.runTest()

XCTAssertEqual(viewModel.errorMessage, "npm not found. Please install npm to run Safari performance tests.")
```
Copilot AI · Oct 15, 2025
The expected errorMessage doesn't match the ViewModel's mapping. SafariPerformanceTestViewModel now returns 'npm not found. Please install Node.js (which includes npm) to run Safari performance tests.' Update the assertion to match the new text.
Suggested change:

```swift
XCTAssertEqual(viewModel.errorMessage, "npm not found. Please install Node.js (which includes npm) to run Safari performance tests.")
```
```swift
logger.debug("Opening fresh tab with original URL to clean up test state")
guard let newWebView = await createNewTab() else { return }

webView = newWebView
logger.debug("Created fresh tab, closing old test tab")
await closeTab()
```
Copilot AI · Oct 15, 2025
This sequence closes the currently selected tab, which (after createNewTab selects it) is the new tab, leaving the old test tab open. Reverse the order or explicitly target the old tab so you close the test tab and keep the fresh one. Example: call closeTab() first, then createNewTab(selected: true) and update webView.
Suggested change:

```swift
logger.debug("Closing old test tab before opening fresh tab to clean up test state")
await closeTab()
guard let newWebView = await createNewTab() else { return }
webView = newWebView
logger.debug("Created fresh tab after closing old test tab")
```
```swift
public func timeBasedWinner(_ duckduckgoValue: Double, _ safariValue: Double, ddgStdDev: Double = 0, safariStdDev: Double = 0) -> BrowserWinner {
    // Check if the difference is statistically significant
    let diff = abs(duckduckgoValue - safariValue)

    // If we have standard deviation data, use it
    if ddgStdDev > 0 || safariStdDev > 0 {
        // Use proper quadrature for combining standard deviations: sqrt(σ1² + σ2²)
        let combinedStdDev = sqrt(ddgStdDev * ddgStdDev + safariStdDev * safariStdDev)

        // Use 1.96 multiplier for 95% confidence interval (two-tailed test)
        let confidenceMargin = 1.96 * combinedStdDev

        // Difference must be greater than confidence margin to be significant
        if diff < confidenceMargin {
            return .tie
        }
    } else {
```
Copilot AI · Oct 15, 2025
The parameters passed here are IQR values (from the UI), but the calculation treats them as standard deviations, which misstates significance. Convert IQR to an approximate standard deviation (σ ≈ IQR/1.349) before combining, or change the comparison to use a threshold derived from IQR directly. Apply the same fix in sizeBasedWinner.
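A minimal sketch of the fix this comment suggests, in the runner's JavaScript for illustration: treat the incoming spread values as IQRs and convert each to an approximate standard deviation (σ ≈ IQR / 1.349, which assumes roughly normal data) before combining in quadrature and applying the 1.96 (95% CI) margin. The function shape and return values here are assumptions, not the PR's actual API:

```javascript
// Hypothetical significance check taking IQRs rather than std devs.
function timeBasedWinner(ddgValue, safariValue, ddgIQR = 0, safariIQR = 0) {
  const diff = Math.abs(ddgValue - safariValue);
  if (ddgIQR > 0 || safariIQR > 0) {
    // Convert IQR to an approximate standard deviation: sigma ≈ IQR / 1.349.
    const ddgStdDev = ddgIQR / 1.349;
    const safariStdDev = safariIQR / 1.349;
    // Combine in quadrature, then apply the 95% two-tailed margin.
    const combined = Math.sqrt(ddgStdDev ** 2 + safariStdDev ** 2);
    if (diff < 1.96 * combined) return "tie";
  }
  return ddgValue < safariValue ? "duckduckgo" : "safari";
}
```

The same conversion would apply to `sizeBasedWinner`, as the comment notes.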
```swift
loadTimes: loadTimes,
detailedMetrics: collectedMetrics,
failedAttempts: failedAttempts,
iterations: jsonResult.testConfiguration.iterations,
```
Copilot AI · Oct 15, 2025
iterations should reflect actual completed iterations (e.g., successfulIterations.count) rather than the configured iterations. Using the configured count can be misleading when adaptive stopping or failures occur.
Suggested change:

```swift
iterations: successfulIterations.count,
```
```swift
// Check for consistency metrics
if cleanLine.contains("Consistency:") {
    // Extract just the metrics part
    if let metricsRange = cleanLine.range(of: "Consistency: ") {
        let metrics = String(cleanLine[metricsRange.upperBound...])
        return (nil, metrics)
    }
}
```
Copilot AI · Oct 15, 2025
The Node runner logs 'Consistency metrics: ...', not 'Consistency:'. Update the substring check to match the actual log prefix so these lines surface cleanly in status without falling back to the full raw line.
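The corrected prefix match can be sketched like this (in JavaScript for illustration; the actual parser is Swift, and the example log payload is an assumption based on the review comment, not copied from the PR):

```javascript
// Extract the metrics payload from a runner log line such as
// "Consistency metrics: IQR/median=0.12, P95/P50=1.4".
// Returns null for lines without the prefix so callers can fall back.
function extractConsistencyMetrics(line) {
  const prefix = "Consistency metrics: ";
  const index = line.indexOf(prefix);
  if (index === -1) return null;
  return line.slice(index + prefix.length);
}
```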
```swift
progress = Double(iteration) / Double(total)
}
```
Copilot AI · Oct 15, 2025
Progress is being reset to 0 on status-only updates because parseProgressLog emits iteration=0 for non-iteration lines. Only update progress when iteration > 0 to avoid regressions during status messages.
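The guard this comment asks for amounts to ignoring iteration 0 when updating progress. A small sketch (illustrative names, written in the runner's JavaScript rather than the view model's Swift):

```javascript
// Only advance progress for real iteration updates; status-only log lines
// (parsed as iteration 0) keep the last known progress instead of
// resetting the bar to zero.
function updateProgress(current, iteration, total) {
  if (iteration > 0 && total > 0) {
    return iteration / total;
  }
  return current; // status-only update: keep previous progress
}
```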
Task/Issue URL: https://app.asana.com/1/137249556945/task/1211159858495831
Tech Design URL:
CC:
Description
Testing Steps
Impact and Risks
N/A (internal-only tool)
Notes to Reviewer
PR is a bit big, as I removed plenty of dead code and refactored some of the perf logic.
Internal references:
Definition of Done | Engineering Expectations | Tech Design Template
Note
Adds Safari-based performance testing and a DDG vs Safari comparison workflow with adaptive iterations, robust error handling, JSON export, and UI updates.
- `SafariTestRunner` with temp runtime setup, npm/node discovery, dependency install, stderr capture, detailed `LocalizedError` cases, progress parsing, and `maxIterations` support.
- `SitePerformanceTester` updated to use per-iteration clean tabs, JS warmup, scroll-driven LCP triggers, the webView's data store, consistency/IQR-based early stop, and richer metric collection/helpers.
- New `SafariResultsParser`, `BrowserComparisonResults` (with significance/winner logic and JSON export), and metric utilities (IQR/median/outlier filtering).
- Node runner: adds `maxIterations`, increases timeouts, and adds adaptive iterations with consistency checks, warmup navigation, improved logging, and window sizing; `WebDriverService` tweaks and guidance for Remote Automation.
- New UI (`ComparisonMetricBox`) and updated `PerformanceTestWindowView` to show DDG vs Safari metrics, browser progress, max-iteration picker, error display, and export-to-JSON.
- `PerformanceTestViewModel` orchestrates the DDG test, the Safari test (via runner), and the cleanup/new-tab flow; adds progress/status/error handling.
- `MainViewController.testCurrentSitePerformance` now passes `createNewTab`/`closeTab` to `PerformanceTestWindowController` for clean-tab runs; removes the separate Safari-only path. Menu item is now `Test Site Performance (DDG vs Safari)`.

Written by Cursor Bugbot for commit 5bf4b57. This will update automatically on new commits.