
Conversation

@LeiWang1999 LeiWang1999 commented Dec 28, 2025

This pull request introduces a new shell script, run_perf_regression.sh, designed to automate performance regression testing between the current branch and the latest upstream main. The script manages environment setup, virtual environments, and result reporting, making it easier to compare performance metrics across code changes.

Key features of the new performance regression script:

Performance Regression Automation

  • Adds maint/scripts/run_perf_regression.sh, a shell script that builds and tests both the current branch and upstream main in isolated Python virtual environments, handles the required environment variables, and generates markdown and plot results for performance comparison.

Environment and Build Management

  • Implements logic to safely handle and restore local build artifacts, manage git remotes, and stash uncommitted changes to ensure reproducible results without interfering with the developer's workspace.
  • Supports skipping builds for either the new or old environment via environment variables, improving efficiency during repeated runs (see the usage sketch below).

Result Reporting

  • Generates markdown and PNG plot outputs summarizing the regression results, and prints the markdown summary to the console for immediate review.
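
A minimal usage sketch (the build-skip variable names are assumptions for illustration, not necessarily the names the script defines):

```bash
# Full run: builds both the current branch and upstream main in fresh virtual
# environments, runs the regression tests, and writes markdown + PNG results
# under .perf_regression/.
bash maint/scripts/run_perf_regression.sh

# Hypothetical repeated run that reuses previously built environments; the
# actual skip-build variable names are defined inside the script itself.
SKIP_BUILD_NEW=1 SKIP_BUILD_OLD=1 bash maint/scripts/run_perf_regression.sh
```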

Summary by CodeRabbit

  • Tests

    • Added performance regression testing infrastructure to detect and monitor performance changes across versions with automated test execution.
  • Chores

    • Updated development environment configuration and tooling to support performance regression testing with isolated environments and detailed result reporting.


@github-actions

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run `pre-commit run --all-files` in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀


coderabbitai bot commented Dec 28, 2025

📝 Walkthrough

This PR adds a new performance regression testing workflow by introducing a Bash script that orchestrates building isolated Python environments for old and new versions, running regression tests, and generating results. A .gitignore entry is added to exclude regression test artifacts.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Build Artifact Exclusion**<br>`.gitignore` | Adds an ignore pattern for the `.perf_regression/` directory to exclude performance regression test outputs and artifacts from version control. |
| **Performance Regression Orchestration**<br>`maint/scripts/run_perf_regression.sh` | New Bash script that automates a multi-stage performance regression workflow: configures environment defaults, manages git remotes and stashing, preserves the build directory, creates isolated Python environments for both upstream main (OLD) and the current branch (NEW), runs `test_perf_regression.py` with both versions, generates Markdown results and PNG plots, and handles cleanup and restoration of the original state. |

Sequence Diagram

sequenceDiagram
    participant User
    participant Script as run_perf_regression.sh
    participant Git
    participant FileSystem as File System
    participant OldEnv as OLD Env<br/>(main)
    participant NewEnv as NEW Env<br/>(current)
    participant TestRunner as test_perf_regression.py

    User->>Script: Execute run_perf_regression.sh
    Script->>FileSystem: Backup build directory
    Script->>Git: Stash uncommitted changes
    Script->>Git: Validate/upgrade remote to upstream
    
    rect rgb(200, 220, 255)
    Note over Script,NewEnv: Build NEW Environment (current branch)
    Script->>FileSystem: Create NEW venv
    Script->>NewEnv: Install requirements-test.txt
    Script->>NewEnv: Install project
    end
    
    rect rgb(200, 220, 255)
    Note over Script,OldEnv: Build OLD Environment (upstream main)
    Script->>Git: Fetch upstream main
    Script->>Git: Checkout upstream main
    Script->>FileSystem: Create OLD venv
    Script->>OldEnv: Install requirements-test.txt
    Script->>OldEnv: Install project from main
    end
    
    rect rgb(220, 240, 220)
    Note over Script,TestRunner: Run Performance Regression Tests
    Script->>TestRunner: Execute test with OLD python path
    TestRunner->>OldEnv: Run tests (baseline)
    TestRunner-->>Script: Return OLD results
    Script->>TestRunner: Execute test with NEW python path
    TestRunner->>NewEnv: Run tests (current)
    TestRunner-->>Script: Return NEW results
    end
    
    Script->>FileSystem: Generate Markdown results
    Script->>FileSystem: Generate PNG plot
    Script->>User: Print results
    
    rect rgb(240, 220, 220)
    Note over Script,FileSystem: Cleanup & Restore
    Script->>Git: Checkout original branch
    Script->>Git: Restore stashed changes
    Script->>Git: Reinitialize submodules
    Script->>FileSystem: Restore backed-up build directory
    Script->>FileSystem: Clean temporary working directory
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

  • tile-ai/tilelang#1550: Modifies and restructures the perf-regression Python tooling and test_perf_regression.py, which is directly invoked by the new orchestration script in this PR.

Poem

🐰 A script hops in, both spry and keen,
To test the old against the new, pristine!
Virtual worlds both left and right,
Comparing speeds with metrics bright!

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The pull request title directly and clearly describes the main change: introducing a regression test script to help benchmark performance regressions in a local environment. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (3)
maint/scripts/run_perf_regression.sh (3)

57-70: Consider non-interactive mode handling.

The read prompt will hang or fail when the script is run non-interactively (e.g., piped input, CI environment without TTY). Consider adding a TTY check or an environment variable to skip the prompt.

🔎 Suggested enhancement
 # Check for uncommitted changes
 if [[ -n "$(git status --porcelain)" ]]; then
     echo "WARNING: You have uncommitted changes. They will be stashed."
-    read -p "Continue? [y/N] " -n 1 -r
-    echo
-    if [[ ! $REPLY =~ ^[Yy]$ ]]; then
-        echo "Aborted."
-        exit 1
+    if [[ -t 0 ]]; then
+        read -p "Continue? [y/N] " -n 1 -r
+        echo
+        if [[ ! $REPLY =~ ^[Yy]$ ]]; then
+            echo "Aborted."
+            exit 1
+        fi
+    else
+        echo "Non-interactive mode: proceeding with stash."
     fi
     STASHED=1
     git stash push -m "perf_regression_temp_stash"

127-131: Document uv as a required dependency.

The script relies on uv (an alternative Python package manager) but doesn't check for its presence or document it as a prerequisite. Users without uv installed will see a confusing error.

🔎 Suggested enhancement: add prerequisite check

Add this check near the beginning of the script (after line 16):

# Check prerequisites
if ! command -v uv &> /dev/null; then
    echo "ERROR: 'uv' is required but not installed."
    echo "Install with: curl -LsSf https://astral.sh/uv/install.sh | sh"
    exit 1
fi

1-13: Consider adding set -o pipefail for robustness.

With only set -e, a failure in the middle of a pipeline goes unnoticed, because a pipeline's exit status is that of its last command. Adding pipefail ensures the script exits if any command in a pipeline fails.

🔎 Suggested enhancement
-set -e
+set -euo pipefail

Note: Adding -u (nounset) would require ensuring all variables are initialized before use, which may need additional adjustments.
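
As a standalone illustration of the failure mode pipefail guards against (not taken from the script):

```bash
#!/usr/bin/env bash
set -e
# Without pipefail, a pipeline's status is that of its last command:
# grep fails on the missing file, but sort succeeds, so execution continues.
grep pattern /no/such/file | sort
echo "still running"

set -o pipefail
# Now the pipeline reports grep's non-zero status, and set -e aborts here.
grep pattern /no/such/file | sort
echo "never reached"
```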

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b6ace13 and e7c5a57.

📒 Files selected for processing (2)
  • .gitignore
  • maint/scripts/run_perf_regression.sh
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Test for Python 3.12 with Nightly-ROCm-7.1 (on self-hosted-amd)
  • GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
  • GitHub Check: Test for Python 3.12 with CUDA-12.8 (on self-hosted-nvidia)
🔇 Additional comments (5)
.gitignore (1)

123-125: LGTM!

The .perf_regression/ ignore entry correctly pairs with the new regression script's working directory, following the existing pattern of section comments.
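
For reference, a sketch of the shape of such an entry (the comment text is assumed; only the `.perf_regression/` pattern is confirmed by this PR):

```gitignore
# Performance regression artifacts (assumed section comment)
.perf_regression/
```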

maint/scripts/run_perf_regression.sh (4)

80-98: LGTM!

The cleanup function is well-structured with proper error suppression for robustness. The trap EXIT ensures cleanup runs on both success and failure, and the build directory restore logic correctly handles the backup scenario.
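
A generic sketch of the trap-on-EXIT pattern this refers to; apart from STASHED, the variable and path names are illustrative, not the script's actual ones:

```bash
cleanup() {
    # Best-effort restoration; errors are suppressed so cleanup never aborts.
    git checkout "${ORIGINAL_BRANCH}" 2>/dev/null || true
    if [[ "${STASHED:-0}" == "1" ]]; then
        git stash pop 2>/dev/null || true
    fi
    if [[ -d "${BUILD_BACKUP_DIR:-}" ]]; then
        mv "${BUILD_BACKUP_DIR}" build 2>/dev/null || true
    fi
}
trap cleanup EXIT  # runs on normal exit and on failure under set -e
```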


147-149: LGTM!

The git clean exclusions correctly preserve the work directory (.perf_regression/), cache, and build backup. This ensures aggressive cleanup doesn't accidentally destroy important artifacts.
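
Illustrative only; the script's exact flags and excluded paths may differ:

```bash
# Aggressive cleanup that spares the regression work directory, the cache,
# and the build backup (paths other than .perf_regression/ are assumptions).
git clean -xdf \
    -e .perf_regression/ \
    -e .cache/ \
    -e build_backup/
```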


188-195: LGTM!

Clean output formatting with paths and inline display of the markdown results. Good UX for developers running the script.


180-184: No issue here. The test script test_perf_regression.py exists in maint/scripts/ at the location referenced by ${SCRIPT_DIR}.

Likely an incorrect or invalid review comment.

@LeiWang1999 LeiWang1999 merged commit f57956d into tile-ai:main Dec 28, 2025
6 of 7 checks passed