Skip to content

Conversation

@xwhzz
Copy link
Contributor

@xwhzz xwhzz commented Dec 28, 2025

Summary by CodeRabbit

  • Chores
    • Reorganized CI regression workflow to add a dedicated permission-check step and streamline subsequent job flow.
    • Renamed report messaging to "Performance Regression Test Report" for clarity.
    • Improved CI cleanup to remove build/artifact directories during baseline setup.
    • Simplified and consolidated local performance/regression tooling, replacing older scripts with a unified regression runner and tightening test invocation.

✏️ Tip: You can customize this high-level summary in your review settings.

@github-actions
Copy link

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run pre-commit run --all-files in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 28, 2025

📝 Walkthrough

Walkthrough

Refactors perf-regression tooling: moves multi-driver runner into maint/scripts, removes a legacy perf script, simplifies tilelang.testing perf_regression API/exports, and updates GitHub Actions to add a separate permissions-check job that pr-regression depends on.

Changes

Cohort / File(s) Summary
GitHub Actions workflow
\.github/workflows/pr-regression-test-bot.yml
Added permissions-check job that determines commenter permission via GitHub Script; made pr-regression depend on it (needs: [permissions-check]); removed inline permission-check steps; updated PR comment title to "Performance Regression Test Report"; added artifact cleanup (rm -rf tilelang build); switched environment/path usage for regression tests.
Removed legacy perf script
maint/scripts/performance.py
Deleted file containing autotuned matmul benchmark, kernel definition, tuner config, and CLI entry point.
New multi-driver runner (maintenance)
maint/scripts/regression_all.py
New runner: PerfResult dataclass, discovery of regression_*.py drivers, subprocess-based execution, parsing of JSON/text latency markers, aggregated results, optional JSON/tabular output, and CLI entry point.
Test harness update
maint/scripts/test_perf_regression.py
Switched from inline python -c invocations to using a deterministic Path to regression_all.py; added pathlib.Path usage; minor callsite formatting changes.
Testing public API changes
tilelang/testing/__init__.py
Removed regression_all export; added exports for compute-version decorators: requires_cuda_compute_version_ge/gt/eq/lt/le.
Simplified perf regression helper
tilelang/testing/perf_regression.py
Removed multi-driver orchestration (deleted _parse_table, _examples_root, _discover_bench_files, regression_all); eliminated external deps (tabulate, tqdm) and subprocess scaffolding; retained core latency collection and single-result JSON/text emission (_emit_results).

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Commenter as PR Commenter
  participant GitHub as GitHub PR
  participant Actions as GitHub Actions Workflow
  participant PermCheck as permissions-check job
  participant PRReg as pr-regression job

  Note over Commenter,GitHub: A comment triggers the workflow
  Commenter->>GitHub: Add comment / trigger
  GitHub->>Actions: Start pr-regression workflow
  Actions->>PermCheck: Run permissions-check (GitHub Script)
  PermCheck-->>Actions: Emit permission result (output)
  Note right of Actions: permission output stored in job outputs
  Actions->>PRReg: pr-regression starts with needs:[permissions-check]
  PRReg->>PRReg: Run regression steps (uses emitted permission)
  PRReg-->>GitHub: Post "Performance Regression Test Report" comment/artifact
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • LeiWang1999

Poem

🐰 I hopped through scripts and moved a few,
Removed old kernels, made runners new,
Permissions checked before the test,
Clean reports now stand the rest,
A rabbit's cheer — the infra's best!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately reflects the main change: refactoring regression tests to use the installed package (via regression_all.py in maint/scripts/) instead of relying on source directory imports, as evidenced by the removal of regression_all from tilelang/testing, deletion of performance.py, and updates to test infrastructure.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7ebd70b and a19329b.

📒 Files selected for processing (1)
  • .github/workflows/pr-regression-test-bot.yml
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-10T13:29:29.347Z
Learnt from: XuehaiPan
Repo: tile-ai/tilelang PR: 973
File: .github/workflows/ci.yml:13-15
Timestamp: 2025-10-10T13:29:29.347Z
Learning: In .github/workflows/ci.yml for tilelang (GitHub Actions), actions/cachev4 and setup-python’s cache feature require GITHUB_TOKEN with actions: write to save caches; with a permissions block that only sets contents: read, unspecified actions permission becomes none, so caches will restore but not save.

Applied to files:

  • .github/workflows/pr-regression-test-bot.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
  • GitHub Check: Test for Python 3.12 with CUDA-12.8 (on self-hosted-nvidia)
  • GitHub Check: Test for Python 3.12 with Nightly-ROCm-7.1 (on self-hosted-amd)
🔇 Additional comments (4)
.github/workflows/pr-regression-test-bot.yml (4)

34-53: LGTM! Good security pattern for gating expensive operations.

The dedicated permissions-check job properly validates that only users with admin/maintain/write permissions can trigger regression tests. Moving this check into a separate job that gates the expensive pr-regression job is a clean optimization.


60-60: LGTM! Proper job dependency for authorization gate.

The needs: [permissions-check] dependency ensures the expensive regression test only runs after authorization is confirmed.


225-230: LGTM! Activation path correctly updated.

The activation of test_regression environment (line 227) aligns with the new venv created at line 194. The test script correctly uses OLD_PYTHON and NEW_PYTHON to reference the separate installed environments for comparison.


256-266: LGTM! Terminology updated for consistency.

The message text now uses "Performance Regression Test Report" instead of "Performance Benchmark Report", which aligns better with the workflow's name and purpose.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d70cf36 and 7ebd70b.

📒 Files selected for processing (6)
  • .github/workflows/pr-regression-test-bot.yml
  • maint/scripts/performance.py
  • maint/scripts/regression_all.py
  • maint/scripts/test_perf_regression.py
  • tilelang/testing/__init__.py
  • tilelang/testing/perf_regression.py
💤 Files with no reviewable changes (2)
  • tilelang/testing/perf_regression.py
  • maint/scripts/performance.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-10T13:29:29.347Z
Learnt from: XuehaiPan
Repo: tile-ai/tilelang PR: 973
File: .github/workflows/ci.yml:13-15
Timestamp: 2025-10-10T13:29:29.347Z
Learning: In .github/workflows/ci.yml for tilelang (GitHub Actions), actions/cachev4 and setup-python’s cache feature require GITHUB_TOKEN with actions: write to save caches; with a permissions block that only sets contents: read, unspecified actions permission becomes none, so caches will restore but not save.

Applied to files:

  • .github/workflows/pr-regression-test-bot.yml
🧬 Code graph analysis (2)
maint/scripts/regression_all.py (1)
tilelang/testing/perf_regression.py (1)
  • PerfResult (13-15)
tilelang/testing/__init__.py (1)
tilelang/testing/perf_regression.py (2)
  • process_func (49-66)
  • regression (69-88)
🪛 Ruff (0.14.10)
maint/scripts/regression_all.py

12-12: Do not catch blind exception: Exception

(BLE001)


19-19: Unused function argument: kwargs

(ARG001)


88-88: Avoid specifying long messages outside the exception class

(TRY003)


92-92: Avoid specifying long messages outside the exception class

(TRY003)


98-98: subprocess call: check for execution of untrusted input

(S603)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Test for Python 3.12 with Nightly-ROCm-7.1 (on self-hosted-amd)
  • GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
🔇 Additional comments (14)
maint/scripts/test_perf_regression.py (2)

10-10: LGTM!

The Path import is correctly added to support the file path construction for the regression script.


184-187: LGTM!

The change correctly executes regression_all.py as a script file rather than importing from the installed package. This ensures consistent behavior between OLD and NEW Python interpreters by using the same script path from the repository.

.github/workflows/pr-regression-test-bot.yml (4)

34-52: LGTM! Authorization gate properly implemented.

The permissions-check job correctly verifies collaborator permissions before allowing the expensive regression test to run. The check for admin, maintain, or write permissions is appropriate for protecting self-hosted runner resources.


60-60: LGTM!

The needs: [permissions-check] dependency correctly ensures authorization is verified before the performance regression job runs.


192-192: Critical fix for ensuring installed package usage.

Removing tilelang source directory and build artifacts after installing the baseline version is essential. This prevents Python from inadvertently importing from the source directory instead of the installed package, which is the core issue this PR addresses.


253-254: LGTM!

The title accurately reflects the purpose of the workflow as a regression test rather than a general benchmark.

tilelang/testing/__init__.py (1)

11-11: LGTM!

The import correctly removes regression_all as it has been relocated to maint/scripts/regression_all.py as a standalone script, reducing the public API surface of the testing module.

maint/scripts/regression_all.py (7)

10-20: LGTM on optional dependency handling.

The try/except pattern for optional dependencies (tabulate, tqdm) is appropriate for a standalone script. The tqdm fallback correctly returns the iterable unchanged.


23-31: Intentional duplication for standalone script.

The PerfResult dataclass duplicates the one in tilelang/testing/perf_regression.py, which is appropriate since this script must be self-contained and cannot import from the installed package (that's the whole point of the fix). The frozen=True addition is a nice improvement for immutability.


34-63: LGTM!

The parsing logic is well-designed: it prioritizes the structured JSON marker (searching from the end for efficiency) and falls back to best-effort text parsing. Error handling for malformed values is appropriate.


71-77: LGTM!

File discovery is correctly implemented with proper filtering for actual files and exclusion of __init__.py.


97-120: LGTM on subprocess execution logic.

The subprocess calls execute benchmark files discovered from the repository's examples directory using sys.executable. The S603 static analysis warning about untrusted input is a false positive here since the paths come from rglob on a known repository directory, not user input.

The "first writer wins" deduplication strategy and partial failure handling are well-designed for robustness.


132-145: LGTM!

The output formatting is flexible with JSON and text modes, and gracefully falls back to a simple ASCII table when tabulate is unavailable.


148-149: LGTM!

Standard entry point pattern for script execution.

Comment on lines +66 to +68
def _examples_root() -> Path:
# repo_root/tilelang/testing/perf_regression.py -> repo_root
return Path(__file__).resolve().parents[2] / "examples"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Stale comment referencing old file location.

The comment mentions repo_root/tilelang/testing/perf_regression.py but this file is at maint/scripts/regression_all.py. The path calculation itself is correct (parents[2] navigates from maint/scripts/regression_all.pymaint/scriptsmaintrepo_root).

🔎 Proposed fix
 def _examples_root() -> Path:
-    # repo_root/tilelang/testing/perf_regression.py -> repo_root
+    # maint/scripts/regression_all.py -> repo_root
     return Path(__file__).resolve().parents[2] / "examples"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def _examples_root() -> Path:
# repo_root/tilelang/testing/perf_regression.py -> repo_root
return Path(__file__).resolve().parents[2] / "examples"
def _examples_root() -> Path:
# maint/scripts/regression_all.py -> repo_root
return Path(__file__).resolve().parents[2] / "examples"
🤖 Prompt for AI Agents
In maint/scripts/regression_all.py around lines 66 to 68, the inline comment
incorrectly references repo_root/tilelang/testing/perf_regression.py even though
the code is in maint/scripts/regression_all.py; update the comment to reflect
the actual file path used for the parents traversal (e.g.,
"repo_root/maint/scripts/regression_all.py -> repo_root") or remove the stale
comment entirely so it matches the code's current location and intent.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
.github/workflows/pr-regression-test-bot.yml (2)

181-196: Workflow must checkout back to PR branch before running tests.

The workflow checks out main at line 185 but never switches back to the PR branch. At line 230, the test script ./maint/scripts/test_perf_regression.py runs from the main branch context.

This is a critical bug: The PR refactors the test infrastructure by introducing a new maint/scripts/regression_all.py script and modifying test_perf_regression.py to use it. However, regression_all.py does not exist in main, and main's version of test_perf_regression.py still uses the old inline approach. The refactored test code in the PR is never actually tested.

Add git checkout - after line 191 to return to the PR branch before running tests at line 230.


8-11: Add actions: write permission to save caches from setup-uv.

The workflow uses astral-sh/setup-uv@v7 with caching enabled (lines 154–169), but the permissions block (lines 8–11) lacks actions: write. This permission is required for setup-uv to save caches; without it, cache upload operations will fail and caches will not persist, causing slower CI runs over time as they remain cold.

Add actions: write to the permissions block:

permissions:
  actions: write
  contents: read
  issues: write
  pull-requests: write
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7ebd70b and a19329b.

📒 Files selected for processing (1)
  • .github/workflows/pr-regression-test-bot.yml
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-10T13:29:29.347Z
Learnt from: XuehaiPan
Repo: tile-ai/tilelang PR: 973
File: .github/workflows/ci.yml:13-15
Timestamp: 2025-10-10T13:29:29.347Z
Learning: In .github/workflows/ci.yml for tilelang (GitHub Actions), actions/cachev4 and setup-python’s cache feature require GITHUB_TOKEN with actions: write to save caches; with a permissions block that only sets contents: read, unspecified actions permission becomes none, so caches will restore but not save.

Applied to files:

  • .github/workflows/pr-regression-test-bot.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
  • GitHub Check: Test for Python 3.12 with CUDA-12.8 (on self-hosted-nvidia)
  • GitHub Check: Test for Python 3.12 with Nightly-ROCm-7.1 (on self-hosted-amd)
🔇 Additional comments (4)
.github/workflows/pr-regression-test-bot.yml (4)

34-53: LGTM! Good security pattern for gating expensive operations.

The dedicated permissions-check job properly validates that only users with admin/maintain/write permissions can trigger regression tests. Moving this check into a separate job that gates the expensive pr-regression job is a clean optimization.


60-60: LGTM! Proper job dependency for authorization gate.

The needs: [permissions-check] dependency ensures the expensive regression test only runs after authorization is confirmed.


225-230: LGTM! Activation path correctly updated.

The activation of test_regression environment (line 227) aligns with the new venv created at line 194. The test script correctly uses OLD_PYTHON and NEW_PYTHON to reference the separate installed environments for comparison.


256-266: LGTM! Terminology updated for consistency.

The message text now uses "Performance Regression Test Report" instead of "Performance Benchmark Report", which aligns better with the workflow's name and purpose.

@LeiWang1999
Copy link
Member

Would be better for us to enable ccache as we should build from source twice?

@LeiWang1999
Copy link
Member

likely we will automatically enable ccache, lgtm.

@LeiWang1999 LeiWang1999 merged commit 23ede42 into tile-ai:main Dec 28, 2025
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants