Conversation

@clouds56 clouds56 commented Dec 24, 2025

Sorry, the last PR #1527 was closed by mistake and my branch was also lost, so I prepared a new PR.

Summary by CodeRabbit

  • Improvements

    • Further improved CUDA installation path detection: now probes NVIDIA-related Python packages and adds platform-specific fallbacks on Windows and Unix-like systems to more reliably locate CUDA installations when automatic detection previously failed.
  • New Features

    • Added a new optional "nvcc" install group to simplify installing CUDA tooling, including nvcc and related helper packages.


@github-actions

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run pre-commit run --all-files in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀

coderabbitai bot commented Dec 24, 2025

📝 Walkthrough

Walkthrough

Adds two CUDA-home discovery fallbacks in tilelang/env.py (inspect NVIDIA-related Python packages; platform-specific filesystem paths) and a new optional nvcc dependency group in pyproject.toml. No public API signature changes.

Changes

CUDA Home Discovery Enhancement (tilelang/env.py)
Adds Guess #3: detect CUDA home by inspecting NVIDIA-related Python packages (e.g., nvidia.cu13, nvidia.cu12, nvidia.cu11, nvidia.cuda_nvcc) via importlib.util.find_spec and use the package location when applicable. Adds Guess #4: platform-specific filesystem fallbacks. On Windows, search C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v*.*; on non-Windows systems, check /usr/local/cuda and /opt/nvidia/hpc_sdk/.... These run after the prior guesses and before the existing defaults; invalid paths normalize to "".

Optional Dependency (pyproject.toml)
Adds a new optional-dependencies group nvcc with nvidia-cuda-nvcc>=13.0.48 and nvidia-cuda-cccl>=13.0.50. No other dependency or public API changes.
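Based on the group name and version pins stated above, the pyproject.toml addition would look roughly like this (a sketch; the exact position within the file may differ):

```toml
[project.optional-dependencies]
# Optional "nvcc" group: pulls the CUDA compiler and CCCL headers from
# PyPI so a system-wide CUDA toolkit install is not required.
nvcc = [
    "nvidia-cuda-nvcc>=13.0.48",
    "nvidia-cuda-cccl>=13.0.50",
]
```

With a group like this, users would opt in via `pip install "tilelang[nvcc]"`.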

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • LeiWang1999

Poem

🐰 I sniffed through packages, paths both near and far,

Found nvcc hiding where binaries are.
I nudged pyproject, added a tiny seed,
Now GPUs wake up when builders take the lead. ✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title accurately summarizes the main change: adding detection of CUDA_HOME from the nvidia-cuda-nvcc package, which aligns with the primary objective of improving CUDA installation autodetection.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%.

📜 Recent review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dfe5550 and f299e7b.

📒 Files selected for processing (1)
  • tilelang/env.py
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: clouds56
Repo: tile-ai/tilelang PR: 1527
File: tilelang/env.py:0-0
Timestamp: 2025-12-24T17:20:27.444Z
Learning: The nvidia-cuda-nvcc PyPI package installs to `nvidia/cu13/bin/` (for CUDA 13), `nvidia/cu12/bin/` (for CUDA 12), and `nvidia/cu11/bin/` (for CUDA 11) in the site-packages directory, not to `nvidia/cuda_nvcc/bin/`. These paths should be used when detecting CUDA installations from PyPI packages in tilelang/env.py.

Applied to files:

  • tilelang/env.py
🔇 Additional comments (1)
tilelang/env.py (1)

53-53: Good fix for empty CUDA_HOME handling.

The addition of `or None` correctly treats empty-string values from environment variables as not present, preventing them from being used as invalid CUDA paths. This aligns with the commit message and addresses the reported issue where CUDA_HOME might be set to an empty string.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
tilelang/env.py (1)

91-93: Non-deterministic CUDA version selection when multiple versions are installed.

glob.glob() returns paths in arbitrary filesystem order. If multiple CUDA versions are installed (e.g., v11.8, v12.0, v12.4), selecting cuda_homes[0] gives unpredictable results across runs or machines.

🔎 Proposed fix to prefer the latest CUDA version
         if sys.platform == "win32":
             cuda_homes = glob.glob("C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v*.*")
-            cuda_home = "" if len(cuda_homes) == 0 else cuda_homes[0]
+            if cuda_homes:
+                # Sort to prefer the latest version (e.g., v12.4 over v11.8)
+                cuda_homes.sort(reverse=True)
+                cuda_home = cuda_homes[0]
+            else:
+                cuda_home = ""
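One caveat with the fix above: a plain `sort(reverse=True)` is lexicographic, so it orders v12.4 above v11.8 correctly but would rank v9.0 above v12.4. Parsing the trailing version numerically avoids that edge case (a sketch; `pick_latest_cuda` is a hypothetical helper, not code from the PR):

```python
import re


def pick_latest_cuda(cuda_homes):
    # Extract the trailing "v<major>.<minor>" and compare as integers,
    # so "v12.4" beats "v9.0" (lexicographic string order would not).
    def version_key(path):
        match = re.search(r"v(\d+)\.(\d+)$", path)
        return (int(match.group(1)), int(match.group(2))) if match else (-1, -1)

    return max(cuda_homes, key=version_key) if cuda_homes else ""
```

Paths without a parseable version sort last, and an empty list still yields "" as in the original fallback.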
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d140415 and e801a01.

📒 Files selected for processing (1)
  • tilelang/env.py
🔇 Additional comments (2)
tilelang/env.py (2)

69-87: LGTM - PyPI package detection correctly implements nvidia-cuda-nvcc paths.

The candidate paths (nvidia/cu13/bin/, nvidia/cu12/bin/, nvidia/cu11/bin/) correctly match the nvidia-cuda-nvcc PyPI package installation structure. The priority order prefers newer CUDA versions, and the fallback to nvidia/cuda_nvcc/bin/ is a reasonable defensive addition. Based on learnings from the previous PR.


98-99: Verify HPC SDK fallback path is intentional.

The path /opt/nvidia/hpc_sdk/Linux_x86_64 is the HPC SDK root, not a valid CUDA_HOME. The actual CUDA installation within HPC SDK is typically nested at /opt/nvidia/hpc_sdk/Linux_x86_64/<version>/cuda/<cuda_version>.

This differs from Guess #2's handling (lines 62-64), which navigates from the nvcc binary to find the correct SDK path. If this fallback is reached, it's unlikely to point to a usable CUDA installation—though the validation at lines 101-103 will safely reject it.

Is this a best-effort guess that relies on the subsequent validation, or should this path be removed/adjusted for consistency?
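If the nested layout matters, a best-effort lookup could glob both levels and require an nvcc binary before accepting a candidate (a sketch assuming the `<root>/<sdk_version>/cuda/<cuda_version>` layout described above; `find_hpc_sdk_cuda` is a hypothetical helper):

```python
import glob
import os


def find_hpc_sdk_cuda(root="/opt/nvidia/hpc_sdk/Linux_x86_64"):
    # Glob the nested <sdk_version>/cuda/<cuda_version> directories and
    # keep only those that actually ship a bin/nvcc binary.
    candidates = [
        path
        for path in glob.glob(os.path.join(root, "*", "cuda", "*"))
        if os.path.isfile(os.path.join(path, "bin", "nvcc"))
    ]
    # max() on the path string is a crude "latest version" tie-breaker;
    # returns None when the SDK root is absent or empty.
    return max(candidates, default=None)
```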


@oraluben oraluben left a comment


This idea LGTM. With this PR, tilelang can work with a plain torch installation (with nvidia-cuda-nvcc installed), without requiring the CUDA toolchain.

I think we can also put nvidia-cuda-nvcc in an optional dependency group.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
tilelang/env.py (2)

69-78: Move import to module level for consistency.

The importlib.util import is placed inside the function. Consider moving it to the top of the file alongside other imports for better consistency and readability.

🔎 Proposed refactor

At the top of the file, add the import:

from __future__ import annotations
import sys
import os
import pathlib
import logging
import shutil
import glob
+import importlib.util
from dataclasses import dataclass

Then remove it from inside the function:

     if cuda_home is None:
         # Guess #3
         # from pypi package nvidia-cuda-nvcc, nvidia-cuda-nvcc-cu12, etc.
-        import importlib.util
-
         for submodule in ["cu13", "cu12", "cu11", "cuda_nvcc"]:

82-84: Consider sorting CUDA versions when multiple installations exist.

The Windows fallback uses glob.glob which returns matches in arbitrary filesystem order. If multiple CUDA versions are installed, the selected version may be unpredictable. Consider sorting the results to select the latest version consistently.

🔎 Proposed fix to select the latest CUDA version
         if sys.platform == "win32":
             cuda_homes = glob.glob("C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v*.*")
-            cuda_home = "" if len(cuda_homes) == 0 else cuda_homes[0]
+            if len(cuda_homes) == 0:
+                cuda_home = ""
+            else:
+                # Sort to get the latest version (e.g., v12.8 comes after v11.8)
+                cuda_homes.sort(reverse=True)
+                cuda_home = cuda_homes[0]
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e801a01 and 5d53b22.

📒 Files selected for processing (2)
  • pyproject.toml
  • tilelang/env.py
🔇 Additional comments (4)
tilelang/env.py (3)

86-90: LGTM!

The Linux/macOS fallback logic correctly checks standard CUDA installation paths in a reasonable priority order.


92-96: LGTM!

The path validation logic correctly ensures that only existing CUDA installations are returned, with proper fallback to an empty string for backward compatibility.


74-78: The CUDA_HOME detection from PyPI packages at lines 74-78 is correct. The spec.submodule_search_locations[0] returns the nvidia/cu13 (or cu12/cu11) directory, which is the proper CUDA_HOME location containing the bin/ subdirectory with nvcc. No changes needed.

pyproject.toml (1)

55-59: The specified versions for both optional dependencies are valid and available on PyPI:

  • nvidia-cuda-nvcc 13.0.48 ✓ exists (released Aug 4, 2025)
  • nvidia-cuda-cccl 13.0.50 ✓ exists (released Aug 4, 2025)

No known security vulnerabilities were found for these versions.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (2)
tilelang/env.py (2)

93-95: Consider simplifying the validation logic.

The validation can be streamlined since checking cuda_home is None and then setting it to None is redundant.

🔎 Proposed simplification
-        # Validate found path
-        if cuda_home is None or not os.path.exists(cuda_home):
-            cuda_home = None
+        # Validate found path exists
+        if cuda_home and not os.path.exists(cuda_home):
+            cuda_home = None

72-72: Consider moving the import to the module level for better clarity.

While importing importlib.util inside the function works, placing it at the module level (lines 1-8) would improve code organization and make dependencies more visible.

🔎 Proposed change

At the top of the file (after line 7):

 import shutil
 import glob
+import importlib.util
 from dataclasses import dataclass

Then remove the import from line 72:

     if cuda_home is None:
         # Guess #3
         # from pypi package nvidia-cuda-nvcc, nvidia-cuda-nvcc-cu12, etc.
-        import importlib.util
 
         if importlib.util.find_spec("nvidia") is not None:
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5d53b22 and dfe5550.

📒 Files selected for processing (1)
  • tilelang/env.py
🔇 Additional comments (1)
tilelang/env.py (1)

69-79: The fix correctly prevents exceptions when nvidia packages are not installed.

The guard clause at line 74 (if importlib.util.find_spec("nvidia") is not None:) successfully prevents accessing nvidia submodules when the parent package is missing. Testing confirms that importlib.util.find_spec("nvidia") returns None without raising an exception when the package is not installed, and the conditional structure ensures submodule searches (lines 75-79) never execute in this scenario.
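The behavior the guard relies on can be demonstrated standalone (a sketch; `find_pkg_dir` is illustrative, not code from the PR). Note that `find_spec` on a dotted name such as `nvidia.cu13` imports the parent package first and raises ModuleNotFoundError when the parent is absent, which is exactly why the top-level check comes first:

```python
import importlib.util


def find_pkg_dir(name):
    # For a *top-level* name, find_spec simply returns None when the
    # package is missing; only dotted names can raise, because resolving
    # the child requires importing the parent package.
    spec = importlib.util.find_spec(name)
    if spec is not None and spec.submodule_search_locations:
        # Only packages (directories) have submodule_search_locations;
        # plain modules yield None here.
        return list(spec.submodule_search_locations)[0]
    return None
```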

Comment on lines +69 to +79

    if cuda_home is None:
        # Guess #3
        # from pypi package nvidia-cuda-nvcc, nvidia-cuda-nvcc-cu12, etc.
        import importlib.util

        if importlib.util.find_spec("nvidia") is not None:
            for submodule in ["cu13", "cu12", "cu11", "cuda_nvcc"]:
                spec = importlib.util.find_spec(f"nvidia.{submodule}")
                if spec is not None and spec.submodule_search_locations:
                    cuda_home = os.path.join(spec.submodule_search_locations[0])
                    break

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

python3 << 'EOF'
import importlib.util
import os
import sys

print("=== Checking nvidia package structure ===\n")

if importlib.util.find_spec("nvidia") is not None:
    print("nvidia package found\n")
    for submodule in ["cu13", "cu12", "cu11", "cuda_nvcc"]:
        spec = importlib.util.find_spec(f"nvidia.{submodule}")
        if spec is not None:
            print(f"nvidia.{submodule}:")
            print(f"  spec.origin: {spec.origin}")
            print(f"  spec.submodule_search_locations: {spec.submodule_search_locations}")
            
            if spec.submodule_search_locations:
                package_path = spec.submodule_search_locations[0]
                print(f"  Package directory: {package_path}")
                
                # Check if bin/nvcc exists
                nvcc_path = os.path.join(package_path, "bin", "nvcc")
                nvcc_exe_path = os.path.join(package_path, "bin", "nvcc.exe")
                
                print(f"  bin/nvcc exists: {os.path.isfile(nvcc_path)}")
                print(f"  bin/nvcc.exe exists: {os.path.isfile(nvcc_exe_path)}")
                
                # List top-level contents
                if os.path.isdir(package_path):
                    try:
                        contents = sorted(os.listdir(package_path))
                        print(f"  Contents ({len(contents)} items): {contents}")
                    except Exception as e:
                        print(f"  Error listing contents: {e}")
            print()
        else:
            print(f"nvidia.{submodule}: NOT FOUND\n")
else:
    print("nvidia package not found - installing test package...")
    import subprocess
    subprocess.run([sys.executable, "-m", "pip", "install", "-q", "nvidia-cuda-nvcc-cu11"], check=False)
    print("\nRetrying package detection...\n")
    
    if importlib.util.find_spec("nvidia") is not None:
        for submodule in ["cu13", "cu12", "cu11", "cuda_nvcc"]:
            spec = importlib.util.find_spec(f"nvidia.{submodule}")
            if spec is not None:
                print(f"nvidia.{submodule}:")
                print(f"  spec.submodule_search_locations: {spec.submodule_search_locations}")
                if spec.submodule_search_locations:
                    package_path = spec.submodule_search_locations[0]
                    if os.path.isdir(package_path):
                        contents = sorted(os.listdir(package_path))
                        print(f"  Contents: {contents}")
                print()
EOF

Repository: tile-ai/tilelang

Length of output: 146


🏁 Script executed:

cd tilelang && grep -n "cuda_home" env.py | head -20

Repository: tile-ai/tilelang

Length of output: 1061


🏁 Script executed:

cd tilelang && sed -n '50,100p' env.py

Repository: tile-ai/tilelang

Length of output: 2166


Remove invalid "cuda_nvcc" submodule and validate detected CUDA installation contains nvcc binary.

The code includes "cuda_nvcc" in the submodule list, but per the nvidia-cuda-nvcc package structure, only "cu13", "cu12", and "cu11" are valid submodules. The "cuda_nvcc" entry should be removed.

Additionally, while the code has an os.path.exists(cuda_home) check at line 94, it only verifies the directory exists, not that it contains a usable CUDA installation. For robustness against corrupted or incomplete packages, check that bin/nvcc (or bin/nvcc.exe on Windows) exists before setting cuda_home:

         if importlib.util.find_spec("nvidia") is not None:
-            for submodule in ["cu13", "cu12", "cu11", "cuda_nvcc"]:
+            for submodule in ["cu13", "cu12", "cu11"]:
                 spec = importlib.util.find_spec(f"nvidia.{submodule}")
                 if spec is not None and spec.submodule_search_locations:
-                    cuda_home = os.path.join(spec.submodule_search_locations[0])
-                    break
+                    candidate = spec.submodule_search_locations[0]
+                    nvcc_path = os.path.join(candidate, "bin", "nvcc")
+                    if sys.platform == "win32":
+                        nvcc_path += ".exe"
+                    if os.path.isfile(nvcc_path):
+                        cuda_home = candidate
+                        break
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-    if cuda_home is None:
-        # Guess #3
-        # from pypi package nvidia-cuda-nvcc, nvidia-cuda-nvcc-cu12, etc.
-        import importlib.util
-        if importlib.util.find_spec("nvidia") is not None:
-            for submodule in ["cu13", "cu12", "cu11", "cuda_nvcc"]:
-                spec = importlib.util.find_spec(f"nvidia.{submodule}")
-                if spec is not None and spec.submodule_search_locations:
-                    cuda_home = os.path.join(spec.submodule_search_locations[0])
-                    break
+    if cuda_home is None:
+        # Guess #3
+        # from pypi package nvidia-cuda-nvcc, nvidia-cuda-nvcc-cu12, etc.
+        import importlib.util
+        if importlib.util.find_spec("nvidia") is not None:
+            for submodule in ["cu13", "cu12", "cu11"]:
+                spec = importlib.util.find_spec(f"nvidia.{submodule}")
+                if spec is not None and spec.submodule_search_locations:
+                    candidate = spec.submodule_search_locations[0]
+                    nvcc_path = os.path.join(candidate, "bin", "nvcc")
+                    if sys.platform == "win32":
+                        nvcc_path += ".exe"
+                    if os.path.isfile(nvcc_path):
+                        cuda_home = candidate
+                        break
🤖 Prompt for AI Agents
In tilelang/env.py around lines 69 to 79, remove "cuda_nvcc" from the inspected
nvidia submodules (only "cu13","cu12","cu11" should be checked) and, after
discovering a candidate submodule path via spec.submodule_search_locations[0],
verify that the CUDA install contains a usable nvcc binary before assigning
cuda_home: construct the bin path using os.path.join(candidate_path, "bin",
"nvcc") and on Windows also consider "nvcc.exe", and only set cuda_home when
that file exists (use os.path.exists); otherwise continue searching or fall
through to other guesses.


oraluben commented Dec 26, 2025

with this PR, tilelang can work with a plain torch installation (with nvidia-cuda-nvcc installed), without requiring cuda toolchain.

Would you mind making this work (e.g. `docker run -ti --rm --gpus all ubuntu`, and inside the container just install nvcc and torch via pip)? Currently I get the following error in that scenario:

(venv) root@8025c5faee4e:/# python /t/examples/gemm/example_gemm.py 
/venv/lib/python3.12/site-packages/tvm_ffi/_optional_torch_c_dlpack.py:174: UserWarning: Failed to JIT torch c dlpack extension, EnvTensorAllocator will not be enabled.
We recommend installing via `pip install torch-c-dlpack-ext`
  warnings.warn(
/venv/lib/python3.12/site-packages/tvm_ffi/_optional_torch_c_dlpack.py:174: UserWarning: Failed to JIT torch c dlpack extension, EnvTensorAllocator will not be enabled.
We recommend installing via `pip install torch-c-dlpack-ext`
  warnings.warn(
2025-12-26 03:21:42  [TileLang:tilelang.jit.kernel:INFO]: TileLang begins to compile kernel `gemm` with `out_idx=[-1]`
Traceback (most recent call last):
  File "/t/examples/gemm/example_gemm.py", line 67, in <module>
    main()
  File "/t/examples/gemm/example_gemm.py", line 30, in main
    kernel = matmul(1024, 1024, 1024, 128, 128, 32)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.12/site-packages/tilelang/jit/__init__.py", line 423, in __call__
    kernel = self.compile(*args, **kwargs, **tune_params)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.12/site-packages/tilelang/jit/__init__.py", line 355, in compile
    kernel_result = compile(
                    ^^^^^^^^
  File "/venv/lib/python3.12/site-packages/tilelang/jit/__init__.py", line 99, in compile
    return cached(
           ^^^^^^^
  File "/venv/lib/python3.12/site-packages/tilelang/cache/__init__.py", line 30, in cached
    return _kernel_cache_instance.cached(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.12/site-packages/tilelang/cache/kernel_cache.py", line 236, in cached
    kernel = JITKernel(
             ^^^^^^^^^^
  File "/venv/lib/python3.12/site-packages/tilelang/jit/kernel.py", line 137, in __init__
    adapter = self._compile_and_create_adapter(func, out_idx)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.12/site-packages/tilelang/jit/kernel.py", line 242, in _compile_and_create_adapter
    artifact = tilelang.lower(
               ^^^^^^^^^^^^^^^
  File "/venv/lib/python3.12/site-packages/tilelang/engine/lower.py", line 275, in lower
    codegen_mod = device_codegen(device_mod, target) if enable_device_compile else device_codegen_without_compile(device_mod, target)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.12/site-packages/tilelang/engine/lower.py", line 198, in device_codegen
    device_mod = tvm.ffi.get_global_func(global_func)(device_mod, target)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "python/tvm_ffi/cython/function.pxi", line 923, in tvm_ffi.core.Function.__call__
  File "<unknown>", line 0, in tvm::codegen::BuildTileLangCUDA(tvm::IRModule, tvm::Target)
  File "python/tvm_ffi/cython/function.pxi", line 1077, in tvm_ffi.core.tvm_ffi_callback
  File "/venv/lib/python3.12/site-packages/tilelang/engine/lower.py", line 114, in tilelang_callback_cuda_compile
    ptx = nvcc.compile_cuda(

  File "/venv/lib/python3.12/site-packages/tilelang/contrib/nvcc.py", line 77, in compile_cuda
    cmd = [get_nvcc_compiler()]

  File "/venv/lib/python3.12/site-packages/tilelang/contrib/nvcc.py", line 592, in get_nvcc_compiler
    return os.path.join(find_cuda_path(), "bin", "nvcc")

  File "/venv/lib/python3.12/site-packages/tilelang/contrib/nvcc.py", line 275, in find_cuda_path
    raise RuntimeError(

RuntimeError: Failed to automatically detect CUDA installation. Please set the CUDA_HOME environment variable manually (e.g., export CUDA_HOME=/usr/local/cuda).

Here's my workaround for autodetection failure:

diff --git a/examples/gemm/example_gemm.py b/examples/gemm/example_gemm.py
index dfa43112..c945d8eb 100644
--- a/examples/gemm/example_gemm.py
+++ b/examples/gemm/example_gemm.py
@@ -2,7 +2,7 @@ import tilelang
 import tilelang.language as T
 
 
-@tilelang.jit(out_idx=[-1])
+@tilelang.jit(out_idx=[-1], target='cuda')
 def matmul(M, N, K, block_M, block_N, block_K, dtype=T.float16, accum_dtype=T.float32):
     @T.prim_func
     def gemm(


clouds56 commented Dec 26, 2025

@oraluben which Dockerfile are you using?
You could manually install nvidia-cuda-nvcc in the Dockerfile via `pip install nvidia-cuda-nvcc nvidia-cuda-cccl` or `uv add nvidia-cuda-nvcc nvidia-cuda-cccl`, or `uv add "cuda-toolkit[nvcc,cccl]"`, or `uv add tilelang --optional nvcc`.

@oraluben

@oraluben which Dockerfile are you using? You could manually install nvidia-cuda-nvcc in the Dockerfile, via pip install nvidia-cuda-nvcc nvidia-cuda-cccl or uv add nvidia-cuda-nvcc nvidia-cuda-cccl, or uv add "cuda-toolkit[nvcc,cccl]", or uv add tilelang --optional nvcc

I ran into the error with nvidia-cuda-nvcc installed.


clouds56 commented Dec 28, 2025

Sorry, I'm having trouble setting up a Docker environment with libcuda.so.1 to reproduce (either I can't run Docker, or the machine doesn't have a GPU). Could you help by running this in your container:

python -c "import nvidia.cu13; print('1: done')"
python -c "import tilelang; print('2:', repr(tilelang.env.CUDA_HOME))"
python -c "import os; print('3:', os.environ.get('CUDA_HOME', '<not present>'))"
python -c "import os; print('4:', os.environ.get('CUDA_PATH', '<not present>'))"

One idea: you might have CUDA_HOME accidentally set to an empty string, in which case it wouldn't pass the `if cuda_home is None` check.
