add detection of cuda_home from package nvidia-cuda-nvcc #1528
base: main
Conversation
👋 Hi! Thank you for contributing to the TileLang project. Please remember to run the project's pre-submission checks. We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀
📝 Walkthrough
Adds two CUDA-home discovery fallbacks in tilelang/env.py.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Actionable comments posted: 0
🧹 Nitpick comments (1)
tilelang/env.py (1)
91-93: Non-deterministic CUDA version selection when multiple versions are installed.
`glob.glob()` returns paths in arbitrary filesystem order. If multiple CUDA versions are installed (e.g., `v11.8`, `v12.0`, `v12.4`), selecting `cuda_homes[0]` gives unpredictable results across runs or machines.

🔎 Proposed fix to prefer the latest CUDA version:
```diff
 if sys.platform == "win32":
     cuda_homes = glob.glob("C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v*.*")
-    cuda_home = "" if len(cuda_homes) == 0 else cuda_homes[0]
+    if cuda_homes:
+        # Sort to prefer the latest version (e.g., v12.4 over v11.8)
+        cuda_homes.sort(reverse=True)
+        cuda_home = cuda_homes[0]
+    else:
+        cuda_home = ""
```
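A reverse string sort works for the versions listed above, but it can misorder single-digit majors (a plain reverse string sort would prefer `v9.0` over `v12.4`). If that edge case matters, here is a minimal standalone sketch that compares parsed version numbers instead; the helper name and regex are illustrative, not part of the proposed diff:

```python
import glob
import os
import re

def latest_windows_cuda_home(pattern="C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v*.*"):
    """Pick the highest-versioned CUDA directory by numeric comparison rather than string order."""
    def version_key(path):
        # Turn ".../CUDA/v12.4" into (12, 4); unparseable names sort lowest.
        match = re.search(r"v(\d+)\.(\d+)$", os.path.basename(path))
        return (int(match.group(1)), int(match.group(2))) if match else (0, 0)

    candidates = glob.glob(pattern)
    return max(candidates, key=version_key) if candidates else ""
```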
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tilelang/env.py
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: clouds56
Repo: tile-ai/tilelang PR: 1527
File: tilelang/env.py:0-0
Timestamp: 2025-12-24T17:20:27.444Z
Learning: The nvidia-cuda-nvcc PyPI package installs to `nvidia/cu13/bin/` (for CUDA 13), `nvidia/cu12/bin/` (for CUDA 12), and `nvidia/cu11/bin/` (for CUDA 11) in the site-packages directory, not to `nvidia/cuda_nvcc/bin/`. These paths should be used when detecting CUDA installations from PyPI packages in tilelang/env.py.
Applied to files:
tilelang/env.py
🔇 Additional comments (2)
tilelang/env.py (2)
69-87: LGTM - PyPI package detection correctly implements nvidia-cuda-nvcc paths.
The candidate paths (`nvidia/cu13/bin/`, `nvidia/cu12/bin/`, `nvidia/cu11/bin/`) correctly match the nvidia-cuda-nvcc PyPI package installation structure. The priority order prefers newer CUDA versions, and the fallback to `nvidia/cuda_nvcc/bin/` is a reasonable defensive addition. Based on learnings from the previous PR.
98-99: Verify HPC SDK fallback path is intentional.
The path `/opt/nvidia/hpc_sdk/Linux_x86_64` is the HPC SDK root, not a valid CUDA_HOME. The actual CUDA installation within HPC SDK is typically nested at `/opt/nvidia/hpc_sdk/Linux_x86_64/<version>/cuda/<cuda_version>`. This differs from Guess #2's handling (lines 62-64), which navigates from the nvcc binary to find the correct SDK path. If this fallback is reached, it's unlikely to point to a usable CUDA installation, though the validation at lines 101-103 will safely reject it.
Is this a best-effort guess that relies on the subsequent validation, or should this path be removed/adjusted for consistency?
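For reference, a minimal sketch of the nvcc-based navigation mentioned above (a standalone illustration, not the code under review; it assumes `nvcc` is on `PATH` and that resolving symlinks lands inside a real toolkit tree, such as the HPC SDK's nested `cuda/<version>` directory):

```python
import os
import shutil

def cuda_home_from_nvcc():
    """Derive a CUDA home by resolving the nvcc binary on PATH and stepping up from <home>/bin/nvcc."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    # Resolve symlinks (e.g. /usr/bin/nvcc -> /opt/nvidia/hpc_sdk/.../cuda/12.4/bin/nvcc)
    # so that two dirname() calls yield the toolkit root rather than /usr.
    real_nvcc = os.path.realpath(nvcc)
    return os.path.dirname(os.path.dirname(real_nvcc))
```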
oraluben left a comment:
This idea LGTM. With this PR, tilelang can work with a plain torch installation (with nvidia-cuda-nvcc installed) without requiring the CUDA toolchain.
I think we can also put nvidia-cuda-nvcc in the optional dependencies.
Actionable comments posted: 0
🧹 Nitpick comments (2)
tilelang/env.py (2)
69-78: Move import to module level for consistency.
The `importlib.util` import is placed inside the function. Consider moving it to the top of the file alongside other imports for better consistency and readability.

🔎 Proposed refactor:
At the top of the file, add the import:
```diff
 from __future__ import annotations
 import sys
 import os
 import pathlib
 import logging
 import shutil
 import glob
+import importlib.util
 from dataclasses import dataclass
```

Then remove it from inside the function:
```diff
 if cuda_home is None:
     # Guess #3
     # from pypi package nvidia-cuda-nvcc, nvidia-cuda-nvcc-cu12, etc.
-    import importlib.util
-
     for submodule in ["cu13", "cu12", "cu11", "cuda_nvcc"]:
```
82-84: Consider sorting CUDA versions when multiple installations exist.
The Windows fallback uses `glob.glob`, which returns matches in arbitrary filesystem order. If multiple CUDA versions are installed, the selected version may be unpredictable. Consider sorting the results to select the latest version consistently.

🔎 Proposed fix to select the latest CUDA version:
```diff
 if sys.platform == "win32":
     cuda_homes = glob.glob("C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v*.*")
-    cuda_home = "" if len(cuda_homes) == 0 else cuda_homes[0]
+    if len(cuda_homes) == 0:
+        cuda_home = ""
+    else:
+        # Sort to get the latest version (e.g., v12.8 comes after v11.8)
+        cuda_homes.sort(reverse=True)
+        cuda_home = cuda_homes[0]
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
pyproject.toml, tilelang/env.py
🔇 Additional comments (4)
tilelang/env.py (3)
86-90: LGTM! The Linux/macOS fallback logic correctly checks standard CUDA installation paths in a reasonable priority order.
92-96: LGTM! The path validation logic correctly ensures that only existing CUDA installations are returned, with proper fallback to an empty string for backward compatibility.
74-78: The CUDA_HOME detection from PyPI packages at lines 74-78 is correct. `spec.submodule_search_locations[0]` returns the `nvidia/cu13` (or cu12/cu11) directory, which is the proper CUDA_HOME location containing the `bin/` subdirectory with nvcc. No changes needed.
pyproject.toml (1)
55-59: The specified versions for both optional dependencies are valid and available on PyPI:
- nvidia-cuda-nvcc 13.0.48 ✓ exists (released Aug 4, 2025)
- nvidia-cuda-cccl 13.0.50 ✓ exists (released Aug 4, 2025)
No known security vulnerabilities were found for these versions.
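If it helps while reviewing, here is a small standalone sketch (the helper and constant names are mine, not part of the PR) for checking whether the optional packages discussed above are present in the current environment and at which versions:

```python
from importlib import metadata

# Package names taken from the pyproject.toml discussion above.
OPTIONAL_CUDA_PACKAGES = ("nvidia-cuda-nvcc", "nvidia-cuda-cccl")

def report_optional_cuda_packages():
    """Print the installed version of each optional CUDA package, or note that it is missing."""
    for name in OPTIONAL_CUDA_PACKAGES:
        try:
            print(f"{name}: {metadata.version(name)}")
        except metadata.PackageNotFoundError:
            print(f"{name}: not installed")

report_optional_cuda_packages()
```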
Actionable comments posted: 2
🧹 Nitpick comments (2)
tilelang/env.py (2)
93-95: Consider simplifying the validation logic.
The validation can be streamlined, since checking `cuda_home is None` and then setting it to `None` is redundant.

🔎 Proposed simplification:
```diff
-    # Validate found path
-    if cuda_home is None or not os.path.exists(cuda_home):
-        cuda_home = None
+    # Validate found path exists
+    if cuda_home and not os.path.exists(cuda_home):
+        cuda_home = None
```
72-72: Consider moving the import to the module level for better clarity.
While importing `importlib.util` inside the function works, placing it at the module level (lines 1-8) would improve code organization and make dependencies more visible.

🔎 Proposed change:
At the top of the file (after line 7):
```diff
 import shutil
 import glob
+import importlib.util
 from dataclasses import dataclass
```

Then remove the import from line 72:
```diff
 if cuda_home is None:
     # Guess #3
     # from pypi package nvidia-cuda-nvcc, nvidia-cuda-nvcc-cu12, etc.
-    import importlib.util
     if importlib.util.find_spec("nvidia") is not None:
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tilelang/env.py
🔇 Additional comments (1)
tilelang/env.py (1)
69-79: The fix correctly prevents exceptions when nvidia packages are not installed.
The guard clause at line 74 (`if importlib.util.find_spec("nvidia") is not None:`) successfully prevents accessing nvidia submodules when the parent package is missing. Testing confirms that `importlib.util.find_spec("nvidia")` returns `None` without raising an exception when the package is not installed, and the conditional structure ensures submodule searches (lines 75-79) never execute in this scenario.
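As a standalone illustration of why the guard matters (a sketch, not the PR's exact code): `find_spec` on a dotted name imports the parent package first, so probing `nvidia.cu13` directly would raise `ModuleNotFoundError` when `nvidia` is absent, while probing the top-level `nvidia` name simply returns `None`.

```python
import importlib.util

def find_pypi_cuda_home():
    """Return the nvidia.cuXX package directory if an nvcc wheel is installed, else None."""
    # Guard on the top-level package: find_spec("nvidia.cu13") would raise
    # ModuleNotFoundError if the "nvidia" namespace package is not installed at all.
    if importlib.util.find_spec("nvidia") is None:
        return None
    for submodule in ("cu13", "cu12", "cu11"):
        spec = importlib.util.find_spec(f"nvidia.{submodule}")
        if spec is not None and spec.submodule_search_locations:
            return spec.submodule_search_locations[0]
    return None

print(find_pypi_cuda_home())
```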
```python
if cuda_home is None:
    # Guess #3
    # from pypi package nvidia-cuda-nvcc, nvidia-cuda-nvcc-cu12, etc.
    import importlib.util

    if importlib.util.find_spec("nvidia") is not None:
        for submodule in ["cu13", "cu12", "cu11", "cuda_nvcc"]:
            spec = importlib.util.find_spec(f"nvidia.{submodule}")
            if spec is not None and spec.submodule_search_locations:
                cuda_home = os.path.join(spec.submodule_search_locations[0])
                break
```
🧩 Analysis chain
🏁 Script executed:
```bash
python3 << 'EOF'
import importlib.util
import os
import sys
print("=== Checking nvidia package structure ===\n")
if importlib.util.find_spec("nvidia") is not None:
print("nvidia package found\n")
for submodule in ["cu13", "cu12", "cu11", "cuda_nvcc"]:
spec = importlib.util.find_spec(f"nvidia.{submodule}")
if spec is not None:
print(f"nvidia.{submodule}:")
print(f" spec.origin: {spec.origin}")
print(f" spec.submodule_search_locations: {spec.submodule_search_locations}")
if spec.submodule_search_locations:
package_path = spec.submodule_search_locations[0]
print(f" Package directory: {package_path}")
# Check if bin/nvcc exists
nvcc_path = os.path.join(package_path, "bin", "nvcc")
nvcc_exe_path = os.path.join(package_path, "bin", "nvcc.exe")
print(f" bin/nvcc exists: {os.path.isfile(nvcc_path)}")
print(f" bin/nvcc.exe exists: {os.path.isfile(nvcc_exe_path)}")
# List top-level contents
if os.path.isdir(package_path):
try:
contents = sorted(os.listdir(package_path))
print(f" Contents ({len(contents)} items): {contents}")
except Exception as e:
print(f" Error listing contents: {e}")
print()
else:
print(f"nvidia.{submodule}: NOT FOUND\n")
else:
print("nvidia package not found - installing test package...")
import subprocess
subprocess.run([sys.executable, "-m", "pip", "install", "-q", "nvidia-cuda-nvcc-cu11"], check=False)
print("\nRetrying package detection...\n")
if importlib.util.find_spec("nvidia") is not None:
for submodule in ["cu13", "cu12", "cu11", "cuda_nvcc"]:
spec = importlib.util.find_spec(f"nvidia.{submodule}")
if spec is not None:
print(f"nvidia.{submodule}:")
print(f" spec.submodule_search_locations: {spec.submodule_search_locations}")
if spec.submodule_search_locations:
package_path = spec.submodule_search_locations[0]
if os.path.isdir(package_path):
contents = sorted(os.listdir(package_path))
print(f" Contents: {contents}")
print()
EOF
```

Repository: tile-ai/tilelang
Length of output: 146
🏁 Script executed:
```bash
cd tilelang && grep -n "cuda_home" env.py | head -20
```

Repository: tile-ai/tilelang
Length of output: 1061
🏁 Script executed:
```bash
cd tilelang && sed -n '50,100p' env.py
```

Repository: tile-ai/tilelang
Length of output: 2166
Remove invalid "cuda_nvcc" submodule and validate detected CUDA installation contains nvcc binary.
The code includes "cuda_nvcc" in the submodule list, but per the nvidia-cuda-nvcc package structure, only "cu13", "cu12", and "cu11" are valid submodules. The "cuda_nvcc" entry should be removed.
Additionally, while the code has an os.path.exists(cuda_home) check at line 94, it only verifies the directory exists, not that it contains a usable CUDA installation. For robustness against corrupted or incomplete packages, check that bin/nvcc (or bin/nvcc.exe on Windows) exists before setting cuda_home:
```diff
     if importlib.util.find_spec("nvidia") is not None:
-        for submodule in ["cu13", "cu12", "cu11", "cuda_nvcc"]:
+        for submodule in ["cu13", "cu12", "cu11"]:
             spec = importlib.util.find_spec(f"nvidia.{submodule}")
             if spec is not None and spec.submodule_search_locations:
-                cuda_home = os.path.join(spec.submodule_search_locations[0])
-                break
+                candidate = spec.submodule_search_locations[0]
+                nvcc_path = os.path.join(candidate, "bin", "nvcc")
+                if sys.platform == "win32":
+                    nvcc_path += ".exe"
+                if os.path.isfile(nvcc_path):
+                    cuda_home = candidate
+                    break
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
if cuda_home is None:
    # Guess #3
    # from pypi package nvidia-cuda-nvcc, nvidia-cuda-nvcc-cu12, etc.
    import importlib.util
    if importlib.util.find_spec("nvidia") is not None:
        for submodule in ["cu13", "cu12", "cu11"]:
            spec = importlib.util.find_spec(f"nvidia.{submodule}")
            if spec is not None and spec.submodule_search_locations:
                candidate = spec.submodule_search_locations[0]
                nvcc_path = os.path.join(candidate, "bin", "nvcc")
                if sys.platform == "win32":
                    nvcc_path += ".exe"
                if os.path.isfile(nvcc_path):
                    cuda_home = candidate
                    break
```
🤖 Prompt for AI Agents
In tilelang/env.py around lines 69 to 79, remove "cuda_nvcc" from the inspected
nvidia submodules (only "cu13","cu12","cu11" should be checked) and, after
discovering a candidate submodule path via spec.submodule_search_locations[0],
verify that the CUDA install contains a usable nvcc binary before assigning
cuda_home: construct the bin path using os.path.join(candidate_path, "bin",
"nvcc") and on Windows also consider "nvcc.exe", and only set cuda_home when
that file exists (use os.path.exists); otherwise continue searching or fall
through to other guesses.
Would you mind making this work (e.g. ...)? Here's my workaround for the autodetection failure: ...

@oraluben which Dockerfile are you using?

I ran into the error with ...
Sorry, I'm having trouble setting up a Docker container with libcuda.so.1 to reproduce this (either I could not run Docker, or the machine doesn't have a GPU). Could you help run this in your Docker container?

```bash
python -c "import nvidia.cu13; print('1: done')"
python -c "import tilelang; print('2:', repr(tilelang.env.CUDA_HOME))"
python -c "import os; print('3:', os.environ.get('CUDA_HOME', '<not present>'))"
python -c "import os; print('4:', os.environ.get('CUDA_PATH', '<not present>'))"
```

One idea: you might have CUDA_HOME accidentally set to an empty string, so it wouldn't pass the validation check.
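To make the empty-string scenario concrete, here is a small self-contained sketch (the detection function is a simplification of mine, not tilelang's actual logic): an empty `CUDA_HOME` is still "present" in the environment, so a bare `is not None` check accepts it even though it never points at a real toolkit.

```python
import os

def detect_cuda_home():
    """Simplified detection: an `is not None` check lets CUDA_HOME="" win over every fallback."""
    env_value = os.environ.get("CUDA_HOME")  # "" when exported as an empty string
    if env_value is not None:
        return env_value  # accepts "" and skips all later guesses
    return None  # fallbacks (nvcc on PATH, PyPI packages, ...) would go here

os.environ["CUDA_HOME"] = ""
home = detect_cuda_home()
print(repr(home), os.path.exists(home))  # -> '' False: present but unusable
```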
Sorry, the last PR #1527 was closed by mistake, and my branch was also lost, so I prepared a new PR.
Summary by CodeRabbit
Improvements
New Features