fix: safer gpu usage - dropping SLURM env to avoid conflicting vars, … by cmeesters · Pull Request #46 · snakemake/snakemake-executor-plugin-slurm-jobstep

cmeesters · 2026-05-22T12:45:44Z

dropping SLURM env to avoid conflicting vars
explicit node requirement to avoid splitting onto several nodes, if not required.

Summary by CodeRabbit

Improvements
- Logs full environment at initialization for easier debugging.
- Appends GPU and node placement flags to task launches; defaults to one node when unset.
- Sanitizes inherited SLURM-related environment variables for nested launches.
Bug Fixes
- Stricter validation of integer resource values with clearer error reporting.
Tests
- New unit tests covering GPU/node flag formatting, defaults, and error cases.

…explicit node requirement to avoid splitting onto several nodes, if not required.

coderabbitai · 2026-05-22T12:46:32Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f143c520-3208-4c03-a1b5-a304b67e3e92

📥 Commits

Reviewing files that changed from the base of the PR and between 6e505c0 and eadf5a5.

📒 Files selected for processing (2)

snakemake_executor_plugin_slurm_jobstep/__init__.py
tests/tests.py

💤 Files with no reviewable changes (2)

snakemake_executor_plugin_slurm_jobstep/init.py
tests/tests.py

📝 Walkthrough

Walkthrough

Adds strict integer validation for GPU/node resource helpers with tests, appends their flags to SMP srun invocations, logs the process environment at executor init, and sanitizes inherited SLURM-related environment variables before launching nested srun processes.

Changes

SLURM Resource Helpers and Environment Sanitization

Layer / File(s)	Summary
Resource helper functions and validation `snakemake_executor_plugin_slurm_jobstep/__init__.py`, `tests/tests.py`	Introduces `_coerce_int_resource` and reimplements `get_gpu_setting` and `get_node_setting` to coerce and validate integer resource values, return explicit CLI flags or empty strings, and raise `WorkflowError` on invalid values. Unit tests using `_DummyJob` verify flag generation, defaults, disable behavior, and error cases.
srun command construction with GPU and node options `snakemake_executor_plugin_slurm_jobstep/__init__.py`	The SMP (non-MPI, non-first-array) srun command now appends GPU and node SLURM options from the refactored helpers.
Environment logging and sanitization `snakemake_executor_plugin_slurm_jobstep/__init__.py`	Executor logs full `os.environ` in `__post_init__`. Before `subprocess.Popen`, a sanitized environment is built by retaining only select SLURM allocation/array variables and removing other `SLURM_`, `SRUN_`, and `SBATCH_*` entries; the filtered env is passed to `Popen`.

🎯 3 (Moderate) | ⏱️ ~22 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 6.25% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title refers to GPU safety improvements and environment variable management, which aligns with the main changes: sanitizing SLURM environment variables and adding GPU/node placement logic.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/gpu_stabil

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

snakemake_executor_plugin_slurm_jobstep/__init__.py (1)
320-348: 💤 Low value

Inconsistent handling of zero/negative values between GPU and node settings.

get_gpu_setting silently returns "" for zero or negative values (lines 330-331), while get_node_setting raises WorkflowError for values ≤ 0 (lines 346-347). This asymmetry may be intentional (allowing GPU disable via gpus: 0 while enforcing valid node counts), but consider whether negative GPU values should also raise an error for consistency and clearer user feedback.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@snakemake_executor_plugin_slurm_jobstep/__init__.py` around lines 320 - 348,
The GPU handling should mirror node validation for negative values while still
allowing zero to explicitly disable GPUs: in get_gpu_setting (using
_coerce_int_resource and job.resources lookups for "gpus","gpu","nvidia_gpu")
explicitly treat None or 0 as the empty string (no --gpus flag) but raise
WorkflowError when gpu_value < 0 with a clear message, instead of silently
returning ""; keep positive gpu_value producing f"--gpus={gpu_value}" so
behavior is consistent with get_node_setting for invalid negatives.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@snakemake_executor_plugin_slurm_jobstep/__init__.py`:
- Around line 94-95: The debug logging currently prints the entire environment
via self.logger.debug(f"environment: {os.environ}"), which may leak secrets;
replace this with logic that filters and logs only SLURM-related variables
(e.g., keys starting with "SLURM_") or a small explicit whitelist, and if any
non-whitelisted keys must be shown, redact their values (e.g., replace with
"<REDACTED>"). Update the call that uses os.environ in __init__.py so it
constructs a filtered/sanitized dict (only SLURM_* keys or whitelist) and passes
that to self.logger.debug instead of the full os.environ.

---

Nitpick comments:
In `@snakemake_executor_plugin_slurm_jobstep/__init__.py`:
- Around line 320-348: The GPU handling should mirror node validation for
negative values while still allowing zero to explicitly disable GPUs: in
get_gpu_setting (using _coerce_int_resource and job.resources lookups for
"gpus","gpu","nvidia_gpu") explicitly treat None or 0 as the empty string (no
--gpus flag) but raise WorkflowError when gpu_value < 0 with a clear message,
instead of silently returning ""; keep positive gpu_value producing
f"--gpus={gpu_value}" so behavior is consistent with get_node_setting for
invalid negatives.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1a39fca6-b2d2-47ba-94d1-14553c9386a8

📥 Commits

Reviewing files that changed from the base of the PR and between 435d4aa and 6e505c0.

📒 Files selected for processing (2)

snakemake_executor_plugin_slurm_jobstep/__init__.py
tests/tests.py

coderabbitai · 2026-05-22T12:49:07Z

+        # print environment variables for debugging purposes
+        self.logger.debug(f"environment: {os.environ}")


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Logging full environment may expose sensitive data.

os.environ can contain secrets, API tokens, database credentials, or other sensitive values. Even at debug level, these logs may be captured and stored. Consider logging only SLURM-related variables or sanitizing sensitive keys.

Proposed fix to log only SLURM-related variables

# print environment variables for debugging purposes - self.logger.debug(f"environment: {os.environ}") + slurm_env = {k: v for k, v in os.environ.items() if k.startswith("SLURM")} + self.logger.debug(f"SLURM environment: {slurm_env}")

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

# print environment variables for debugging purposes

self.logger.debug(f"environment: {os.environ}")

# print environment variables for debugging purposes

slurm_env = {k: v for k, v in os.environ.items() if k.startswith("SLURM")}

self.logger.debug(f"SLURM environment: {slurm_env}")

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@snakemake_executor_plugin_slurm_jobstep/__init__.py` around lines 94 - 95, The debug logging currently prints the entire environment via self.logger.debug(f"environment: {os.environ}"), which may leak secrets; replace this with logic that filters and logs only SLURM-related variables (e.g., keys starting with "SLURM_") or a small explicit whitelist, and if any non-whitelisted keys must be shown, redact their values (e.g., replace with "<REDACTED>"). Update the call that uses os.environ in __init__.py so it constructs a filtered/sanitized dict (only SLURM_* keys or whitelist) and passes that to self.logger.debug instead of the full os.environ.

…tor-plugin-slurm-jobstep into fix/gpu_stabil

🤖 I have created a release *beep* *boop* --- ## [0.6.1](v0.6.0...v0.6.1) (2026-05-27) ### Bug Fixes * safer gpu usage - dropping SLURM env to avoid conflicting vars, … ([#46](#46)) ([d0c515c](d0c515c)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

fix: safer gpu usage - dropping SLURM env to avoid conflicting vars, …

6e505c0

…explicit node requirement to avoid splitting onto several nodes, if not required.

coderabbitai Bot reviewed May 22, 2026

View reviewed changes

cmeesters and others added 3 commits May 26, 2026 19:00

Merge branch 'main' into fix/gpu_stabil

5817869

fix: formatting

d3375ab

Merge branch 'fix/gpu_stabil' of github.com:snakemake/snakemake-execu…

eadf5a5

…tor-plugin-slurm-jobstep into fix/gpu_stabil

cmeesters merged commit d0c515c into main May 27, 2026
6 checks passed

cmeesters deleted the fix/gpu_stabil branch May 27, 2026 11:33

github-actions Bot mentioned this pull request May 27, 2026

chore(main): release 0.6.1 #47

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: safer gpu usage - dropping SLURM env to avoid conflicting vars, …#46

fix: safer gpu usage - dropping SLURM env to avoid conflicting vars, …#46
cmeesters merged 4 commits into
mainfrom
fix/gpu_stabil

cmeesters commented May 22, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 22, 2026 •

edited

Loading

Walkthrough

Changes

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 22, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		# print environment variables for debugging purposes
		self.logger.debug(f"environment: {os.environ}")

Conversation

cmeesters commented May 22, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 22, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

cmeesters commented May 22, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 22, 2026 •

edited

Loading