Skip to content

feat!: format shell directives using shfmt#295

Merged
mbhall88 merged 13 commits into
snakemake:masterfrom
mbhall88:feat/shfmt
May 21, 2026
Merged

feat!: format shell directives using shfmt#295
mbhall88 merged 13 commits into
snakemake:masterfrom
mbhall88:feat/shfmt

Conversation

@mbhall88
Copy link
Copy Markdown
Member

@mbhall88 mbhall88 commented May 13, 2026

Summary

  • Formats the body of shell: directives using shfmt, enabled by default (-f/--format-shell, opt-out with -F/--no-format-shell or format_shell = false in pyproject.toml)
  • Snakemake {var} placeholders are masked before formatting and restored verbatim; escaped {{...}} braces are passed through unchanged
  • Invalid shell raises InvalidShell rather than crashing silently; escape hatches are -F or # fmt: off
  • Ships via shfmt-py — no separate user install required; distribution decision documented in ADR-0001
  • Fixes a bug where format_python_string_literal silently no-oped in real runs due to a trailing newline appended by the parser after the closing triple-quote
  • Exposes format_shell option in the Neovim plugin (require("snakefmt").setup({ format_shell = false }))

Closes #170

Test plan

  • Parametrised robustness suite in tests/test_shell_formatter.py covering all trailing-whitespace input shapes (trailing \n, multiple \n, spaces/tabs, CRLF, string prefix variants f, rb, B, u)
  • End-to-end behavioural tests in tests/test_formatter.py::TestShellBlockFormatting: default-on formats, format_shell=False disables, # fmt: off opts out, {var}/{{escaped}} placeholders survive the full pipeline
  • Config-path tests in tests/test_config.py::TestShellFormattingConfig: format_shell = false/true in TOML via CLI --config
  • uv run pytest — 288 passed, 1 xfailed

Summary by CodeRabbit

  • New Features

    • Shell blocks are auto-formatted with shfmt by default; toggle via CLI flag (--format-shell / --no-format-shell, -f/-F) or config.
  • Behavioral Improvements

    • Improved heredoc handling and preservation of Snakemake placeholders and escaped braces; parse failures surface as an InvalidShell error unless formatting is disabled or scoped off.
  • Documentation

    • README, ADRs, and plans updated with examples, CLI help, config defaults, and a new "Shell Block Formatting" section (collapsible help and examples).
  • Tests

    • Expanded tests covering shell formatting, Python-string literal handling, heredocs, masking, and config-driven behavior.

Review Change Stack

mbhall88 added 3 commits May 13, 2026 12:05
Add shfmt-py as dependency to format Snakemake shell directives.
Mask variables like {input} to prevent shfmt mangling.
Ignore escaped braces like {{ }} during masking.
Extract and format multiline string literals safely.
Add --no-format-shell CLI flag to opt-out.

Shell_formatter internals are split into focused helpers
(_mask_snakemake_vars, _invoke_shfmt, _unmask_snakemake_vars) to give
future backend swaps a clean seam — see
docs/adr/0001-shell-formatter-distribution.md.

Fixes a bug where format_python_string_literal received literals with a
trailing newline from the parser (a NEWLINE token after the closing
triple-quote), causing re.fullmatch to return the string unchanged and
leaving shell blocks silently unformatted in all real snakefmt runs.
Fix: rstrip the literal before regex matching.

Adds a parametrised robustness suite covering all trailing-whitespace
input shapes, and end-to-end behavioural tests through the public
formatter and CLI.

Closes snakemake#170
- docs/adr/README.md: index for architecture decision records
- docs/adr/0001-shell-formatter-distribution.md: records the decision to
  keep shfmt-py rather than forking or building Go bindings, with concrete
  revisit triggers
- docs/plans/shell-formatting-trailing-newline-bugfix.md: working plan for
  the trailing-newline bug identified and fixed in the previous commit
- README: new Shell Block Formatting section with before/after example,
  opt-out instructions, placeholder masking and invalid-shell notes; updated
  TOC, --help block, configuration example (sort_directives, format_shell),
  pre-commit rev (v0.10.2 -> v1.1.0), and Recent Changes callout
Adds format_shell = nil to config defaults. When set to false, --no-format-shell
is passed to the snakefmt binary, allowing users to disable shell block
formatting from their editor config without touching pyproject.toml.

  require("snakefmt").setup({ format_shell = false })

nil (default) passes no flag, deferring to snakefmt's own default (on).
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 13, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds shfmt-driven formatting for Snakemake shell bodies with placeholder masking, heredoc-aware masking/indentation preservation, a new InvalidShell error path, Formatter/CLI/Lua plumbing to opt in/out, comprehensive tests, ADRs/plans, and README documentation updates.

Changes

Shell Block Formatting via shfmt

Layer / File(s) Summary
ADRs, plans, and review docs
docs/adr/0001-shell-formatter-distribution.md, docs/adr/README.md, docs/plans/*, docs/plans/pr-295-review-response.md
Adds ADR and multiple planning/review documents describing distribution choice, heredoc masking plan, trailing-newline bugfix plan, reviewer response, test matrix, and execution checklist.
CLI, Formatter, and Lua wiring
snakefmt/snakefmt.py, snakefmt/formatter.py, lua/snakefmt/config.lua, lua/snakefmt/init.lua, tests/__init__.py
Introduces --format-shell/--no-format-shell CLI flag and main parameter, adds format_shell constructor arg and keyword_name plumbing in Formatter, exposes Lua format_shell option and passes --no-format-shell when configured, and threads format_shell through test helpers.
Shell formatter core and utilities
snakefmt/shell_formatter.py, snakefmt/exceptions.py
Implements Snakemake-placeholder masking/unmasking, heredoc masking/unmasking, _indent_preserving_heredocs, _invoke_shfmt (raises InvalidShell on shfmt failures), format_shell_code, and format_python_string_literal for triple-quoted Python literals; defines TAB.
Tests: unit and integration
tests/test_shell_formatter.py, tests/test_formatter.py, tests/test_config.py
Adds comprehensive tests covering python-literal formatting, shell-code canonicalization, heredoc handling and terminator alignment, masking round-trips, error cases (InvalidShell/unclosed heredoc), Formatter integration, and CLI/config-driven behavior.
README updates
README.md
Documents Shell Block Formatting section, updates help text and TOC, shows examples and opt-out instructions, wraps several sections in collapsible blocks, and updates config/pre-commit snippets.
sequenceDiagram
  participant PythonLiteral
  participant MaskSnakemakeVars
  participant MaskHeredocs
  participant Shfmt
  participant UnmaskHeredocs
  participant UnmaskSnakemakeVars
  participant FormatterOutput

  PythonLiteral->>MaskSnakemakeVars: replace `{...}`/`{{...}}` with tokens
  MaskSnakemakeVars->>MaskHeredocs: detect and replace heredoc bodies with placeholders
  MaskHeredocs->>Shfmt: invoke shfmt subprocess with masked code
  Shfmt->>UnmaskHeredocs: return formatted masked code
  UnmaskHeredocs->>UnmaskSnakemakeVars: restore original heredoc bodies and tokens
  UnmaskSnakemakeVars->>FormatterOutput: produce final formatted shell content
Loading

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 30.51% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely describes the main feature: formatting shell directives using shfmt, matching the primary changeset objective.
Linked Issues check ✅ Passed All objectives from issue #170 are met: shfmt integration implemented via shfmt-py, Snakemake placeholders preserved through masking, and shell formatting enables standardization to prevent spurious reruns.
Out of Scope Changes check ✅ Passed All changes directly support the shell formatting feature: CLI/config options, formatter plumbing, shell_formatter implementation, documentation (ADRs, plans, README), Lua plugin exposure, test coverage, and a bugfix for trailing newlines in Python string literals parsing.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 13, 2026

Codecov Report

❌ Patch coverage is 97.50000% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 96.59%. Comparing base (40d68de) to head (1abcda3).
⚠️ Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
snakefmt/shell_formatter.py 97.34% 1 Missing and 2 partials ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #295      +/-   ##
==========================================
+ Coverage   96.52%   96.59%   +0.07%     
==========================================
  Files          12       13       +1     
  Lines        1497     1617     +120     
  Branches      309      324      +15     
==========================================
+ Hits         1445     1562     +117     
- Misses         24       25       +1     
- Partials       28       30       +2     
Flag Coverage Δ
unittests 96.53% <97.50%> (+0.07%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
snakefmt/exceptions.py 100.00% <100.00%> (ø)
snakefmt/formatter.py 98.43% <100.00%> (+0.01%) ⬆️
snakefmt/snakefmt.py 95.27% <100.00%> (+0.03%) ⬆️
snakefmt/shell_formatter.py 97.34% <97.34%> (ø)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@snakefmt/shell_formatter.py`:
- Around line 15-35: The mask template is globally fixed and can collide with
user text; change _mask_snakemake_vars to generate a per-call unique token
prefix (e.g., random/UUID-based string or monotonic nonce) and store the exact
tokens used in the returned originals tuple (or a separate tokens list), then
change _unmask_snakemake_vars to iterate in reverse and replace those exact
tokens with their originals using the stored tokens; update references to
_MASK_TEMPLATE in both _mask_snakemake_vars and _unmask_snakemake_vars (and the
similar code at lines ~56-58) so masking/unmasking uses the per-call tokens
rather than a global static format.
- Line 13: The current _SNAKEMAKE_VAR_PATTERN is too permissive and captures
arbitrary { ... } blocks (including shell constructs like fn(){...}); tighten it
so it only matches placeholders whose content begins with a letter or underscore
and consists solely of identifier-like characters (letters, digits, underscore
and limited punctuation such as dot/dash/colon) and explicitly disallow
characters used in shell blocks (parentheses, semicolons, whitespace, braces,
etc.), while preserving the existing negative lookarounds that prevent matching
double-brace sequences; update the _SNAKEMAKE_VAR_PATTERN definition accordingly
and adjust any tests that relied on the old behavior.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e3e40cde-899c-44d4-bf9e-8fad742e65cc

📥 Commits

Reviewing files that changed from the base of the PR and between 6990f79 and aeae311.

⛔ Files ignored due to path filters (2)
  • pyproject.toml is excluded by !pyproject.toml
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (13)
  • README.md
  • docs/adr/0001-shell-formatter-distribution.md
  • docs/adr/README.md
  • docs/plans/shell-formatting-trailing-newline-bugfix.md
  • lua/snakefmt/config.lua
  • lua/snakefmt/init.lua
  • snakefmt/exceptions.py
  • snakefmt/formatter.py
  • snakefmt/shell_formatter.py
  • snakefmt/snakefmt.py
  • tests/test_config.py
  • tests/test_formatter.py
  • tests/test_shell_formatter.py

Comment thread snakefmt/shell_formatter.py Outdated
Comment thread snakefmt/shell_formatter.py Outdated
…vars

After the UUID nonce refactor, _mask_snakemake_vars returns (masked, tokens,
originals) but format_shell_code still unpacked a 2-tuple. Update the call
site to match the new signature.
@mbhall88 mbhall88 requested a review from Hocnonsense May 13, 2026 04:48
@mbhall88
Copy link
Copy Markdown
Member Author

@Hocnonsense it would be nice to get a review from you before merging this as it is a big feature addition and you and I are the most recently active members on this repository.
@bricoletc if you have the bandwidth a review from you would also be great.
And @johanneskoester and @dlaehnemann as always, any comments would be most valuable.

Comment thread snakefmt/snakefmt.py Outdated
Comment thread snakefmt/shell_formatter.py Outdated
Copy link
Copy Markdown
Contributor

@Hocnonsense Hocnonsense left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this contribution. The CLI/config plumbing looks solid, and the overall architecture is clean. I do have some concerns before this can merge, one of which is a blocker.

Comment thread snakefmt/shell_formatter.py Outdated
Comment thread snakefmt/shell_formatter.py Outdated
Comment thread tests/test_shell_formatter.py
Comment thread tests/test_formatter.py

The unformatted input uses an if/then/fi block that shfmt normalises to
`if ...; then` form — a visible, deterministic transformation we can assert on.
"""
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. More test about cases where too many/too few indents are necessary.

  2. A basic heredoc to verify the terminator lands at column 0 after formatting.

  3. A case with embedded \n escape sequences and blank lines, e.g.:

    shell: """
        python <<!EOF!
        \nif True:
    
        pass
    
        \n!EOF!
    """
    

    This covers shfmt's leave-heredoc-body-alone behaviour and confirms textwrap.indent
    does not corrupt the terminator or the blank lines.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hopefully the new tests I've added cover these cases sufficiently for you?

Comment thread snakefmt/shell_formatter.py
mbhall88 added 2 commits May 14, 2026 10:55
- C1: unified two-layer brace masking — {{...}} is now masked before
  single {var} placeholders, ensuring brace groups, parameter expansion,
  brace expansion, awk patterns, and f-string Snakemake vars all survive
  shfmt unchanged

- B2: replace textwrap.indent with _indent_preserving_heredocs, which
  skips heredoc body and terminator lines so <<EOF terminators remain at
  column 0 as bash requires

- T1/C2: add unit tests for _indent_preserving_heredocs, integration
  heredoc tests with concrete expected values, and a long-line
  no-spurious-wrap test confirming shfmt's flags are syntactic only

- M1: move format_shell from imperative post-construction attribute into
  Formatter.__init__ (default False, matching sort_directives); update
  setup_formatter and all TestShellBlockFormatting tests to opt in
  explicitly with format_shell=True

- S1: consolidate two re.fullmatch calls into a single compiled
  _TRIPLE_QUOTE_RE using "{3}|'{3} to avoid backslash escapes in raw
  strings that confuse black's parser

- README: add "Brace groups" section documenting the {{...}} preservation
  trade-off and the fmt: off escape hatch
v4.0.0 bundles shfmt v3.13.1 (latest upstream), decouples the package
version from the shfmt version, adds Renovate for automated future
updates, and falls back to a system shfmt on unsupported platforms.

Updates ADR-0001 to record that the maintenance concerns that prompted
the original decision have been addressed.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
tests/test_shell_formatter.py (2)

48-51: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Parametrized f-string test cases use incorrect syntax.

Lines 48-51 test string prefixes including f, rb, B, and u, but all use {_UNFORMATTED_CONTENT} which contains the variable in single-brace form. For the prefix_f case, this represents a Python f-string expression, not a Snakemake placeholder, and would fail at runtime if _UNFORMATTED_CONTENT isn't defined as a Python variable.

The test at lines 150-157 correctly demonstrates f-string usage with {{output}}, showing awareness of the issue.

🔧 Suggested fix

Remove or update the f-string parametrized case:

         (f'f"""\n{_UNFORMATTED_CONTENT}\n"""\n', f'f"""\n{_EXPECTED_CONTENT}"""'),
-        (f'rb"""\n{_UNFORMATTED_CONTENT}\n"""\n', f'rb"""\n{_EXPECTED_CONTENT}"""'),
+        (f'r"""\n{_UNFORMATTED_CONTENT}\n"""\n', f'r"""\n{_EXPECTED_CONTENT}"""'),
         (f'B"""\n{_UNFORMATTED_CONTENT}\n"""\n', f'B"""\n{_EXPECTED_CONTENT}"""'),
         (f'u"""\n{_UNFORMATTED_CONTENT}\n"""\n', f'u"""\n{_EXPECTED_CONTENT}"""'),
     ],
     ids=[
         "trailing_newline",
         "multiple_trailing_newlines",
         "trailing_spaces_tabs",
         "crlf_line_endings",
-        "prefix_f",
-        "prefix_rb",
+        "prefix_r",
         "prefix_B",
         "prefix_u",

Or replace the fixture with f-string-safe content (no Snakemake placeholders).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_shell_formatter.py` around lines 48 - 51, The parametrized test
cases use Python f-string prefixes consuming {_UNFORMATTED_CONTENT} (seen in the
tuples with f'...{_UNFORMATTED_CONTENT}...'), which will be evaluated at test
runtime; update the test in tests/test_shell_formatter.py to avoid evaluating
Snakemake placeholders: either remove the f-prefixed tuple (the one using
f'"""...{_UNFORMATTED_CONTENT}..."""') or replace its content with an
f-string-safe fixture (e.g., use literal double-braces like '{{output}}' or
plain text without {…} placeholders) so that
_UNFORMATTED_CONTENT/_EXPECTED_CONTENT remain valid for the shell formatter
assertions.

19-21: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Test uses incorrect Snakemake syntax for f-strings.

This test uses {input} inside an f-string shell block, but in f-strings, {input} is a Python expression (evaluated before the shell runs), not a Snakemake placeholder. The correct syntax for Snakemake variables in f-strings is {{input}} (which Python's str.format() collapses to {input} at DAG build time).

This test validates the current implementation but tests incorrect behavior. Once the f-string handling issue in shell_formatter.py (line 116-153) is addressed, this test should be updated or removed.

🔧 Suggested fix

Either skip f-strings in the test:

 def test_format_python_string_literal():
-    literal = 'f"""\n      echo {input}\n    ls -l\n    """'
-    expected = 'f"""\n        echo {input}\n        ls -l\n        """'
+    literal = '"""\n      echo {input}\n    ls -l\n    """'
+    expected = '"""\n        echo {input}\n        ls -l\n        """'
     assert format_python_string_literal(literal, target_indent=2) == expected

Or use correct f-string syntax:

-    literal = 'f"""\n      echo {input}\n    ls -l\n    """'
-    expected = 'f"""\n        echo {input}\n        ls -l\n        """'
+    literal = 'f"""\n      echo {{input}}\n    ls -l\n    """'
+    expected = 'f"""\n        echo {{input}}\n        ls -l\n        """'

Based on learnings from past review: The parametrized test suite includes f-string cases using single-brace {input} syntax, which is not valid for Snakemake shell blocks in f-strings.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_shell_formatter.py` around lines 19 - 21, The test in
tests/test_shell_formatter.py uses an invalid Snakemake f-string placeholder
`{input}`; update the test for format_python_string_literal to either skip
f-string cases or use the correct Snakemake-safe f-string escaping `{{input}}`
and adjust the expected string accordingly so the formatter is validated against
correct syntax; refer to the f-string handling code in shell_formatter.py (the
f-string handling block around lines 116-153) and the test helper
format_python_string_literal to locate and update the failing case.
snakefmt/shell_formatter.py (1)

116-153: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Critical: f-string shell blocks will be incorrectly formatted.

format_python_string_literal applies the same {var} masking logic to all string prefixes, but f-strings have inverted brace semantics:

Syntax Plain/raw string f-string
{input} Snakemake variable Python expression (evaluated before shell runs)
{{input}} Literal {input} passed to shell Snakemake variable (Python expands {{}}{})

This means:

  • {input} in an f-string will be masked as a Snakemake variable, but it's actually a Python expression that should reach shfmt as-is (or error if the variable doesn't exist)
  • {{input}} is the correct way to write a Snakemake variable in an f-string, but the negative lookbehind in line 19 prevents matching it

Shell blocks in f-strings will be mangled unless users avoid f-strings entirely for shell directives.

🛡️ Suggested fix: Skip formatting for f-strings
 def format_python_string_literal(literal: str, target_indent: int = 0) -> str:
     """Extracts shell code from a Python string literal, formats it, and re-wraps it.
 
     Only multiline triple-quoted strings are formatted. Single-line strings and any
     form that doesn't match a simple triple-quote pattern are returned unchanged.
     The literal is expected to arrive from `str(parameter)` — the caller guarantees
     no leading whitespace beyond what the parser appends (handled by `rstrip` below).
     """
     # Strip trailing whitespace: the parser appends a NEWLINE token after the
     # closing triple-quote, so the literal arrives as '"""..."""\n'. The caller
     # already rstrips the result, so this is safe.
     literal = literal.rstrip()
     match = _TRIPLE_QUOTE_RE.fullmatch(literal)
 
     if not match:
         return literal  # not a triple-quoted string literal; return unchanged
 
     prefix = match.group(1)
+    
+    # f-strings have inverted brace semantics; skip formatting to avoid mangling
+    if "f" in prefix.lower():
+        return literal
+    
     quote = match.group(2)
     content = match.group(3)

Based on learnings from past review: f-strings require {{var}} for Snakemake placeholders, but the current masking pattern with negative lookbehind (?<!\{) explicitly excludes {{...}}, causing the formatter to produce invalid shell code for f-string shell blocks.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@snakefmt/shell_formatter.py` around lines 116 - 153, The function
format_python_string_literal currently treats all triple-quoted strings the
same, which mangles f-strings; change it to skip formatting when the string is
an f-string: after extracting prefix = match.group(1) (and before calling
format_shell_code), detect if the prefix contains an f or F (e.g., "f" in
prefix.lower() or use any(c in prefix for c in "fF")) and if so return the
original literal unchanged; keep the existing formatting path for non-f prefixes
and leave _TRIPLE_QUOTE_RE and format_shell_code untouched.
🧹 Nitpick comments (2)
tests/test_shell_formatter.py (1)

110-110: 💤 Low value

Rename vars to avoid shadowing the builtin.

The variable name vars shadows Python's builtin vars() function. While unlikely to cause issues in this test context, it reduces clarity and triggers linter warnings.

♻️ Suggested rename
-    vars = [f"{{var{i}}}" for i in range(15)]
-    unformatted = "echo " + " ".join(vars) + "\n"
-    expected = "echo " + " ".join(vars) + "\n"
+    placeholders = [f"{{var{i}}}" for i in range(15)]
+    unformatted = "echo " + " ".join(placeholders) + "\n"
+    expected = "echo " + " ".join(placeholders) + "\n"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_shell_formatter.py` at line 110, The test creates a list assigned
to the name `vars` which shadows the builtin vars(); rename that variable (e.g.,
to `var_placeholders` or `vars_list`) wherever it is used in the test to avoid
shadowing and linter warnings—update the list comprehension assignment (`vars =
[f"{{var{i}}}" for i in range(15)]`) and all subsequent references to the new
name (look for usages in the test function in tests/test_shell_formatter.py).
snakefmt/shell_formatter.py (1)

56-56: ⚡ Quick win

Consider explicit strict= for zip() to catch logic errors.

If tokens and originals have mismatched lengths due to a bug in masking/unmasking logic, the silent truncation could corrupt shell code. Adding strict=True (Python 3.10+) makes the error explicit.

♻️ Proposed addition
-    for token, original in zip(reversed(tokens), reversed(originals)):
+    for token, original in zip(reversed(tokens), reversed(originals), strict=True):
         code = code.replace(token, original)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@snakefmt/shell_formatter.py` at line 56, The loop using zip(reversed(tokens),
reversed(originals)) can silently truncate if tokens and originals lengths
mismatch; update the call to zip with strict=True (i.e., zip(..., strict=True))
to raise an error on length mismatch so masking/unmasking bugs fail fast; change
the loop in shell_formatter.py that iterates over reversed(tokens) and
reversed(originals) to use zip(..., strict=True) and run tests to ensure
compatibility with Python 3.10+.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@snakefmt/shell_formatter.py`:
- Line 73: The current exception handling raises InvalidShell while suppressing
the original CalledProcessError traceback; update the raise to chain the
original exception (use `raise InvalidShell(f"Invalid shell code. shfmt error:
{e.stderr}") from e`) so the original CalledProcessError and its context
(missing binary, permission, etc.) is preserved—locate the raise of InvalidShell
inside the shell formatting code that catches CalledProcessError (referencing
the variable `e`) and replace the `from None` suppression with `from e`.

---

Duplicate comments:
In `@snakefmt/shell_formatter.py`:
- Around line 116-153: The function format_python_string_literal currently
treats all triple-quoted strings the same, which mangles f-strings; change it to
skip formatting when the string is an f-string: after extracting prefix =
match.group(1) (and before calling format_shell_code), detect if the prefix
contains an f or F (e.g., "f" in prefix.lower() or use any(c in prefix for c in
"fF")) and if so return the original literal unchanged; keep the existing
formatting path for non-f prefixes and leave _TRIPLE_QUOTE_RE and
format_shell_code untouched.

In `@tests/test_shell_formatter.py`:
- Around line 48-51: The parametrized test cases use Python f-string prefixes
consuming {_UNFORMATTED_CONTENT} (seen in the tuples with
f'...{_UNFORMATTED_CONTENT}...'), which will be evaluated at test runtime;
update the test in tests/test_shell_formatter.py to avoid evaluating Snakemake
placeholders: either remove the f-prefixed tuple (the one using
f'"""...{_UNFORMATTED_CONTENT}..."""') or replace its content with an
f-string-safe fixture (e.g., use literal double-braces like '{{output}}' or
plain text without {…} placeholders) so that
_UNFORMATTED_CONTENT/_EXPECTED_CONTENT remain valid for the shell formatter
assertions.
- Around line 19-21: The test in tests/test_shell_formatter.py uses an invalid
Snakemake f-string placeholder `{input}`; update the test for
format_python_string_literal to either skip f-string cases or use the correct
Snakemake-safe f-string escaping `{{input}}` and adjust the expected string
accordingly so the formatter is validated against correct syntax; refer to the
f-string handling code in shell_formatter.py (the f-string handling block around
lines 116-153) and the test helper format_python_string_literal to locate and
update the failing case.

---

Nitpick comments:
In `@snakefmt/shell_formatter.py`:
- Line 56: The loop using zip(reversed(tokens), reversed(originals)) can
silently truncate if tokens and originals lengths mismatch; update the call to
zip with strict=True (i.e., zip(..., strict=True)) to raise an error on length
mismatch so masking/unmasking bugs fail fast; change the loop in
shell_formatter.py that iterates over reversed(tokens) and reversed(originals)
to use zip(..., strict=True) and run tests to ensure compatibility with Python
3.10+.

In `@tests/test_shell_formatter.py`:
- Line 110: The test creates a list assigned to the name `vars` which shadows
the builtin vars(); rename that variable (e.g., to `var_placeholders` or
`vars_list`) wherever it is used in the test to avoid shadowing and linter
warnings—update the list comprehension assignment (`vars = [f"{{var{i}}}" for i
in range(15)]`) and all subsequent references to the new name (look for usages
in the test function in tests/test_shell_formatter.py).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2272da25-0e77-480e-afca-d0198bb6eae0

📥 Commits

Reviewing files that changed from the base of the PR and between aeae311 and c39c06c.

⛔ Files ignored due to path filters (2)
  • pyproject.toml is excluded by !pyproject.toml
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (9)
  • README.md
  • docs/adr/0001-shell-formatter-distribution.md
  • docs/plans/pr-295-review-response.md
  • snakefmt/formatter.py
  • snakefmt/shell_formatter.py
  • snakefmt/snakefmt.py
  • tests/__init__.py
  • tests/test_formatter.py
  • tests/test_shell_formatter.py
✅ Files skipped from review due to trivial changes (1)
  • docs/plans/pr-295-review-response.md
🚧 Files skipped from review as they are similar to previous changes (2)
  • docs/adr/0001-shell-formatter-distribution.md
  • tests/test_formatter.py

Comment thread snakefmt/shell_formatter.py Outdated
Preserves the original exception context so tracebacks show the full
cause (missing binary, permission error, etc.) rather than only the
InvalidShell message.
@mbhall88
Copy link
Copy Markdown
Member Author

Thanks for the awesome review @Hocnonsense

Feel free to go through the updates when you get a chance and let me know what you think.

@mbhall88 mbhall88 requested a review from Hocnonsense May 14, 2026 02:19
Copy link
Copy Markdown
Contributor

@Hocnonsense Hocnonsense left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates — the heredoc-aware indent and the two-layer {{...}} masking address the core structural issues from the first round. A few follow-up concerns remain; most are enhancements or edge cases rather than blockers, but there are two correctness gaps worth noting before merge.

Comment thread snakefmt/shell_formatter.py Outdated

**Decision: mask all `{{...}}` patterns** uniformly across plain strings and f-strings. The formatter does not need to model runtime semantics — it preserves every brace-bracketed token verbatim.

**Trade-off (accepted):** bash brace groups `{{ cmd1; cmd2; }}` lose internal formatting because shfmt sees only an opaque token. This case is rare in Snakemake shell directives (`&&` chains and line continuations cover most pipelines). Users who need brace-group formatting can use `# fmt: off`.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Within the opaque-mask design the current ordering ({{...}} first, then {var}) is a valid implementation choice — fixing C1a is sufficient to make it work.
However, reversing the order unlocks a cleaner pipeline that allows shfmt to actually format brace group bodies:

  1. Mask Snakemake {var} → tokens — invariant: no bare {…} remain after this step
  2. Unescape {{{ and }}} — safe, because any {…} here came from {{…}}
  3. Run shfmt on the actual shell code (brace groups visible and formattable)
  4. Re-escape all {{{ and }}} — unambiguous for the same reason
  5. Unmask tokens → {var}

The invariant from step 1 makes steps 2 and 4 unambiguous without any position tracking. Crucially, the output still contains {{...}}, so str.format() is unaffected at runtime.

The README currently states:

reformatting the body would require splitting the {{ / }} delimiters, which would break str.format() at runtime.

This is incorrect — the pipeline above does not split the delimiters; it round-trips them through unescape → re-escape and the final output is identical in structure to the input. The real trade-off is implementation complexity, not a technical constraint.
If the simpler opaque-mask approach is retained, the README note should be corrected to reflect the actual reason (complexity) rather than the stated one (impossibility).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

README has been updated accordingly


`_unmask_snakemake_vars` already does the right thing — iterates `(token, original)` pairs in reverse and substitutes. No signature change.

**F-string handling:** with this fix, f-strings are formatted normally. No special-case skip. The reviewer's underlying concern (wrong masking semantics for f-strings) is resolved by the unified two-layer mask.
Copy link
Copy Markdown
Contributor

@Hocnonsense Hocnonsense May 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only partially true. With {{...}} masked first, both {{output}} and {input} in an f-string are correctly masked as tokens ✓

However, f-string {...} pairing is determined by Python's parser, not by regex. Applying a single regex across both f-string expressions and Snakemake variables simultaneously risks incorrect matches when bracket pairs span both contexts — for example, {{func {params.a} and params: a= "balabala}}". Although rare, the regex cannot distinguish these without understanding Python syntax.

The safest fix remains skipping f-strings entirely: if "f" in prefix.lower(): return literal.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is addressed now with the new bracket masking strategy. We don't ignore f-strings, but we are more "aggressive" with the masking

Comment thread docs/plans/pr-295-review-response.md
Tighten double-brace regex to match non-greedily.
Broaden heredoc start regex to support custom delimiters like !EOF!.
Update README to explain opaque mask design choice accurately.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
tests/test_shell_formatter.py (1)

110-110: ⚡ Quick win

Rename vars to avoid shadowing the builtin.

The variable name vars shadows Python's builtin vars() function. Consider renaming to var_list or variables.

♻️ Proposed fix
-    vars = [f"{{var{i}}}" for i in range(15)]
-    unformatted = "echo " + " ".join(vars) + "\n"
-    expected = "echo " + " ".join(vars) + "\n"
+    var_list = [f"{{var{i}}}" for i in range(15)]
+    unformatted = "echo " + " ".join(var_list) + "\n"
+    expected = "echo " + " ".join(var_list) + "\n"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_shell_formatter.py` at line 110, The test creates a list named
`vars` which shadows the built-in vars() function; rename that variable (e.g.,
to `var_list` or `variables`) everywhere it is used in the test (the list
comprehension that currently reads `vars = [f"{{var{i}}}" for i in range(15)]`
and any subsequent references) so the builtin is not shadowed and intent is
clearer.
snakefmt/shell_formatter.py (1)

56-56: ⚡ Quick win

Consider adding strict=True to zip() for defensive programming.

Adding strict=True would catch any bugs where tokens and originals have mismatched lengths, making debugging easier if the masking invariant is ever violated.

♻️ Proposed fix
-    for token, original in zip(reversed(tokens), reversed(originals)):
+    for token, original in zip(reversed(tokens), reversed(originals), strict=True):
         code = code.replace(token, original)
     return code
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@snakefmt/shell_formatter.py` at line 56, The loop using zip(reversed(tokens),
reversed(originals)) should use strict=True to detect mismatched lengths early;
change the iteration to zip(reversed(tokens), reversed(originals), strict=True)
so any length invariant violations between tokens and originals raise
immediately (refer to the reversed(tokens)/reversed(originals) zip in
snakefmt/shell_formatter.py and the variables tokens and originals).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/test_shell_formatter.py`:
- Line 240: The test's long string literal assigned to expected exceeds the
88-char limit; shorten the line by splitting the string into multiple shorter
string literals joined implicitly inside parentheses or using concatenation so
the assignment to expected (the long triple-quoted shell snippet) is broken
across lines and each line stays under 88 chars; update the test value in
tests/test_shell_formatter.py so the content is unchanged but the source line
lengths comply with flake8.

---

Nitpick comments:
In `@snakefmt/shell_formatter.py`:
- Line 56: The loop using zip(reversed(tokens), reversed(originals)) should use
strict=True to detect mismatched lengths early; change the iteration to
zip(reversed(tokens), reversed(originals), strict=True) so any length invariant
violations between tokens and originals raise immediately (refer to the
reversed(tokens)/reversed(originals) zip in snakefmt/shell_formatter.py and the
variables tokens and originals).

In `@tests/test_shell_formatter.py`:
- Line 110: The test creates a list named `vars` which shadows the built-in
vars() function; rename that variable (e.g., to `var_list` or `variables`)
everywhere it is used in the test (the list comprehension that currently reads
`vars = [f"{{var{i}}}" for i in range(15)]` and any subsequent references) so
the builtin is not shadowed and intent is clearer.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 88a4d78c-3e49-40eb-800d-f8c58d08c9d8

📥 Commits

Reviewing files that changed from the base of the PR and between c39c06c and e10bcbc.

📒 Files selected for processing (3)
  • README.md
  • snakefmt/shell_formatter.py
  • tests/test_shell_formatter.py

Comment thread tests/test_shell_formatter.py Outdated
@mbhall88
Copy link
Copy Markdown
Member Author

As always, great review @Hocnonsense. I think I have addressed these valid edge cases. But if I have missed anything I am happy to continue iterating on fixes. I want this feature to be well tested before it is released into the wild. Get as pedantic as you want.

@mbhall88 mbhall88 requested a review from Hocnonsense May 14, 2026 22:54
Copy link
Copy Markdown
Contributor

@Hocnonsense Hocnonsense left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is how snakefmt run on my snakefile, and I've seen some point:

  1. without # fmt: off, shfmt can fail:
    $ uv run snakefmt -d gene_arg.smk
    subprocess.CalledProcessError: Command '['shfmt', '-i', '4', '-ci', '-bn']' returned non-zero exit status 1.
    
    The above exception was the direct cause of the following exception:
    
    snakefmt.exceptions.InvalidShell: Invalid shell code. shfmt error: <standard input>:18:16: unclosed here-document `!EOF!`
    
  2. indent of the docstring changes (probably out of scope).
  3. code should be formatted to run full tests

$ uv run snakefmt -d gene_arg.smk
=====> Diff for gene_arg.smk <=====

  rule fq_ARGs_OAP:
      input:
          r1="{any}_1.fq.gz",
          r2="{any}_2.fq.gz",
      output:
          tsv=["{any}-args_oap" f"_{i}.tsv" for i in ["type", "subtype", "gene"]],
      shadow:
          "shallow"
      conda:
          "gene_arg.yaml"
      threads: 8
      params:
          suffix="fq",  # suffix should exclude prefix "." and can ignore suffix ".gz"
          data_type=["type", "subtype", "gene"],
          outfolder="{any}-args_oap",
      # fmt: off
      shell:
          """
          rm -f smk-args_oap
          mkdir smk-args_oap/{{input,output}} -p
  
          ln -rs {input} smk-args_oap/input
  
          args_oap stage_one \
              -i smk-args_oap/input -o smk-args_oap/output \
              -f {params.suffix} -t {threads}
          args_oap stage_two \
              -i smk-args_oap/output -t {threads}
  
          if [ -f {params.outfolder} ]; then
              mv {params.outfolder} smk-args_oap/old_output
          fi
          mv smk-args_oap/output {params.outfolder}
  
          python <<!EOF!
          \nif True:
  
          from pathlib import Path
          import pandas as pd
  
          txt_files = (
              pd.Series(Path("{params.outfolder}").rglob("*.txt"))
              .rename("path")
              .to_frame()
              .assign(
                  Name=lambda df: df["path"].apply(lambda p: p.name),
                  Index=lambda df: df["Name"].apply(lambda p: p.rsplit(".", 2)[0]),
                  Type=lambda df: df["Name"].apply(lambda p: p.rsplit(".", 2)[1]),
              )
          )
          out_types = "{params.data_type}".split()
          metadata = (
              txt_files.pipe(lambda df: df[df["Type"].apply(lambda t: t in out_types)])
              .pivot_table(values="path", index="Index", columns="Type", aggfunc=set)
              .explode(out_types)
              .dropna()
          )
          for data_type, outfile in zip(out_types, "{output.tsv}".split()):
              concat: pd.DataFrame = pd.concat(
                  metadata[data_type]
                  .apply(lambda x: pd.read_csv(x, sep="\t").set_index(data_type))
                  .to_dict()
              )
              norm = (
                  concat.reset_index()
                  .melt(id_vars=["level_0", data_type], var_name="Sample")
                  .pivot_table(
                      values="value", index=[data_type, "Sample"], columns="level_0"
                  )
              )
              norm.to_csv(outfile, sep="\t")
  
          \n!EOF!
          """
  
  
  rule faa_rgi:
      """
-     [home page](https://card.mcmaster.ca/)
? ----
+ [home page](https://card.mcmaster.ca/)
-     this software can also annotate from contig, but it should call prodigal to predict itself
? ----
+ this software can also annotate from contig, but it should call prodigal to predict itself
-     """
+ 
+ """
      input:
          faa="{prefix}.faa",
      output:
          tsv="{prefix}-rgi_prot.tsv",
      shadow:
          "shallow"
      conda:
          "rgi"  # mamba create -n rgi --channel conda-forge --channel bioconda --channel defaults rgi
      threads: 8
      shell:
          """
          rm -f smk-rgi
          mkdir -p smk-rgi/
  
          rgi main \
              -i {input.faa} -o smk-rgi/output \
              -t protein \
              --clean --num_threads {threads}
          """
  
  
  rule download_ISFinder:
      output:
          faa="{database}/ISfinder-sequences/IS.faa",
          fna="{database}/ISfinder-sequences/IS.fna",
          csv="{database}/ISfinder-sequences/IS.csv",
      conda:
          "gene_arg.yaml"
      params:
          outdir="{database}",
          reposity="https://github.com/thanhleviet",
      shell:
          """
          mkdir {params.outdir} -p
          cd {params.outdir}
          /bin/rm ISfinder-sequences -r
          git clone {params.reposity}/ISfinder-sequences
          """
  
  
  rule download_VFDB_fna:
      output:
          fas="{database}/VFDB/VFDB_core.fna",
      conda:
          "gene_arg.yaml"
      params:
          outdir="{database}",
          homepage="http://www.mgc.ac.cn/VFs",
          file="VFDB_setA_nt.fas.gz",
      shell:
          """
          (
-             cd `dirname {output.fas}`
?                ^                    ^
+             cd $(dirname {output.fas})
?                ^^                    ^
              curl {params.homepage}/Down/{params.file} \
-             |gunzip -c \
+                 | gunzip -c \
? ++++             +
-             > `basename {output.fas}`
?              ^^                     ^
+                     >$(basename {output.fas})
? ++++++++             ^^                     ^
          )
          """

[INFO] All done 🎉

mbhall88 added 2 commits May 15, 2026 15:11
…ators

shfmt requires heredoc terminators at column 0 (or leading tabs for
<<-EOF). Snakemake shell blocks sometimes use escape-prefixed terminators
like \n!EOF!, which shfmt rejects as unclosed heredocs.

Pre-process masked code with _mask_heredocs before invoking shfmt:
detect the user-intended terminator permissively (accepting whitespace
and \X escape prefixes before the delimiter word), replace the body and
terminator with a placeholder, let shfmt format the surrounding shell,
then restore the original body and terminator verbatim via _unmask_heredocs.

shfmt does not reformat heredoc body content anyway, so masking has no
effect on formatting quality. Five new tests cover the reviewer's exact
case, body preservation, multiple heredocs, surrounding shell formatting,
and truly unclosed heredocs that should still raise InvalidShell.

Also updates the README with a dedicated "Heredoc handling" section.
@mbhall88
Copy link
Copy Markdown
Member Author

@Hocnonsense, snakefmt now pre-processes shell code before invoking shfmt, detecting the user-intended terminator permissively (accepting leading whitespace and \X escape sequences like \n, \t before the delimiter word), replacing the body and terminator with a placeholder, letting shfmt format the surrounding shell, then restoring the original body and terminator verbatim. This is safe because shfmt never reformats heredoc body content anyway. Though truly unclosed heredocs (no recognisable terminator anywhere) still raise InvalidShell as expected.

I have tested your exact example locally and it seems to work all okay now. Test it out and see what you think.

I also added some extra new tests with your case, body preservation, multiple heredocs, surrounding shell formatting, and the unclosed-heredoc error path.

@mbhall88 mbhall88 requested a review from Hocnonsense May 15, 2026 05:26
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/plans/pr-295-heredoc-masking.md`:
- Around line 11-19: The Markdown uses bare fenced code blocks that trigger
MD040; update the first exception block containing
"snakefmt.exceptions.InvalidShell: Invalid shell code. shfmt error:" to use a
language identifier (e.g., replace the opening ``` with ```text) and update the
second block that starts with "shell:" and contains the heredoc/`python <<!EOF!`
snippet to use a language identifier (e.g., replace the opening ``` with
```python) so both fenced blocks include explicit languages.

In `@README.md`:
- Around line 418-426: The code block showing the heredoc for the shell/python
example uses an unlabeled fenced block which triggers markdownlint MD040; update
the opening fence to include the language by changing the triple backticks to
use "python" (i.e., replace ``` with ```python) for the block that contains the
"shell:" heredoc and the "python <<!EOF!" content so the block is properly
language-labeled.

In `@snakefmt/shell_formatter.py`:
- Around line 194-199: The code currently calls literal.rstrip() before doing
_TRIPLE_QUOTE_RE.fullmatch, so the no-match branch returns the trimmed value
instead of the original; preserve the exact original input by either performing
the regex match on the unmodified literal (move the rstrip after matching) or
store the original in a temp (e.g., original = literal) then use trimmed =
literal.rstrip() for matching/processing and return original when match is None;
update the logic around literal, match, and _TRIPLE_QUOTE_RE.fullmatch
accordingly.

In `@tests/test_shell_formatter.py`:
- Around line 108-113: The test function test_format_shell_code_many_variables
shadows the builtin vars by using the variable name vars; rename that variable
(e.g., to variables or vars_list) and update all uses within the function (the
list comprehension, unformatted, and expected constructions) to the new name so
Ruff A001 is resolved and behavior remains unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5bccfc87-7df3-4521-ac98-dc6339ac8e45

📥 Commits

Reviewing files that changed from the base of the PR and between e10bcbc and 2ebe878.

📒 Files selected for processing (4)
  • README.md
  • docs/plans/pr-295-heredoc-masking.md
  • snakefmt/shell_formatter.py
  • tests/test_shell_formatter.py

Comment thread docs/plans/pr-295-heredoc-masking.md
Comment thread README.md
Comment thread snakefmt/shell_formatter.py Outdated
Comment thread tests/test_shell_formatter.py
Copy link
Copy Markdown
Contributor

@Hocnonsense Hocnonsense left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow it's indeed a really nice solution. LGTM!

Thanks for your great effort!

mbhall88 and others added 2 commits May 15, 2026 20:02
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
…s no triple-quote match

rstrip was applied before the no-match early return, so single-line
strings and other non-triple-quoted literals were returned trimmed
rather than verbatim. Store the original before rstripping and return
it on the no-match path.
@mbhall88
Copy link
Copy Markdown
Member Author

Wow it's indeed a really nice solution. LGTM!

Thanks for your great effort!

Great!

I'll wait and see if any of the others want to review. If not, I'll push this sometime next week.

@mbhall88 mbhall88 merged commit a8d131b into snakemake:master May 21, 2026
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Format shell blocks using shfmt

2 participants