-
Notifications
You must be signed in to change notification settings - Fork 38
fix: time conversion for efficiency reports with jobs taking longer than 1 day #362
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughReplaces manual colon-splitting with iterative datetime-format parsing in time_to_seconds to handle SLURM sacct variants (days, HH:MM:SS, MM:SS, SS, fractional seconds), trimming whitespace and returning 0 for empty/NA/invalid inputs; adds tests covering these formats and edge cases. Changes
Sequence Diagram(s)sequenceDiagram
participant Test as tests/tests.py
participant Eff as efficiency_report.time_to_seconds
participant Parser as datetime.strptime
Test->>Eff: call time_to_seconds(input_str)
Eff->>Eff: trim & normalize input
alt input empty / NA / invalid
Eff-->>Test: return 0
else
loop try candidate formats
Eff->>Parser: attempt parse with format (D-HH:MM:SS(.ffffff) / HH:MM:SS(.ffffff) / MM:SS(.ffffff) / SS(.ffffff))
Parser-->>Eff: parsed components or error
end
alt parsed
Eff->>Eff: compute total_seconds = days*86400 + h*3600 + m*60 + s + fractional
Eff-->>Test: return total_seconds
else
Eff-->>Test: return 0
end
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related issues
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/tests.py (1)
156-162
: Simplify NA test; avoid pd.NaType()pd.NaType isn’t part of the public API and this branch is redundant with the prior pd.NA check.
- assert ( - time_to_seconds(pd.NaType()) == 0 if hasattr(pd, "NaType") else True - ) # Skip if not available + # Already covered by the pd.NA assertion above; no extra check needed.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
snakemake_executor_plugin_slurm/efficiency_report.py
(1 hunks)tests/tests.py
(2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-24T12:41:31.837Z
Learnt from: cmeesters
PR: snakemake/snakemake-executor-plugin-slurm#221
File: tests/tests.py:62-70
Timestamp: 2025-06-24T12:41:31.837Z
Learning: In the snakemake-executor-plugin-slurm efficiency report feature, the executor is responsible for creating the efficiency_report_path directory when generating efficiency reports. Tests should not manually check for directory existence but should rely on the executor to properly create necessary directories as part of its normal operation.
Applied to files:
tests/tests.py
🧬 Code graph analysis (1)
tests/tests.py (1)
snakemake_executor_plugin_slurm/efficiency_report.py (2)
parse_sacct_data
(115-221)time_to_seconds
(11-59)
🪛 Ruff (0.13.3)
snakemake_executor_plugin_slurm/efficiency_report.py
57-57: Consider moving this statement to an else
block
(TRY300)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: testing
🔇 Additional comments (3)
snakemake_executor_plugin_slurm/efficiency_report.py (1)
6-9
: Imports look gooddatetime and os are used; fine.
tests/tests.py (2)
12-15
: Good import additionsBrings time_to_seconds into scope; pandas used in tests.
66-176
: Excellent coverage for sacct time formatsCovers days, HH:MM:SS, MM:SS, seconds-only, fractional seconds, whitespace, invalid/NA inputs, and large-day edge cases. These assert the intended behavior and will pass with a parser that doesn’t rely on datetime limits.
Please re-run the suite after adopting the proposed parser to ensure all new cases pass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
tests/tests.py (1)
165-167
: Simplify the conditional logic.The conditional expression is unnecessarily complex.
pd.NaType()
creates an instance ofpd.NA
's type, which is redundant with the test on line 164 that already testspd.NA
.Consider simplifying to:
- assert ( - time_to_seconds(pd.NaType()) == 0 if hasattr(pd, "NaType") else True - ) # Skip if not available + # pd.NA is already tested above; NaType() creates same type instanceOr if you want to keep the test, make the logic clearer:
- assert ( - time_to_seconds(pd.NaType()) == 0 if hasattr(pd, "NaType") else True - ) # Skip if not available + if hasattr(pd, "NaType"): + assert time_to_seconds(pd.NaType()) == 0
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tests/tests.py
(2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-24T12:41:31.837Z
Learnt from: cmeesters
PR: snakemake/snakemake-executor-plugin-slurm#221
File: tests/tests.py:62-70
Timestamp: 2025-06-24T12:41:31.837Z
Learning: In the snakemake-executor-plugin-slurm efficiency report feature, the executor is responsible for creating the efficiency_report_path directory when generating efficiency reports. Tests should not manually check for directory existence but should rely on the executor to properly create necessary directories as part of its normal operation.
Applied to files:
tests/tests.py
🧬 Code graph analysis (1)
tests/tests.py (1)
snakemake_executor_plugin_slurm/efficiency_report.py (2)
parse_sacct_data
(115-221)time_to_seconds
(11-59)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: testing
🔇 Additional comments (4)
tests/tests.py (4)
12-15
: LGTM!The new imports are well-organized and necessary for the comprehensive testing of the
time_to_seconds
function and pandas NA value handling.Also applies to: 19-19
66-181
: Excellent comprehensive test coverage.The
TestTimeToSeconds
class provides thorough coverage of SLURM sacct time formats including:
- Multiple day formats (single/double digit)
- Hours, minutes, seconds in various combinations
- Fractional seconds for TotalCPU format
- Edge cases (empty strings, None, pandas NA, invalid inputs)
- Whitespace handling
- Real-world examples from actual sacct output
The test structure is well-organized with clear method names and helpful docstrings.
175-176
: datetime.strptime supports non-zero-padded times
Verified that both "1:2:3" and "01:02:03" parse correctly with "%H:%M:%S", so no changes needed.
106-107
: Import error: tests cannot run due to missingsnakemake
module. Please installsnakemake
or adjustPYTHONPATH
to include the plugin’s package, then rerunpytest tests/tests.py::TestTimeToSeconds::test_totalcpu_format_with_days
to confirm whether the"1-30:45"
format is supported.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
snakemake_executor_plugin_slurm/efficiency_report.py (1)
35-46
: Remove duplicate format entry.Line 43 duplicates line 41: both are
"%M:%S.%f"
. Python'sstrptime
handles both zero-padded ("01:30.5") and non-padded ("1:30.5") inputs with the same format string, so the duplicate entry is unnecessary.Apply this diff to remove the redundant line:
"%M:%S.%f", # MM:SS.ffffff (with fractional seconds) "%M:%S", # MM:SS - "%M:%S.%f", # M:SS.ffffff (with fractional seconds) "%S.%f", # SS.ffffff (with fractional seconds)
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
snakemake_executor_plugin_slurm/efficiency_report.py
(1 hunks)tests/tests.py
(2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-24T12:41:31.837Z
Learnt from: cmeesters
PR: snakemake/snakemake-executor-plugin-slurm#221
File: tests/tests.py:62-70
Timestamp: 2025-06-24T12:41:31.837Z
Learning: In the snakemake-executor-plugin-slurm efficiency report feature, the executor is responsible for creating the efficiency_report_path directory when generating efficiency reports. Tests should not manually check for directory existence but should rely on the executor to properly create necessary directories as part of its normal operation.
Applied to files:
tests/tests.py
🧬 Code graph analysis (1)
tests/tests.py (1)
snakemake_executor_plugin_slurm/efficiency_report.py (2)
parse_sacct_data
(122-228)time_to_seconds
(11-66)
🪛 Ruff (0.13.3)
snakemake_executor_plugin_slurm/efficiency_report.py
63-63: Consider moving this statement to an else
block
(TRY300)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: testing
🔇 Additional comments (4)
tests/tests.py (2)
12-19
: LGTM! Imports support the new test suite.The additions of
time_to_seconds
andpandas
imports are appropriate for the comprehensive testing of SLURM time format parsing.
66-175
: Excellent test coverage for SLURM time parsing.The test suite comprehensively covers:
- Day formats (D-HH:MM:SS, DD-HH:MM:SS) for jobs longer than 1 day
- Fractional seconds for TotalCPU accuracy
- Various HH:MM:SS, MM:SS, and seconds-only formats
- Real-world sacct examples from actual output
- Edge cases (empty, NA, invalid, whitespace)
This thoroughly validates the fix for issue #361 and addresses the PR objectives.
snakemake_executor_plugin_slurm/efficiency_report.py (2)
6-8
: LGTM! Import additions support the new implementation.The
datetime
import enables robust time parsing withstrptime
, andos
supports the debug logging at line 260.
11-66
: Robust implementation correctly handles SLURM time formats.The datetime-based parsing approach successfully addresses the PR objectives:
- Handles jobs longer than 1 day (D-HH:MM:SS format)
- Accurately parses fractional seconds (TotalCPU)
- Falls back to 0 for invalid inputs
- Supports various format combinations (days, hours, minutes, seconds)
The implementation is appropriate for the HPC use case where jobs run ≤31 days (per maintainer's confirmation in past review). The comprehensive test suite in
tests/tests.py
validates correctness across all supported formats.Note: Ruff flags TRY300 (return in try block) as a style preference. This is informational only and was acknowledged in the past review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
snakemake_executor_plugin_slurm/efficiency_report.py
(1 hunks)tests/tests.py
(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/tests.py
🧰 Additional context used
🪛 Ruff (0.13.3)
snakemake_executor_plugin_slurm/efficiency_report.py
64-64: Consider moving this statement to an else
block
(TRY300)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: testing
🔇 Additional comments (4)
snakemake_executor_plugin_slurm/efficiency_report.py (4)
6-6
: LGTM! Necessary imports for the updated parsing logic.The
datetime
import is required for the newstrptime
-based time parsing, and theos
import repositioning maintains clean organization.Also applies to: 8-8
11-24
: LGTM! Comprehensive documentation with clear examples.The expanded docstring effectively documents the SLURM time format variants and provides accurate examples that clarify the expected behavior.
25-30
: LGTM! Robust input validation.The validation correctly handles NA values, empty strings, and explicitly invalid strings, returning 0 as a safe default.
49-67
: LGTM! Correct parsing logic for SLURM time formats.The implementation correctly:
- Tries multiple formats in appropriate order
- Computes total seconds including fractional seconds
- Handles day-prefixed formats by adding
day * 86400
- Returns 0 as fallback for unparseable inputs
Note: Ruff flags TRY300 (return inside try) as a style preference, but this was previously discussed and acknowledged as acceptable.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
snakemake_executor_plugin_slurm/efficiency_report.py (1)
11-67
: Excellent implementation of SLURM time format parsing!The datetime-based approach correctly handles all documented SLURM sacct variants (days, hours, minutes, seconds, fractional seconds). Logic verified for the docstring examples:
- "1-12:30:45" → 131445s ✓
- "23:59:59" → 86399s ✓
- "45:30" → 2730s ✓
- "30.5" → 30.5s ✓
Input validation (NA, empty, "invalid") and fallback to 0 for unparseable inputs are appropriate.
Optional: Address Ruff TRY300 style suggestion.
The return statement at line 63 inside the
try
block can be moved to anelse
block for cleaner exception handling. This is a style preference, not a correctness issue:for fmt in time_formats: try: time_obj = datetime.strptime(time_str, fmt) + except ValueError: + continue + else: total_seconds = ( time_obj.hour * 3600 + time_obj.minute * 60 + time_obj.second + time_obj.microsecond / 1000000 ) - # Add days if present (datetime treats day 1 as the first day) + # Add days if present if fmt.startswith("%d-"): total_seconds += time_obj.day * 86400 return total_seconds - except ValueError: - continue return 0Minor clarity improvement: The comment "datetime treats day 1 as the first day" could be simplified to just "Add days if present" since the code correctly multiplies day value by 86400 regardless of datetime's internal month-day semantics.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
snakemake_executor_plugin_slurm/efficiency_report.py
(1 hunks)
🧰 Additional context used
🪛 Ruff (0.13.3)
snakemake_executor_plugin_slurm/efficiency_report.py
63-63: Consider moving this statement to an else
block
(TRY300)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: testing
🔇 Additional comments (1)
snakemake_executor_plugin_slurm/efficiency_report.py (1)
6-8
: LGTM!The
datetime
import is correctly added for the enhanced time parsing, and the import reordering has no functional impact.
based on PR #360 and issue #361:
The current time conversion for the efficiency report does not consider long jobs, where
sacct
output may be in the D-HH:MM:SS format. Also, the conversion of fractional seconds was not taken into account. While the latter is actually ridiculously playing with accuracy, the HPC world sometimes is concerned about it.Included test cases.
This PR is based upon #360 and might replace it.
Summary by CodeRabbit
Bug Fixes
Tests