Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: ensure FileBasedVariantLookup is used as a context manager #71

Merged
merged 10 commits into from
Nov 13, 2024

Conversation

clintval
Copy link
Member

@clintval clintval commented Oct 10, 2024

The FileBasedVariantLookup opens one or more pysam.VariantFile objects but provides no public API for closing them. The file-based lookup should have a public method for closing the file handles and should also support use as a context manager so there are safe ways to cleanup IO resources.

Closes: #27

Copy link

codecov bot commented Oct 10, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.65%. Comparing base (8cbc1a2) to head (b82e455).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #71      +/-   ##
==========================================
+ Coverage   96.63%   96.65%   +0.01%     
==========================================
  Files          26       26              
  Lines        1696     1706      +10     
  Branches      330      331       +1     
==========================================
+ Hits         1639     1649      +10     
  Misses         31       31              
  Partials       26       26              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

prymer/api/variant_lookup.py Outdated Show resolved Hide resolved
tests/api/test_variant_lookup.py Outdated Show resolved Hide resolved
@@ -35,7 +35,6 @@ jobs:
- name: Set up miniconda
uses: conda-incubator/setup-miniconda@v3
with:
miniforge-variant: Mambaforge
Copy link
Member Author

@clintval clintval Oct 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is necessary now, since support has dropped fully:

Also implemented in:

@clintval
Copy link
Member Author

@coderabbitai full review

Copy link

coderabbitai bot commented Oct 15, 2024

✅ Actions performed

Full review triggered.

Copy link

coderabbitai bot commented Oct 15, 2024

Walkthrough

The pull request introduces enhancements to the VariantLookup module by modifying the FileBasedVariantLookup class to implement the ContextManager interface. This allows instances to be used within a with statement, ensuring proper closure of VCF file handles. The __exit__ method is added for cleanup, and a close method is introduced for explicitly closing file handles. Additionally, the docstrings for FileBasedVariantLookup and the disk_based function are updated to include usage examples with context managers. Import statements are also adjusted to include necessary types for the new functionality. The test suite is updated to utilize context managers for resource management in tests related to variant lookups, enhancing clarity and maintainability. Minor formatting changes are made in the test files, particularly in error handling sections, without altering the underlying logic or assertions. Overall, the changes focus on improving resource management and usability in both the implementation and testing of the VariantLookup module.

Assessment against linked issues

Objective Addressed Explanation
Update GitHub Action tests to remove deprecated Mambaforge (#27) The pull request does not address the issue related to GitHub Action tests.

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Outside diff range and nitpick comments (2)
tests/api/test_variant_lookup.py (2)

449-457: LGTM! Consider using a context manager for consistency.

This test function effectively checks the logging behavior of VariantOverlapDetector. The test correctly verifies that the expected debug message is logged when querying a non-existent region.

For consistency with the FileBasedVariantLookup test, consider using a context manager for VariantOverlapDetector if it supports it. This would ensure proper resource management and maintain a consistent style across similar tests.


Line range hint 480-507: LGTM! Good test design with randomization and proper resource management.

This test function effectively checks the logging behavior when a chromosome is missing from one of the VCFs. The use of a context manager for FileBasedVariantLookup ensures proper resource management. The shuffling of VCF paths is an excellent practice to ensure the test is not order-dependent and is more robust.

Regarding the existing comment about moving assertions outside the context manager:

with FileBasedVariantLookup(
    vcf_paths=vcf_paths, min_maf=0.00, include_missing_mafs=True
) as variant_lookup:
    variants_of_interest = variant_lookup.query(
        refname="chr2", start=7999, end=9900
    )
assert len(variants_of_interest) == 14
expected_error_msg = "does not contain chromosome"
assert expected_error_msg in caplog.text

The suggestion to move the assertions outside the context manager is valid and can be implemented. This change would not affect the test's functionality and might slightly improve readability. However, the current implementation is also correct and doesn't pose any issues.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 93ea6e7 and ae2ee23.

📒 Files selected for processing (3)
  • .github/workflows/tests.yml (0 hunks)
  • prymer/api/variant_lookup.py (4 hunks)
  • tests/api/test_variant_lookup.py (5 hunks)
💤 Files with no reviewable changes (1)
  • .github/workflows/tests.yml
🧰 Additional context used
🔇 Additional comments (3)
tests/api/test_variant_lookup.py (3)

437-447: LGTM! Good use of context manager and logging capture.

This test function effectively checks the logging behavior of FileBasedVariantLookup. The use of a context manager for FileBasedVariantLookup ensures proper resource management, which is a good practice. The test correctly verifies that the expected debug message is logged when querying a non-existent region.


Line range hint 460-467: LGTM! Good error handling and resource management.

This test function effectively checks the error handling for missing index files in both disk_based and cached methods. The use of pytest.raises to check for expected exceptions is appropriate. The closing context manager for disk_based ensures proper resource management, which is a good practice.


470-477: LGTM! Consistent error handling and resource management.

This test function effectively checks the error handling for empty VCF paths in both disk_based and cached methods. The use of pytest.raises to check for expected exceptions is appropriate. The closing context manager for disk_based ensures proper resource management, maintaining consistency with the previous test.

prymer/api/variant_lookup.py Outdated Show resolved Hide resolved
prymer/api/variant_lookup.py Outdated Show resolved Hide resolved
prymer/api/variant_lookup.py Outdated Show resolved Hide resolved
prymer/api/variant_lookup.py Outdated Show resolved Hide resolved
prymer/api/variant_lookup.py Outdated Show resolved Hide resolved
@tfenne tfenne assigned clintval and unassigned tfenne Oct 16, 2024
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (1)
prymer/api/variant_lookup.py (1)

325-328: Consider adding a comment for the empty close() method

The close() method is currently empty. If this is intentional (e.g., to be overridden by subclasses), consider adding a comment to explain this design choice.

🧰 Tools
🪛 GitHub Check: codecov/patch

[warning] 327-327: prymer/api/variant_lookup.py#L327
Added line #L327 was not covered by tests

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between ae2ee23 and 3e6ad5a.

📒 Files selected for processing (2)
  • prymer/api/variant_lookup.py (6 hunks)
  • tests/api/test_variant_lookup.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/api/test_variant_lookup.py
🧰 Additional context used
🪛 GitHub Check: codecov/patch
prymer/api/variant_lookup.py

[warning] 327-327: prymer/api/variant_lookup.py#L327
Added line #L327 was not covered by tests

🔇 Additional comments (5)
prymer/api/variant_lookup.py (5)

62-62: LGTM: Import changes look good

The new imports (AbstractContextManager, TracebackType, and override) are appropriate for implementing context management and type checking. These additions align well with the changes made in the classes.

Also applies to: 68-68, 79-79


239-239: LGTM: Context management added to VariantLookup

The addition of AbstractContextManager to the base class is a good design choice, allowing all subclasses to be used as context managers.


342-354: LGTM: Updated docstring with context management example

The updated docstring with an example using context management is helpful and clearly demonstrates the new feature.


480-491: LGTM: Updated docstring with context management example

The updated docstring for the disk_based function includes a clear example of using the FileBasedVariantLookup as a context manager, which is consistent with the changes made to the class.


384-389: LGTM: Proper implementation of close() method

The close() method correctly closes all file handles, ensuring proper resource management.

To ensure test coverage for the new close() method, run the following script:

✅ Verification successful

Verified: close() method is properly tested via context managers

  • Tests in tests/api/test_variant_lookup.py utilize FileBasedVariantLookup within with statements, ensuring that close() is invoked correctly.
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check test coverage for FileBasedVariantLookup.close() method

# Test: Search for test cases covering the close() method
rg --type python -g 'test_*.py' 'def test.*close.*FileBasedVariantLookup'

Length of output: 106


Script:

#!/bin/bash
# Description: Check test coverage for FileBasedVariantLookup.close() method

# Test: Search for test cases covering the close() method
rg -t py -g 'test_*.py' 'def test.*close.*FileBasedVariantLookup'

Length of output: 67


Script:

#!/bin/bash
# Description: Check for usage of FileBasedVariantLookup.close() in test files

# Test: Search for 'FileBasedVariantLookup.close()' usage in test_*.py files
rg -t py -g 'test_*.py' 'FileBasedVariantLookup\.close\('

Length of output: 59


Script:

#!/bin/bash
# Description: Search for any usage of FileBasedVariantLookup in test files

# Test: Find all instances of FileBasedVariantLookup in test_*.py files
rg -t py -g 'test_*.py' 'FileBasedVariantLookup'

Length of output: 456

prymer/api/variant_lookup.py Outdated Show resolved Hide resolved
@clintval clintval requested a review from tfenne October 16, 2024 17:16
@clintval clintval assigned tfenne and unassigned clintval Oct 16, 2024
@msto msto removed their assignment Oct 21, 2024
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (2)
tests/api/test_variant_lookup.py (2)

448-455: Consider consolidating duplicate test logic.

Both test functions share similar setup and assertions. Consider using a parameterized test to reduce code duplication.

-def test_simple_variant_conversion_logs_file_based(
-    vcf_path: Path, caplog: pytest.LogCaptureFixture
-) -> None:
-    caplog.set_level(logging.DEBUG)
-    with FileBasedVariantLookup(
-        vcf_paths=[vcf_path], min_maf=0.01, include_missing_mafs=False
-    ) as variant_lookup:
-        variant_lookup.query(refname="foo", start=1, end=2)
-        assert "No variants extracted from region of interest" in caplog.text
-
-def test_simple_variant_conversion_logs_non_file_based(
-    vcf_path: Path, caplog: pytest.LogCaptureFixture
-) -> None:
-    caplog.set_level(logging.DEBUG)
-    variant_lookup = VariantOverlapDetector(
-        vcf_paths=[vcf_path], min_maf=0.01, include_missing_mafs=False
-    )
-    variant_lookup.query(refname="foo", start=1, end=2)
-    assert "No variants extracted from region of interest" in caplog.text
+@pytest.mark.parametrize(
+    "lookup_class,context_manager",
+    [
+        (FileBasedVariantLookup, True),
+        (VariantOverlapDetector, False),
+    ],
+)
+def test_simple_variant_conversion_logs(
+    vcf_path: Path,
+    caplog: pytest.LogCaptureFixture,
+    lookup_class: Type,
+    context_manager: bool,
+) -> None:
+    caplog.set_level(logging.DEBUG)
+    kwargs = {"vcf_paths": [vcf_path], "min_maf": 0.01, "include_missing_mafs": False}
+    
+    if context_manager:
+        with lookup_class(**kwargs) as variant_lookup:
+            variant_lookup.query(refname="foo", start=1, end=2)
+    else:
+        variant_lookup = lookup_class(**kwargs)
+        variant_lookup.query(refname="foo", start=1, end=2)
+    
+    assert "No variants extracted from region of interest" in caplog.text

601-613: Improve code comment clarity.

The inline comment about "common-mixed-2/2" could be clearer about the VCF format specifics.

-                end=9100,  # while "common-mixed-2/2" starts at 9101, in the VCf is starts at 9100
+                end=9100,  # Note: In VCF format, "common-mixed-2/2" is part of a multi-allelic variant at position 9100
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 909a4b5 and bd8313c.

📒 Files selected for processing (3)
  • prymer/api/variant_lookup.py (4 hunks)
  • tests/api/test_picking.py (1 hunks)
  • tests/api/test_variant_lookup.py (4 hunks)
✅ Files skipped from review due to trivial changes (1)
  • tests/api/test_picking.py
🔇 Additional comments (6)
prymer/api/variant_lookup.py (3)

67-68: LGTM: Import statements are appropriate.

The added imports for TracebackType and ContextManager are necessary for implementing the context manager protocol.


368-371: LGTM: Resource cleanup is properly implemented.

The close() method correctly closes all underlying VCF file handles.


330-338: LGTM: Documentation is clear and comprehensive.

The updated docstrings provide clear examples of using the context manager functionality, which will help users understand the proper usage pattern.

Also applies to: 474-485

tests/api/test_variant_lookup.py (3)

436-446: LGTM: Clean test split for file-based variant lookup.

The test function properly uses the context manager pattern and has clear assertions.


463-464: LGTM: Proper context manager usage in error tests.

The error handling tests correctly use context managers and maintain the same assertions.

Also applies to: 472-473


494-499: Consider moving assertions outside context manager.

As suggested in the past review comments, consider moving assertions outside the context manager scope for better readability and to follow the test's Arrange-Act-Assert pattern.

prymer/api/variant_lookup.py Show resolved Hide resolved
prymer/api/variant_lookup.py Outdated Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (1)
prymer/api/variant_lookup.py (1)

364-368: Remove redundant return None statement in __exit__ method

Explicitly returning None at the end of the __exit__ method is unnecessary because Python functions return None by default if no return statement is present.

Apply this diff to simplify the method:

 def __exit__(
     self,
     exc_type: Optional[type[BaseException]],
     exc_value: Optional[BaseException],
     traceback: Optional[TracebackType],
 ) -> None:
     """Exit this context manager while closing the underlying VCF handles."""
     self.close()
-    return None
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between bd8313c and b82e455.

📒 Files selected for processing (1)
  • prymer/api/variant_lookup.py (5 hunks)
🔇 Additional comments (5)
prymer/api/variant_lookup.py (5)

67-68: Imports added support context management functionality

The addition of TracebackType and ContextManager imports is appropriate and necessary for implementing the context management protocol in FileBasedVariantLookup.


325-325: Class inheritance is appropriate for context management

Inheriting from ContextManager clearly indicates that FileBasedVariantLookup implements the context management protocol, enhancing readability and type hinting.


356-369: Context manager methods are correctly implemented

The __enter__ and __exit__ methods are properly defined, allowing FileBasedVariantLookup instances to be used with the with statement for safer resource management.


382-385: close method correctly releases resources

The close method properly iterates over all VCF file handles and closes them, ensuring that file descriptors are released and preventing potential resource leaks.


477-488: Docstring example is updated for context management

The example in the disk_based function's docstring demonstrates the correct usage of the context manager, which enhances clarity for users on how to safely handle resources.

Copy link
Member

@tfenne tfenne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a step in the right direction. I still think we need to refactor at some point so that if you have a VariantLookup you can call close() on it.

@clintval clintval merged commit 7e2e784 into main Nov 13, 2024
7 checks passed
@clintval clintval deleted the fix/vcf_context_managers branch November 13, 2024 23:39
@msto
Copy link
Collaborator

msto commented Nov 14, 2024

@tfenne so we don't lose track of it, I've mentioned your last comment in a new issue related to resolving some of the discussion that occurred on this PR.

#88

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

GitHub Action tests are using deprecated Mambaforge
3 participants