Integrate tests into docker workflows #366
Walkthrough

A dedicated testing phase was introduced into the project's workflow and documentation.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Developer
    participant GitHub Actions
    participant Makefile
    participant Docker Backend
    participant Pytest
    Developer->>GitHub Actions: Open PR or push code
    GitHub Actions->>Makefile: Run build
    GitHub Actions->>Makefile: Run tests (make tests)
    Makefile->>Docker Backend: Start container with latest backend image
    Docker Backend->>Pytest: Run pytest in quiet mode
    Pytest-->>Docker Backend: Report test results
    Docker Backend-->>Makefile: Return test status
    Makefile-->>GitHub Actions: Return test status
    GitHub Actions-->>Developer: Show build and test results
```
🎉 Snyk checks have passed. ✅ The security/snyk check is complete; no issues have been found so far.
Actionable comments posted: 1
🔭 Outside diff range comments (1)
tests/test_recipes.py (1)
33-46: 🛠️ Refactor suggestion: Consider creating separate tests for each method and test more edge cases.
The test for internal Recipe methods combines testing of `internal_fillna` and `internal_keep` in a single test function. Additionally, the test is tightly coupled to the Recipe class implementation. I recommend:
- Split into separate test functions for each method
- Test more edge cases (all NA values, empty DataFrames)
- Add more descriptive assertions for better failure diagnostics
```diff
-def test_internal_fillna_and_keep():
+def test_internal_fillna():
     df = pd.DataFrame({'A': [1, None], 'B': [None, 'x']})
     r_fill = recipes.Recipe.__new__(recipes.Recipe)
     r_fill.args = [{'A': 0, 'B': ''}]
     filled = recipes.Recipe.internal_fillna(r_fill, df.copy())
     assert filled['A'].tolist() == [1, 0]
     assert filled['B'].tolist() == ['', 'x']
+
+    # Test edge case: all NA values
+    df_all_na = pd.DataFrame({'A': [None, None], 'B': [None, None]})
+    filled_all_na = recipes.Recipe.internal_fillna(r_fill, df_all_na.copy())
+    assert filled_all_na['A'].tolist() == [0, 0]
+    assert filled_all_na['B'].tolist() == ['', '']
+
+def test_internal_keep():
+    df = pd.DataFrame({'A': [1, None], 'B': [None, 'x'], 'C': [3, 4]})
     r_keep = recipes.Recipe.__new__(recipes.Recipe)
     r_keep.args = {'select': ['A']}
     recipes.Recipe.select_columns(r_keep, df.copy())
     kept = recipes.Recipe.internal_keep(r_keep, df.copy())
     assert list(kept.columns) == ['A']
+
+    # Test selecting multiple columns
+    r_keep_multi = recipes.Recipe.__new__(recipes.Recipe)
+    r_keep_multi.args = {'select': ['A', 'C']}
+    recipes.Recipe.select_columns(r_keep_multi, df.copy())
+    kept_multi = recipes.Recipe.internal_keep(r_keep_multi, df.copy())
+    assert list(kept_multi.columns) == ['A', 'C']
```
🧹 Nitpick comments (4)
requirements.txt (1)
28-29: Pin pytest version to avoid unexpected breaks
Adding `pytest` without any version constraint can lead to future breakages if a major release introduces breaking changes. Consider specifying a compatible version range, for example:

```diff
-pyarrow>=14.0.1
-pytest
+pyarrow>=14.0.1
+pytest>=7.0,<8.0
```

This will ensure reproducible test runs across environments.
tests/test_recipes.py (3)
10-19: Consider adding more specific error handling

While the dynamic import mechanism works well, the error handling could be more specific to provide clearer feedback about why tests are skipped.

```diff
 try:
     spec = importlib.util.spec_from_file_location('recipes', MODULE_PATH)
     recipes = importlib.util.module_from_spec(spec)
     spec.loader.exec_module(recipes)
     return recipes
 except FileNotFoundError:
-    pytest.skip(f'Could not import recipes module: {e}')
+    pytest.skip(f'Recipes module file not found at {MODULE_PATH}')
 except ImportError as e:
+    pytest.skip(f'Import error in recipes module: {e}')
+except Exception as e:
+    pytest.skip(f'Could not import recipes module: {e}')
```
22-31: Consider expanding fixed-width format test coverage

The test for fixed-width formatting is a good start but only verifies the first line of output. Consider:
- Verifying all lines in the output file to ensure multi-row writing works correctly
- Testing edge cases like empty DataFrames or very wide columns
- Testing with different column types
```diff
 def test_fwf_format_and_to_fwf(tmp_path):
     df = pd.DataFrame({'A': ['a', 'bb'], 'B': ['1', '22']})
     widths = [3, 3]
     line = recipes.fwf_format(df.iloc[0], widths)
     assert line == 'a 1 '
     outfile = tmp_path / 'out.txt'
     recipes.to_fwf(df, outfile, widths=widths, names=['A', 'B'])
     content = outfile.read_text().splitlines()
     assert content[0].strip() == line.strip()
+    # Verify second line formatting
+    line2 = recipes.fwf_format(df.iloc[1], widths)
+    assert content[1].strip() == line2.strip()
+
+    # Test with empty DataFrame
+    empty_df = pd.DataFrame(columns=['A', 'B'])
+    empty_outfile = tmp_path / 'empty_out.txt'
+    recipes.to_fwf(empty_df, empty_outfile, widths=widths, names=['A', 'B'])
+    assert empty_outfile.exists()
+    assert empty_outfile.stat().st_size == 0
```
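For readers unfamiliar with what the fixed-width helpers under test do, a minimal standalone sketch (the `pad_row` helper below is hypothetical and stands in for the project's actual `fwf_format`):

```python
def pad_row(values, widths):
    """Hypothetical sketch of fixed-width formatting: pad each value
    to its column width with spaces and concatenate the cells."""
    return ''.join(str(v).ljust(w) for v, w in zip(values, widths))

# Two columns, each 3 characters wide
print(pad_row(['a', '1'], [3, 3]))   # → 'a  1  '
print(pad_row(['bb', '22'], [3, 3]))  # → 'bb 22 '
```

Values longer than their width would overflow here; the real implementation may truncate or raise instead.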
8-8: Consider making the module path more flexible

The hardcoded module path might make tests brittle if the project structure changes.
```diff
-MODULE_PATH = Path(__file__).resolve().parents[1] / 'code' / 'recipes.py'
+# More flexible approach to find the module
+def find_module_path(module_name='recipes.py', search_dirs=['code', '.']):
+    base_dir = Path(__file__).resolve().parents[1]
+    for dir_name in search_dirs:
+        path = base_dir / dir_name / module_name
+        if path.exists():
+            return path
+    return base_dir / 'code' / module_name  # Default fallback
+
+MODULE_PATH = find_module_path()
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting
📒 Files selected for processing (6)
- .github/workflows/pull.yml (1 hunks)
- .github/workflows/push.yml (1 hunks)
- Makefile (1 hunks)
- README.md (1 hunks)
- requirements.txt (1 hunks)
- tests/test_recipes.py (1 hunks)
🧰 Additional context used
🪛 GitHub Actions: pull request
Makefile
[error] 701-701: Makefile target 'tests' failed with error code 5. No tests were run by pytest.
🔇 Additional comments (4)
.github/workflows/pull.yml (1)
20-25: Tests step integration looks correct
The new “tests” step runs `make tests` under the same branch context after the build. This correctly hooks into the existing flow. Once the Makefile target is updated to mount tests, this workflow will execute as expected.

README.md (1)
132-141: Documentation for running tests is clear
The added section accurately describes both the Makefile-based and manual methods to run the test suite. This aligns with the new workflows and Makefile target. Consider adding a badge for test status in the future.

.github/workflows/push.yml (1)
24-28: Tests step integration looks correct
The “tests” step is properly inserted between build and publish, invoking `make tests` with the correct branch context. No modifications needed here once the Makefile is fixed.

tests/test_recipes.py (1)
1-9: Good use of dynamic imports and conditional test skipping

The approach of dynamically importing the recipes module and skipping tests when dependencies aren't available is robust. This ensures tests fail gracefully in environments where prerequisites aren't met.
I particularly like the use of `pytest.importorskip` for the pandas and numpy dependencies.
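The dynamic-import pattern being praised can be reduced to a small, self-contained stdlib sketch (the throwaway module name and contents below are illustrative, not the project's files):

```python
import importlib.util
import tempfile
from pathlib import Path

def load_module(name, path):
    """Load a module directly from a file path, without requiring the
    containing directory to be a package or on sys.path."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module

# Demonstrate with a throwaway module file
with tempfile.TemporaryDirectory() as tmp:
    mod_path = Path(tmp) / 'demo_recipes.py'
    mod_path.write_text("ANSWER = 42\ndef double(x):\n    return 2 * x\n")
    demo = load_module('demo_recipes', str(mod_path))
    print(demo.ANSWER, demo.double(3))  # → 42 6
```

In the real test suite this call sits behind a `try`/`except` that turns import failures into `pytest.skip`, so missing prerequisites skip rather than fail the run.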
Makefile (outdated)

```makefile
tests: backend-build
	docker run --rm ${DOCKER_USERNAME}/${DC_PREFIX}-${APP}:latest pytest -q

.PHONY: tests
```
🛠️ Refactor suggestion
Critical: Ensure tests are mounted into the container
The tests target currently invokes pytest inside the built image, but since test files aren’t baked into that image, pytest finds no tests (exit code 5). You need to mount your tests/ folder (and any other source dirs, if required) into the container at runtime.
Apply a diff like this to include the tests directory and set the working directory:
```diff
-tests: backend-build
-	docker run --rm ${DOCKER_USERNAME}/${DC_PREFIX}-${APP}:latest pytest -q
-.PHONY: tests
+tests: backend-build
+	docker run --rm \
+		-v $(PWD)/tests:/app/tests \
+		-w /app \
+		${DOCKER_USERNAME}/${DC_PREFIX}-${APP}:latest \
+		pytest -q
+.PHONY: tests
```

This mounts tests/ into /app/tests (adjust /app to your image’s workdir) so that pytest can discover and run them.
You didn't read the Dockerfile carefully; the tests are indeed mounted in it.
@rhanka You're right, and I apologize for the misunderstanding. After reviewing the AI summary of the Dockerfile changes, I see that the tests directory is indeed:
- Already copied into the production image alongside the other directories (`code/`, `conf/`, `referential_data/`)
- Set up with a volume mount for `/tests` in both production and development stages
This makes my previous suggestion unnecessary since the tests are already available in the container. Thank you for the correction.
Actionable comments posted: 1
🧹 Nitpick comments (1)
.github/workflows/pull.yml (1)
17-19: Refactor build step for readability and maintainability
The current one-liner with an inline fallback is difficult to read and update. Consider switching to a multiline `run` block and leveraging the env-declared `GIT_BRANCH` so you don’t have to pass it on every make invocation.

```diff
-      run: make version backend-docker-check GIT_BRANCH=$GIT_BRANCH || ( make backend-build GIT_BRANCH=$GIT_BRANCH && make backend GIT_BRANCH=$GIT_BRANCH)
+      run: |
+        make version backend-docker-check || \
+          (make backend-build && make backend)
```
📒 Files selected for processing (2)
- .github/workflows/pull.yml (1 hunks)
- Makefile (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- Makefile
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: pull request test
```yaml
      - name: tests
        if: success()
        run: make tests backend-stop GIT_BRANCH=$GIT_BRANCH
```
🛠️ Refactor suggestion
Separate test execution from cleanup to ensure teardown on failures
By chaining `make tests backend-stop`, the backend will only be stopped if tests pass—failing tests will leave containers running. It’s more robust to split into two steps: one for tests and another (with `if: always()`) for cleanup.
```diff
@@ -20,5 +20,12 @@
       env:
         GIT_BRANCH: ${{ steps.extract_branch.outputs.branch }}
-      - name: tests
-        if: success()
-        run: make tests backend-stop GIT_BRANCH=$GIT_BRANCH
+      - name: Run tests
+        run: make tests GIT_BRANCH=$GIT_BRANCH
+        env:
+          GIT_BRANCH: ${{ steps.extract_branch.outputs.branch }}
+
+      - name: Teardown backend
+        if: always()
+        run: make backend-stop GIT_BRANCH=$GIT_BRANCH
+        env:
+          GIT_BRANCH: ${{ steps.extract_branch.outputs.branch }}
```
Force-pushed 940c465 to 97985ec (Compare)
Actionable comments posted: 1
🧹 Nitpick comments (2)
docker-compose-dev.yml (1)
11-11: Mount tests directory for backend service
This line exposes your test suite to the container so you can run `pytest` inside the dev environment. Ensure that both `BACKEND` and `APP_GROUP` are defined (e.g., in a `.env` file) and that the container path (`/${APP_GROUP}/tests/`) matches the working directory used in your Dockerfile. Consider consolidating on a single variable name (such as `APP`) across Dockerfile and Compose to avoid path mismatches.

Dockerfile (1)
73-73: Align volume path variable naming
In the development stage you correctly use `${APP}` for the tests volume, but the production stage uses `${app_path}`. For consistency and to prevent future confusion, unify both to `${APP}`.
📒 Files selected for processing (8)
- .github/workflows/pull.yml (1 hunks)
- .github/workflows/push.yml (1 hunks)
- Dockerfile (3 hunks)
- Makefile (2 hunks)
- README.md (1 hunks)
- docker-compose-dev.yml (1 hunks)
- requirements.txt (1 hunks)
- tests/test_recipes.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
- requirements.txt
- README.md
- .github/workflows/push.yml
- .github/workflows/pull.yml
- Makefile
- tests/test_recipes.py
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: pull request test
🔇 Additional comments (2)
Dockerfile (2)
26-26: Create tests directory in build image
Adding `tests \` to the `mkdir -p` ensures that the directory exists before copying or mounting it later in the build.
44-44: Include tests in production image
This `COPY tests/ tests/` line brings your test suite into the container. Make sure the `tests/` directory is present in the build context and update your `.dockerignore` if you need to exclude other files.
```dockerfile
VOLUME /${app_path}/log
VOLUME /${app_path}/models
VOLUME /${app_path}/upload
VOLUME /${app_path}/tests
```
Undefined environment variable used for volume path
The directive uses ${app_path} which isn’t declared (you only have ARG APP / ENV APP). This will cause a build error. Update it to use ${APP}, or define app_path earlier.
```diff
-VOLUME /${app_path}/tests
+VOLUME /${APP}/tests
```
+ VOLUME /${APP}/tests🤖 Prompt for AI Agents
In Dockerfile at line 50, the VOLUME directive uses an undefined environment
variable `${app_path}`, which causes a build error. Replace `${app_path}` with
the correctly defined environment variable `${APP}` to match the existing ENV
declaration, or alternatively define `app_path` earlier in the Dockerfile before
this line.
Actionable comments posted: 0
🧹 Nitpick comments (4)
tests/test_recipes.py (4)
5-5: Remove unused `os` import

The `os` module isn’t referenced anywhere in this test file. Keeping it will keep Ruff/F401 failing and slightly slow import time.

```diff
-import os
```
48-61: Use the helper that stubs a logger to avoid `AttributeError` risk

Here (and in several following tests) the `Recipe` instance is created with `__new__` but without a `log` attribute. Many internal methods call `self.log.write` in their exception paths; if any branch triggers, the test will raise an unexpected `AttributeError`, masking the real problem. Prefer the `_recipe_with_args` helper (lines 88-93) for all tests:

```diff
-r_fill = recipes.Recipe.__new__(recipes.Recipe)
-r_fill.args = [{'A': 0, 'B': ''}]
+r_fill = _recipe_with_args([{'A': 0, 'B': ''}])
```

Apply the same pattern to the other tests that currently bypass the helper.
80-86: Strengthen the shuffle test – current assertion allows no-shuffle pass
`internal_shuffle` is supposed to randomise row order, yet the test only checks that the sets of values are identical – that will succeed even if the function returns the input unchanged. Add an assertion that the order is likely different (with a tolerance for rare collisions):

```python
orig_order = df['A'].tolist()
shuf_order = shuffled['A'].tolist()
# very low probability of equality for a sufficiently large frame
assert orig_order != shuf_order
```
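As an aside, the order-changed assertion can be made fully deterministic by seeding the RNG. A stdlib-only sketch of the idea (this does not use the project's `internal_shuffle`, just `random.shuffle` as a stand-in):

```python
import random

data = list(range(100))
shuffled = data[:]
# Fixed seed makes the shuffle reproducible across test runs
random.Random(42).shuffle(shuffled)

assert sorted(shuffled) == data  # same elements survive the shuffle
assert shuffled != data          # and the order actually changed
```

With 100 rows the probability of an identity permutation is 1/100!, so the second assertion is safe in practice; seeding removes even that residual flakiness.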
180-191: Consolidate duplicate coverage of `internal_keep`

`internal_keep` is already partially exercised in `test_internal_fillna_and_keep`. Keeping another almost-identical test increases runtime and maintenance with little extra value. Either:

- Merge the additional “where” filter assertions into the first test, or
- Rename this one to emphasise it covers the `where` clause specifically.
📒 Files selected for processing (1)
- tests/test_recipes.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
tests/test_recipes.py (1)
code/recipes.py (24)
fwf_format (77-78), to_fwf (81-108), Recipe (976-2486), internal_fillna (1622-1634), select_columns (1596-1608), internal_keep (1900-1914), internal_rename (1697-1700), internal_map (1702-1715), internal_shuffle (1717-1726), internal_to_integer (1916-1927), internal_to_float (1955-1970), internal_parsedate (2448-2459), internal_normalize (2476-2483), internal_pause (2485-2486), internal_list_to_tuple (1929-1940), internal_tuple_to_list (1942-1953), internal_sql (2064-2070), internal_unnest (2378-2394), internal_nest (2396-2409), internal_delete (2072-2085), internal_groupby (2087-2118), internal_ngram (1972-1987), internal_exec (1636-1642), internal_eval (1644-1695)
🔇 Additional comments (2)
tests/test_recipes.py (2)
30-31: `config` module is imported but never used – verify necessity

`config = load_module('config', config_path)` is executed but the `config` variable is never referenced afterwards. If the sole purpose is to ensure `recipes.py` can import `config`, that already happens when `recipes` is loaded (because `CODE_DIR` is on `sys.path`). Consider removing the explicit load to avoid dead code, or add a comment explaining why it is required.
200-205: Regex replacement test is good – nice edge-case coverage

The test validates both selection and regex replacement behaviour and should catch most regressions. No issues spotted.
Actionable comments posted: 0
🧹 Nitpick comments (5)
tests/test_recipes.py (5)
5-5: Remove unused `os` import to satisfy Ruff F401 and keep the module clean

The `os` module is imported but never referenced in the file. Eliminating it avoids superfluous dependencies and silences the static-analysis warning.

```diff
-import os
```
16-27: Restore `sys.path` in-place to avoid clobbering concurrent modifications

`load_module()` saves a copy of `sys.path` and later reassigns `sys.path = original_sys_path`. If any other thread (or import hook) mutates `sys.path` while the module is loading, that whole list will be discarded. Restoring the list in-place (`sys.path[:] = …`) preserves the list object and is the customary, safer pattern.

```diff
-    finally:
-        sys.path = original_sys_path
+    finally:
+        # Restore in place to keep the same sys.path object
+        sys.path[:] = original_sys_path
```
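The difference between rebinding and in-place restoration can be seen in isolation; a stdlib sketch (the `/tmp/extra_entry` path and helper name are purely illustrative):

```python
import sys

def run_with_path_entry(entry, fn):
    """Temporarily prepend an entry to sys.path, restoring the list
    in place so the sys.path object identity is preserved."""
    original = list(sys.path)
    sys.path.insert(0, entry)
    try:
        return fn()
    finally:
        # Slice assignment mutates the existing list instead of rebinding
        # the name, so any other reference to sys.path stays valid.
        sys.path[:] = original

before = sys.path  # another module could be holding this same reference
run_with_path_entry('/tmp/extra_entry', lambda: None)
assert sys.path is before                 # same list object survived
assert '/tmp/extra_entry' not in sys.path  # temporary entry was removed
```

Rebinding with `sys.path = original` would instead leave `before` pointing at a stale list.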
56-60: Redundant call to `select_columns` – can be removed for brevity

`internal_keep()` already invokes `select_columns()`. Calling it on lines 58-59 produces no side-effects that are used later and can be dropped to shorten the test.

```diff
-    recipes.Recipe.select_columns(r_keep, filled)
-    kept = recipes.Recipe.internal_keep(r_keep, filled)
+    kept = recipes.Recipe.internal_keep(r_keep, filled)
```
92-124: Avoid constructing mutable objects in default-argument position (Ruff B008)

`pd.DataFrame(...)` and `range(...)` are executed at import time inside the lambda’s default value. This is harmless in tests, yet the pattern is flagged and can surprise maintainers. A minimal tweak keeps the lambda pure and silences the warning:

```diff
-    lambda res, df_orig=pd.DataFrame({'A': range(10), 'B': list('abcdefghij')}): (
+    lambda res, df_orig=_DF_ORIG: (
```

Add a constant right above the parametrize block:

```python
_DF_ORIG = pd.DataFrame({'A': range(10), 'B': list('abcdefghij')})
```
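The underlying hazard is easy to reproduce without pandas; a minimal stdlib illustration of why default arguments are evaluated only once, at definition time:

```python
def append_shared(item, acc=[]):   # anti-pattern: one list, built once
    acc.append(item)
    return acc

def append_fresh(item, acc=None):  # fix: build the default on every call
    if acc is None:
        acc = []
    acc.append(item)
    return acc

print(append_shared(1), append_shared(2))  # → [1, 2] [1, 2] (same list grows)
print(append_fresh(1), append_fresh(2))    # → [1] [2] (independent lists)
```

The module-level `_DF_ORIG` constant suggested above is the DataFrame equivalent of `append_fresh`'s sentinel: the expensive object is built once, deliberately, and shared read-only.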
🪛 Ruff (0.11.9)
119-119: Do not perform function call
pd.DataFramein argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable(B008)
119-119: Do not perform function call
rangein argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable(B008)
84-90: Minor: replace ad-hoc logger stub with `logging.NullHandler` for clarity

Creating a dynamic class with a `write` lambda works, but Python’s `logging` module already provides a no-op sink (`logging.NullHandler()`) which is clearer to readers:

```diff
-    r.log = type('log', (), {'write': lambda self, *a, **k: None})()
+    import logging
+    r.log = logging.getLogger("null")
+    r.log.addHandler(logging.NullHandler())
```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
tests/test_recipes.py (3)
5-5: Remove unused `os` import

`os` is imported but not referenced anywhere in the file. Ruff F401 flags this and the import will fail the linter/CI step configured in the repo.

```diff
-import os
```
84-90: Minor: simplify the dummy logger construction

Creating the dummy logger via `type('log', (), {'write': lambda self, *a, **k: None})()` works, but it is a little cryptic and re-creates the class on every call. A small, clearer helper class improves readability and avoids redefinition costs:

```diff
+class _SilentLog:  # simple helper
+    def write(self, *_, **__):
+        # swallow all log calls
+        return None
 ...
-    r.log = type('log', (), {'write': lambda self, *a, **k: None})()
+    r.log = _SilentLog()
```
155-184: Avoid mutable/heavy objects in default-argument positions

Ruff B008 is correctly warning that `pd.DataFrame(…)` and `range(…)` are evaluated once at function definition time inside the default arguments of the lambdas you pass to `validator`. While harmless today, it can unexpectedly couple the test results to the first evaluation if the validator mutates `df_orig`. A tiny refactor removes the smell without changing behaviour:

```diff
-    lambda res, df_orig=pd.DataFrame({'A': range(10), 'B': list('abcdefghij')}): (
+    lambda res, df_orig=None: (
+        (
+            lambda _df: set(res['A']) == set(_df['A'])
+            and set(res['B']) == set(_df['B'])
+        )(pd.DataFrame({'A': range(10), 'B': list('abcdefghij')}) if df_orig is None else df_orig)
     ),
```

(or pre-define `DF_SHUFFLE_ORIG = pd.DataFrame(...)` above the `CASES` list and capture it in the lambda). This change silences the linter and follows best practices for default arguments.
🔇 Additional comments (2)
tests/test_recipes.py (2)
299-308: Verify the expected n-gram output

The assertion `result['txt'].iloc[0][0] == 'he'` assumes that `recipes.internal_ngram` returns character-level bigrams. Looking at `internal_ngram`:

```python
df[self.cols] = df[self.cols].applymap(
    lambda x: ngrams(tokenize(normalize(x)), n))
```

`tokenize(normalize(x))` suggests token (word) level splitting, so the first bigram for `"hello world"` would likely be `"hello world"`, not `"he"`. Please double-check the helper functions (`tokenize`, `ngrams`) and ensure the test expectation matches the real semantics, otherwise the test will be fragile. If token-level bigrams are indeed produced, update the assertion accordingly:

```diff
-assert result['txt'].iloc[0][0] == 'he'
+assert result['txt'].iloc[0][0] == 'hello world'
```
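To make the token-level semantics concrete, here is a hypothetical sketch of what word-level bigram extraction would produce (the real `tokenize`/`ngrams` helpers in `code/recipes.py` may differ):

```python
def word_ngrams(text, n=2):
    """Hypothetical token-level n-gram helper: split on whitespace and
    join each window of n consecutive tokens with a space."""
    tokens = text.split()
    return [' '.join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

print(word_ngrams('hello world again'))  # → ['hello world', 'world again']
```

Character-level bigrams would instead slide over the raw string, yielding `'he'` first for `"hello world"`; the failing assertion hinges on which of the two the project implements.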
15-27: Dynamic import helper is solid – nice job

The `load_module` utility cleanly inserts the target directory, loads the module, and restores `sys.path` in a `finally` block, avoiding global side-effects in other tests.
Summary

- Add a `tests` target
- Run `make tests` in push and pull request workflows

Testing

- `pytest -q` (fails: command not found)
- `make tests` (fails: docker-compose not found)

Summary by CodeRabbit

- New Features
- Documentation
- Chores: added `pytest` as a project dependency for testing.