Misidentification of relevant coverage report causes tests to be thrown away #251

Open
j-ro opened this issue Dec 18, 2024 · 34 comments · Fixed by #254
Labels
bug fix Something isn't working

Comments

@j-ro

j-ro commented Dec 18, 2024

There is a bug in how a test run is matched to the target file's coverage report: if more than one file in the report has the same name ending (under different paths), the matched coverage entry may not be the file we are targeting. This produces an erroneous report that coverage did not increase, which causes the generated tests to be thrown out.

For example, my coverage.xml file might look like this (this is a Ruby/Rails project):

[snip]
        <class name="has_report_invite" filename="app/models/concerns/has_report_invite.rb" line-rate="0.32" branch-rate="0" complexity="0">
          <methods/>
          <lines>
            <line number="1" branch="false" hits="1"/>
            <line number="2" branch="false" hits="1"/>
            <line number="4" branch="false" hits="1"/>
            <line number="5" branch="false" hits="0"/>
            <line number="8" branch="false" hits="1"/>
            <line number="9" branch="false" hits="0"/>
            <line number="12" branch="false" hits="1"/>
            <line number="13" branch="false" hits="0"/>
            <line number="16" branch="false" hits="1"/>
            <line number="17" branch="false" hits="0"/>
            <line number="20" branch="false" hits="1"/>
            <line number="21" branch="false" hits="0"/>
            <line number="22" branch="false" hits="0"/>
            <line number="24" branch="false" hits="0"/>
            <line number="25" branch="false" hits="0"/>
            <line number="29" branch="false" hits="0"/>
            <line number="32" branch="false" hits="0"/>
            <line number="33" branch="false" hits="0"/>
            <line number="34" branch="false" hits="0"/>
            <line number="35" branch="false" hits="0"/>
            <line number="39" branch="false" hits="0"/>
            <line number="44" branch="false" hits="0"/>
          </lines>
        </class>
        <class name="report_invite" filename="app/models/report_invite.rb" line-rate="0.36" branch-rate="0" complexity="0">
          <methods/>
          <lines>
            <line number="1" branch="false" hits="1"/>
            <line number="2" branch="false" hits="1"/>
            <line number="4" branch="false" hits="1"/>
            <line number="5" branch="false" hits="0"/>
            <line number="8" branch="false" hits="1"/>
            <line number="9" branch="false" hits="0"/>
            <line number="10" branch="false" hits="0"/>
            <line number="11" branch="false" hits="0"/>
            <line number="13" branch="false" hits="0"/>
            <line number="14" branch="false" hits="0"/>
            <line number="15" branch="false" hits="0"/>
            <line number="17" branch="false" hits="0"/>
            <line number="20" branch="false" hits="0"/>
            <line number="23" branch="false" hits="1"/>
            <line number="24" branch="false" hits="0"/>
            <line number="25" branch="false" hits="0"/>
            <line number="26" branch="false" hits="0"/>
            <line number="27" branch="false" hits="0"/>
            <line number="28" branch="false" hits="0"/>
            <line number="29" branch="false" hits="0"/>
            <line number="34" branch="false" hits="1"/>
            <line number="35" branch="false" hits="0"/>
            <line number="38" branch="false" hits="1"/>
            <line number="39" branch="false" hits="1"/>
            <line number="42" branch="false" hits="1"/>
            <line number="43" branch="false" hits="0"/>
            <line number="46" branch="false" hits="1"/>
            <line number="47" branch="false" hits="0"/>
          </lines>
        </class>
[snip]

As you can see, there are two files that end in "report_invite" in this report. As they are related to each other, both are exercised when the one target test file (in this case, "spec/models/report_invite_spec.rb") is run. So they both show up in this report.

Here is the command I'm using:

cover-agent \
  --source-file-path "app/models/report_invite.rb" \
  --test-file-path "spec/models/report_invite_spec.rb" \
  --code-coverage-report-path "coverage/coverage.xml" \
  --test-command "docker-compose exec -e RAILS_ENV=test web rspec spec/models/report_invite_spec.rb" \
  --additional-instructions "" \
  --included-files spec/factories/report_invites.rb app/mailers/report_invite_mailer.rb lib/tasks/report_invites.rake app/models/concerns/has_report_invite.rb \
  --desired-coverage 100 \
  --model "gpt-4o"

As you can see from the above, the target file is "app/models/report_invite.rb". And if you manually look at the coverage report, you can see that it is 36% covered right now.

However, cover-agent reports that it is only 32% covered:

Streaming results from LLM model...
yaml
language: ruby
testing_framework: rspec
number_of_tests: 1
test_headers_indentation: 4


Streaming results from LLM model...
yaml
language: ruby
testing_framework: rspec
number_of_tests: 1
relevant_line_number_to_insert_tests_after: 8
relevant_line_number_to_insert_imports_after: 1


2024-12-18 11:06:22,710 - cover_agent.UnitTestValidator - INFO - Running build/test command to generate coverage report: "docker-compose exec -e RAILS_ENV=test web rspec spec/models/report_invite_spec.rb"
2024-12-18 11:06:35,309 - cover_agent.UnitTestValidator - INFO - Initial coverage: 31.82%
2024-12-18 11:06:35,315 - cover_agent.CoverAgent - INFO - Current Coverage: 31.82%
2024-12-18 11:06:35,315 - cover_agent.CoverAgent - INFO - Desired Coverage: 100%

Digging in, you can see it is matching the "app/models/concerns/has_report_invite.rb" file, which has that percentage covered, rather than the actual target.
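
Both figures can be checked by hand from the `hits` attributes in the excerpt above. A minimal Python sketch, with the hit counts transcribed from the two `<class>` entries (the helper name `line_coverage` is made up for illustration):

```python
def line_coverage(hits):
    """Percentage of lines with at least one hit, rounded to two decimals."""
    covered = sum(1 for h in hits if h > 0)
    return round(100 * covered / len(hits), 2)

# hits per <line>, in report order, transcribed from the excerpt above
has_report_invite = [1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1] + [0] * 11
report_invite = [1, 1, 1, 0, 1] + [0] * 8 + [1] + [0] * 6 + [1, 0, 1, 1, 1, 0, 1, 0]

print(line_coverage(has_report_invite))  # 31.82 (7 of 22 lines)
print(line_coverage(report_invite))      # 35.71 (10 of 28 lines)
```

The first value is exactly the 31.82% that cover-agent logs, which supports the reading that it picked up the concern file's entry.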

Because cover-agent is reading the wrong file for its coverage report, it throws out tests it generates as not increasing coverage, causing a lot of wasted time and expensive tokens.

I'm guessing the issue is with this line: https://github.com/qodo-ai/qodo-cover/blob/main/cover_agent/CoverageProcessor.py#L131

The fix is probably to match on the full path rather than just the partial file name. The current matching assumes all file names in a coverage report are unique (or don't share endings), when that is only true within a given path.
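
To illustrate the difference, here is a small Python sketch. It assumes the current lookup is a suffix check against the source file's basename, which is a guess about the linked line and not confirmed here. The function names and the trimmed-down report are hypothetical.

```python
import os
import xml.etree.ElementTree as ET

# trimmed-down report with the same two filenames as the excerpt above
REPORT = """
<coverage>
  <class name="has_report_invite" filename="app/models/concerns/has_report_invite.rb" line-rate="0.32"/>
  <class name="report_invite" filename="app/models/report_invite.rb" line-rate="0.36"/>
</coverage>
"""

def match_by_suffix(root, source_path):
    """Return the first <class> whose filename ends with the source file's basename."""
    basename = os.path.basename(source_path)
    for cls in root.iter("class"):
        if cls.get("filename", "").endswith(basename):
            return cls
    return None

def match_by_path(root, source_path):
    """Return the <class> whose normalized filename equals the normalized source path."""
    wanted = os.path.normpath(source_path)
    for cls in root.iter("class"):
        if os.path.normpath(cls.get("filename", "")) == wanted:
            return cls
    return None

root = ET.fromstring(REPORT)
target = "app/models/report_invite.rb"
print(match_by_suffix(root, target).get("filename"))  # app/models/concerns/has_report_invite.rb
print(match_by_path(root, target).get("filename"))    # app/models/report_invite.rb
```

The suffix check returns the concern file because "has_report_invite.rb" ends with "report_invite.rb" and appears first in the report.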

If I get some time I'll try and make a PR here, but wanted to post this bug first to see if folks had feedback on how to approach it.

@coderustic
Contributor

@j-ro I am working on a fix. Right now, as you analyzed, cover_agent looks at the end of the file name, which can be ambiguous, as in this case. I am working on using the name field instead of the filename field. I could also use the full path as you suggested, but sometimes multiple classes can live in a single source file. If I have a branch, would you be able to test it and provide feedback on whether it's working for your case?
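
One way to reconcile the two concerns, sketched in Python under the assumption that a Cobertura report may contain several `<class>` elements sharing one `filename`: key on the full path and merge the line hits of every class in that file. The function name and the sample report are illustrative and not taken from the PR.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# hypothetical report: two classes defined in the same source file
REPORT = """
<coverage>
  <class name="Foo" filename="src/shapes.py">
    <lines><line number="1" hits="1"/><line number="2" hits="0"/></lines>
  </class>
  <class name="Bar" filename="src/shapes.py">
    <lines><line number="10" hits="1"/><line number="11" hits="1"/></lines>
  </class>
</coverage>
"""

def coverage_by_file(root):
    """Map each filename to its line coverage ratio, merging all classes in that file."""
    hits_by_file = defaultdict(dict)
    for cls in root.iter("class"):
        seen = hits_by_file[cls.get("filename")]
        for line in cls.iter("line"):
            number = int(line.get("number"))
            hits = int(line.get("hits"))
            # a line reported by more than one class counts as covered if any entry hit it
            seen[number] = max(seen.get(number, 0), hits)
    return {
        name: sum(1 for h in lines.values() if h > 0) / len(lines)
        for name, lines in hits_by_file.items()
    }

print(coverage_by_file(ET.fromstring(REPORT)))  # {'src/shapes.py': 0.75}
```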

@j-ro
Author

j-ro commented Jan 3, 2025 via email

coderustic added a commit to coderustic/cover-agent that referenced this issue Jan 5, 2025
* While the earlier PR [qodo-ai#230] managed to break down the processing
  code into a class hierarchy, there weren't any changes
  made to the code. This PR brings in enhancements to
  coverage processing where coverage data is stored by
  entity (Class or File).

* Coverage data is stored using a FQDN so that conflicts
  are taken care of. This closes [qodo-ai#251]

* The earlier PR broke the behaviour of the agent that only
  target file coverage is considered if the global coverage
  flag is not set by the user; this PR fixes it to bring
  back the original behaviour.
@coderustic
Contributor

@j-ro I made an attempt at fixing this issue in PR #254. I'd appreciate your help in validating these changes.

@j-ro
Author

j-ro commented Jan 5, 2025

@coderustic awesome, thanks! I'll try to give it a try this week!

@j-ro
Author

j-ro commented Jan 6, 2025

@coderustic any tips for getting it to run from source?

I'm working on poetry install. It failed on my machine (M2 OSX Sequoia) on the litellm package. I could get it to complete by subbing out the codium pin for the standard one (https://github.com/BerriAI/litellm).

But trying to run it errors:

poetry run cover-agent -help
The currently activated Python version 3.13.1 is not supported by the project (>=3.9,<3.13).
Trying to find and use a compatible version. 
Using python3.9 (3.9.21)

'format'

I'm not a Python dev, so I may be missing something obvious. However, as far as I can tell, the readme doesn't show any examples of how to actually run the CLI if you install it from source.

@coderustic
Contributor

@j-ro Looks like main is broken and we use a fork of the litellm. Will let you know as soon as the build is fixed.

coderustic added a commit to coderustic/cover-agent that referenced this issue Jan 7, 2025
* While the earlier PR [qodo-ai#230] managed to break down the processing
  code into a class hierarchy, there weren't any changes
  made to the code. This PR brings in enhancements to
  coverage processing where coverage data is stored by
  entity (Class or File).

* Coverage data is stored using a FQDN so that conflicts
  are taken care of. This closes [qodo-ai#251]

* The earlier PR broke the behaviour of the agent that only
  target file coverage is considered if the global coverage
  flag is not set by the user; this PR fixes it to bring
  back the original behaviour.
@EmbeddedDevops1 EmbeddedDevops1 linked a pull request Jan 7, 2025 that will close this issue
EmbeddedDevops1 pushed a commit that referenced this issue Jan 7, 2025
* Enhanced coverage processing (#2)

* While the earlier PR [#230] managed to break down the processing
  code into a class hierarchy, there weren't any changes
  made to the code. This PR brings in enhancements to
  coverage processing where coverage data is stored by
  entity (Class or File).

* Coverage data is stored using a FQDN so that conflicts
  are taken care of. This closes [#251]

* The earlier PR broke the behaviour of the agent that only
  target file coverage is considered if the global coverage
  flag is not set by the user; this PR fixes it to bring
  back the original behaviour.

* removed sample-reports

* bump version
@EmbeddedDevops1
Collaborator

@j-ro Looks like main is broken and we use a fork of the litellm. Will let you know as soon as the build is fixed.

The CI build was fixed, but we had some failures in the regression tests when I merged this, so I had to revert it. Apologies.

@j-ro
Author

j-ro commented Jan 9, 2025

no worries, let me know when I should give it a try (and what branch to use!)

@j-ro
Author

j-ro commented Jan 11, 2025

@coderustic wonder if you have an update here? Right now because I'm not able to run with poetry run, I'm unable to use the repo at all, since I removed the pipx install I had. Even if this PR isn't ready, do you have a path to getting the direct repo running? Is main still broken?

@coderustic
Contributor

@j-ro main has been fixed for the earlier issue with litellm. You can give it a try either by installing the new version, 0.2.15, or by using the latest main.

@j-ro
Author

j-ro commented Jan 11, 2025

Thanks! Forgive me for my python ignorance, but how do I run with main latest? My git repo is up to date, but poetry run still errors:

poetry -vvv run cover-agent -help
Trying to detect current active python executable as specified in the config.
Unable to detect the current active python executable. Falling back to default.
Trying to detect current active python executable as specified in the config.
Unable to detect the current active python executable. Falling back to default.
The currently activated Python version 3.13.1 is not supported by the project (>=3.9,<3.13).
Trying to find and use a compatible version. 
Trying python3
Trying python3.9
Using python3.9 (3.9.21)
Virtualenv cover-agent-QB1wIuOP-py3.9 already exists.
Using virtualenv: /Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.9

  Stack trace:

  10  /usr/local/Cellar/poetry/2.0.0/libexec/lib/python3.13/site-packages/cleo/application.py:327 in run
       325│ 
       326│             try:
     → 327│                 exit_code = self._run(io)
       328│             except BrokenPipeError:
       329│                 # If we are piped to another process, it may close early and send a

   9  /usr/local/Cellar/poetry/2.0.0/libexec/lib/python3.13/site-packages/poetry/console/application.py:236 in _run
       234│ 
       235│         with directory(self._working_directory):
     → 236│             exit_code: int = super()._run(io)
       237│ 
       238│         return exit_code

   8  /usr/local/Cellar/poetry/2.0.0/libexec/lib/python3.13/site-packages/cleo/application.py:431 in _run
       429│             io.input.interactive(interactive)
       430│ 
     → 431│         exit_code = self._run_command(command, io)
       432│         self._running_command = None
       433│ 

   7  /usr/local/Cellar/poetry/2.0.0/libexec/lib/python3.13/site-packages/cleo/application.py:473 in _run_command
       471│ 
       472│         if error is not None:
     → 473│             raise error
       474│ 
       475│         return terminate_event.exit_code

   6  /usr/local/Cellar/poetry/2.0.0/libexec/lib/python3.13/site-packages/cleo/application.py:457 in _run_command
       455│ 
       456│             if command_event.command_should_run():
     → 457│                 exit_code = command.run(io)
       458│             else:
       459│                 exit_code = ConsoleCommandEvent.RETURN_CODE_DISABLED

   5  /usr/local/Cellar/poetry/2.0.0/libexec/lib/python3.13/site-packages/cleo/commands/base_command.py:117 in run
       115│         io.input.validate()
       116│ 
     → 117│         return self.execute(io) or 0
       118│ 
       119│     def merge_application_definition(self, merge_args: bool = True) -> None:

   4  /usr/local/Cellar/poetry/2.0.0/libexec/lib/python3.13/site-packages/cleo/commands/command.py:61 in execute
        59│ 
        60│         try:
     →  61│             return self.handle()
        62│         except KeyboardInterrupt:
        63│             return 1

   3  /usr/local/Cellar/poetry/2.0.0/libexec/lib/python3.13/site-packages/poetry/console/commands/run.py:31 in handle
        29│ 
        30│         if scripts and script in scripts:
     →  31│             return self.run_script(scripts[script], args)
        32│ 
        33│         try:

   2  /usr/local/Cellar/poetry/2.0.0/libexec/lib/python3.13/site-packages/poetry/console/commands/run.py:77 in run_script
        75│         module, callable_ = script.split(":")
        76│ 
     →  77│         src_in_sys_path = "sys.path.append('src'); " if self._module.is_in_src() else ""
        78│ 
        79│         cmd = ["python", "-c"]

   1  /usr/local/Cellar/poetry/2.0.0/libexec/lib/python3.13/site-packages/poetry/console/commands/run.py:46 in _module
        44│         package = poetry.package
        45│         path = poetry.file.path.parent
     →  46│         module = Module(package.name, path.as_posix(), package.packages)
        47│ 
        48│         return module

  KeyError

  'format'

  at /usr/local/Cellar/poetry/2.0.0/libexec/lib/python3.13/site-packages/poetry/core/masonry/utils/module.py:79 in __init__
       75│             self._package_includes.append(
       76│                 PackageInclude(
       77│                     self._path,
       78│                     package["include"],
    →  79│                     formats=package["format"],
       80│                     source=package.get("from"),
       81│                     target=package.get("to"),
       82│                 )
       83│      

@coderustic
Contributor

I am sorry, but it looks like everyone on GitHub is merging fixes that break things everywhere; see python-poetry/poetry#9961.
What version of Poetry do you have? I am using Poetry (version 1.8.4).

$ poetry -vvv run cover-agent -help
Using virtualenv: /home/sai/.cache/pypoetry/virtualenvs/cover-agent-vDosZXUS-py3.12
/home/sai/.cache/pypoetry/virtualenvs/cover-agent-vDosZXUS-py3.12/lib/python3.12/site-packages/pydantic/_internal/_config.py:341: UserWarning: Valid config keys have changed in V2:
* 'fields' has been removed
  warnings.warn(message, UserWarning)
usage: cover-agent [-h] --source-file-path SOURCE_FILE_PATH --test-file-path TEST_FILE_PATH [--project-root PROJECT_ROOT] [--test-file-output-path TEST_FILE_OUTPUT_PATH] --code-coverage-report-path CODE_COVERAGE_REPORT_PATH --test-command
                   TEST_COMMAND [--test-command-dir TEST_COMMAND_DIR] [--included-files [INCLUDED_FILES ...]] [--coverage-type COVERAGE_TYPE] [--report-filepath REPORT_FILEPATH] [--desired-coverage DESIRED_COVERAGE] [--max-iterations MAX_ITERATIONS]
                   [--additional-instructions ADDITIONAL_INSTRUCTIONS] [--model MODEL] [--api-base API_BASE] [--strict-coverage] [--run-tests-multiple-times RUN_TESTS_MULTIPLE_TIMES] [--log-db-path LOG_DB_PATH] [--branch BRANCH]
                   [--use-report-coverage-feature-flag | --diff-coverage] [--run-each-test-separately RUN_EACH_TEST_SEPARATELY]

Cover Agent v0.2.15

options:
  -h, --help            show this help message and exit
  --source-file-path SOURCE_FILE_PATH
                        Path to the source file.
  --test-file-path TEST_FILE_PATH
                        Path to the input test file.
  --project-root PROJECT_ROOT
                        Path to the root of the project.
  --test-file-output-path TEST_FILE_OUTPUT_PATH
                        Path to the output test file.
  --code-coverage-report-path CODE_COVERAGE_REPORT_PATH
                        Path to the code coverage report file.
  --test-command TEST_COMMAND
                        The command to run tests and generate coverage report.
  --test-command-dir TEST_COMMAND_DIR
                        The directory to run the test command in. Default: /home/sai/workspace/python/cover-agent.
  --included-files [INCLUDED_FILES ...]
                        List of files to include in the coverage. For example, "--included-files library1.c library2.c." Default: None.
  --coverage-type COVERAGE_TYPE
                        Type of coverage report. Default: cobertura.
  --report-filepath REPORT_FILEPATH
                        Path to the output report file. Default: test_results.html.
  --desired-coverage DESIRED_COVERAGE
                        The desired coverage percentage. Default: 90.
  --max-iterations MAX_ITERATIONS
                        The maximum number of iterations. Default: 10.
  --additional-instructions ADDITIONAL_INSTRUCTIONS
                        Any additional instructions you wish to append at the end of the prompt. Default: .
  --model MODEL         Which LLM model to use. Default: gpt-4o.
  --api-base API_BASE   The API url to use for Ollama or Hugging Face. Default: http://localhost:11434.
  --strict-coverage     If set, Cover-Agent will return a non-zero exit code if the desired code coverage is not achieved. Default: False.
  --run-tests-multiple-times RUN_TESTS_MULTIPLE_TIMES
                        Number of times to run the tests generated by Cover Agent. Default: 1.
  --log-db-path LOG_DB_PATH
                        Path to optional log database. Default: .
  --branch BRANCH       The branch to compare against when using --diff-coverage. Default: main.
  --use-report-coverage-feature-flag
                        Setting this to True considers the coverage of all the files in the coverage report. This means we consider a test as good if it increases coverage for a different file other than the source file. Default: False. Not
                        compatible with --diff-coverage.
  --diff-coverage       If set, Cover-Agent will only generate tests based on the diff between branches. Default: False. Not compatible with --use-report-coverage-feature-flag.
  --run-each-test-separately RUN_EACH_TEST_SEPARATELY
                        Run each test separately. Default: False

@coderustic
Contributor

Btw, why don't you go back and try using the pipx-installed version?

@j-ro
Author

j-ro commented Jan 11, 2025

I'm using poetry version 2.0.0

I can go back to pipx, I just wouldn't be able to try this branch right? Or can I install branches with pipx?

@coderustic
Contributor

Thanks for being supportive and offering to help. I don't think this branch is ready yet, but I will try a different approach for you to test next time, when my branch is ready.

@j-ro
Author

j-ro commented Jan 11, 2025

ok! I downgraded poetry and I have a different error now, but maybe I should just go back to pipx instead:

poetry run -vvv cover-agent --help
The currently activated Python version 3.13.1 is not supported by the project (>=3.9,<3.13).
Trying to find and use a compatible version. 
Trying python3
Trying python3.9
Using python3.9 (3.9.21)
Virtualenv cover-agent-QB1wIuOP-py3.9 already exists.
Using virtualenv: /Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.9
/Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.9/lib/python3.9/site-packages/pydantic/_internal/_config.py:341: UserWarning: Valid config keys have changed in V2:
* 'fields' has been removed
  warnings.warn(message, UserWarning)
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/Cellar/python@3.9/3.9.21/Frameworks/Python.framework/Versions/3.9/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/Users/jasonrosenbaum/CAN/site/qodo-cover/cover_agent/main.py", line 3, in <module>
    from cover_agent.CoverAgent import CoverAgent
  File "/Users/jasonrosenbaum/CAN/site/qodo-cover/cover_agent/CoverAgent.py", line 13, in <module>
    from cover_agent.UnitTestValidator import UnitTestValidator
  File "/Users/jasonrosenbaum/CAN/site/qodo-cover/cover_agent/UnitTestValidator.py", line 16, in <module>
    from cover_agent.coverage.processor import process_coverage, CoverageReport, CoverageData
  File "/Users/jasonrosenbaum/CAN/site/qodo-cover/cover_agent/coverage/processor.py", line 215, in <module>
    class JacocoProcessor(CoverageProcessor):
  File "/Users/jasonrosenbaum/CAN/site/qodo-cover/cover_agent/coverage/processor.py", line 239, in JacocoProcessor
    def _get_file_extension(self, filename: str) -> str | None:
TypeError: unsupported operand type(s) for |: 'type' and 'NoneType'

@coderustic
Contributor

This is due to the pydantic version. We currently have a deprecation warning, which is probably removed in Pydantic version 2. I suggest you try using Python 3.12 (if not, use the pipx-installed version so you don't have to deal with setting up a dev environment).
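
For reference, the `TypeError` in the traceback above is what Python 3.9 raises when it evaluates an annotation such as `str | None`: the `X | Y` union syntax for types was added in Python 3.10 (PEP 604), which is consistent with the advice to use Python 3.12. A spelling that also runs on 3.9 uses `typing.Optional`. The function body below is a hypothetical stand-in and not the project's implementation:

```python
import os
from typing import Optional

def get_file_extension(filename: str) -> Optional[str]:
    """Return the extension without the leading dot, or None if there is none."""
    _, ext = os.path.splitext(filename)
    return ext[1:] or None

print(get_file_extension("Foo.java"))  # java
print(get_file_extension("Makefile"))  # None
```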

@j-ro
Author

j-ro commented Jan 11, 2025

ok great, poetry seems to be working at least somewhat. Still more work to do on my end but I may be able to get back running that way. If you have a branch to test at some point or something else do let me know!

@j-ro
Author

j-ro commented Jan 12, 2025

I'm also realizing as I test the main branch with poetry that the new (and fairly undocumented) use-report-coverage-feature-flag option probably obviates this issue if you use it, since it'll consider the entire coverage report and not just a part of it when deciding whether to keep tests or not.

@coderustic
Contributor

Yes, you are spot on (though I would name the flag consider_global_coverage). I was in the middle of refactoring, and as of now on main the default behavior is to consider global coverage (which I am embarrassed to say out loud). Part two of my refactoring brings the default back to false and also adds an FQDN to the class names. See if it works on main even without setting the flag.
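
The difference between the two modes can be sketched in Python as follows. This illustrates the behavior described in this thread and is not the project's code; `should_keep_test` and the dictionaries keyed by file path are assumptions.

```python
def should_keep_test(before, after, source_file, consider_global_coverage=False):
    """Decide whether a generated test is worth keeping.

    before/after map file paths to coverage ratios measured
    before and after adding the test.
    """
    if consider_global_coverage:
        # keep the test if any file in the report gained coverage
        return any(after[path] > before.get(path, 0.0) for path in after)
    # otherwise only the target file's coverage counts
    return after.get(source_file, 0.0) > before.get(source_file, 0.0)

target = "app/models/report_invite.rb"
concern = "app/models/concerns/has_report_invite.rb"
before = {target: 0.36, concern: 0.32}
after = {target: 0.36, concern: 0.41}  # only the concern file improved

print(should_keep_test(before, after, target))        # False
print(should_keep_test(before, after, target, True))  # True
```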

@j-ro
Author

j-ro commented Jan 12, 2025

I will try it when I can, interesting!

@coderustic
Contributor

You might see that on main, file 1's data is overwritten by file 2's data (which is being fixed by #264), so it would be good to wait a few days. When I see that all the in-progress work is merged, I will let you know.

@j-ro
Author

j-ro commented Jan 12, 2025

sounds good!

@j-ro
Author

j-ro commented Jan 12, 2025

Though I did find kind of a weird behavior with the new option -- when we hit 100% coverage for the target file the loop keeps going until it runs out of iterations. That seems a bit unexpected? I can open another issue for that if you want.
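
The expected behavior can be sketched in Python as a loop that checks the target file's coverage after each round. `run_iteration` is a hypothetical callback standing in for one generate-and-validate round; none of these names come from the project.

```python
def run_until_covered(run_iteration, desired_coverage=100.0, max_iterations=10):
    """Call run_iteration() until the target file's coverage reaches the goal.

    run_iteration returns the target file's coverage percentage after the round.
    Returns (iterations_used, final_coverage).
    """
    coverage = 0.0
    for used in range(1, max_iterations + 1):
        coverage = run_iteration()
        if coverage >= desired_coverage:
            return used, coverage  # goal reached: stop spending tokens
    return max_iterations, coverage

# simulated rounds: coverage reaches 100% on the third iteration
results = iter([60.0, 85.0, 100.0, 100.0, 100.0])
print(run_until_covered(lambda: next(results)))  # (3, 100.0)
```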

@EmbeddedDevops1
Collaborator

@j-ro thanks for pointing that out. @coderustic is going to address that in the next PR. Really really appreciate all the thorough feedback.

@EmbeddedDevops1
Collaborator

FYI, I've been working on trying to get Python > 3.12 working but there are some libraries that haven't been playing nicely in the pyproject.toml. I'll create an issue for it for posterity.

@EmbeddedDevops1 EmbeddedDevops1 added the bug fix Something isn't working label Jan 12, 2025
@j-ro
Author

j-ro commented Feb 5, 2025

Any update on this one? Without stopping on 100% coverage, the tool is way less automatic (I have to monitor it and stop it manually unless I want to waste a lot of money on LLMs until we hit the iteration timeout)

@EmbeddedDevops1
Collaborator

Have you tested it recently? We did some releases since and had some refactoring "challenges" as of late...

@j-ro
Author

j-ro commented Feb 11, 2025

@EmbeddedDevops1 yeah, I'm up to date with main, and with the use-report-coverage-feature-flag option, it is not stopping when the specified source file reaches 100% coverage.

@EmbeddedDevops1
Collaborator

Okay, let me see if I can reproduce the issue and get on it. Sorry for the delay.

@j-ro
Author

j-ro commented Feb 11, 2025

no worries, thanks!

@j-ro
Author

j-ro commented Feb 12, 2025

reading the code a bit, my guess is the issue is somewhere here:

for key in new_coverage_percentages:

It seems like this might be a similar issue to my earlier report. My log always talks about a "non-source file" having coverage increasing, but that non-source file is actually my source file. So perhaps the lookup by path here has a similar bug to what I reported above.
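
If that guess is right, one possible cause is that the report key and the configured source path are spelled differently (for example, a leading `./` or a different base directory), so an equality check never succeeds. This is speculation. A Python sketch of a comparison that normalizes both sides first, with a hypothetical helper name:

```python
import os

def is_source_file(report_key, source_file_path, project_root="."):
    """True if a coverage-report key refers to the configured source file."""
    def canonical(path):
        # resolve relative paths against the project root and collapse ./ and ../
        return os.path.normpath(os.path.join(project_root, path))
    return canonical(report_key) == canonical(source_file_path)

source = "app/models/report_invite.rb"
print(is_source_file("./app/models/report_invite.rb", source))             # True
print(is_source_file("app/models/concerns/has_report_invite.rb", source))  # False
```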

@kelper-hub
Contributor

Hey @j-ro, nice to meet you. I was taking a look at this issue the other day, and I wanted to reach out.

I think I have a working solution for the infinite continuation of tests when running in the --use-report-coverage-feature-flag mode, but I was unable to replicate the original issue in this thread.

I modified the Python template test locally, with similar if not the same directory names as you posted above, but was unable to reproduce the behavior you're seeing when running in your Ruby project. My testing has indicated that the file path should be the key that is associated with the new coverage percentages, so there shouldn't be any crossover between values.

What I did notice is that in my testing, the percentages are printed in the XML file in alphabetical order based on the key (file path), and in the example you posted here, that isn't the case. I was wondering if maybe there is some coverage version mismatch that we aren't testing for. Can you provide a little more detail about what you're seeing? Coverage version, maybe some more insight into how you're structuring your app? I am not a Ruby developer, so apologies in advance for not being too knowledgeable.

I've attached my test logs and coverage XML for you to review as well, so you can see how things line up.

Command run:

poetry run cover-agent \
   --source-file-path "templated_tests/python_fastapi/app/models/report_invite.py" \
   --test-file-path "templated_tests/python_fastapi/test_app2.py" \
   --project-root "templated_tests/python_fastapi" \
   --code-coverage-report-path "templated_tests/python_fastapi/coverage.xml" \
   --test-command "pytest --cov=. --cov-report=xml --cov-report=term" \
   --test-command-dir "templated_tests/python_fastapi" \
   --included-files "templated_tests/python_fastapi/app/models/concerns/has_report_invite.py" \
   --coverage-type "cobertura" \
   --desired-coverage 100

Sample logs:

2025-02-18 23:25:49,194 - cover_agent.UnitTestValidator - INFO - Running test with the following command: "pytest --cov=. --cov-report=xml --cov-report=term"
2025-02-18 23:25:52,525 - cover_agent.UnitTestValidator - INFO - Test passed and coverage increased. Current coverage: 93.48%
2025-02-18 23:25:52,532 - cover_agent.UnitTestValidator - INFO - Running test with the following command: "pytest --cov=. --cov-report=xml --cov-report=term"
2025-02-18 23:25:56,014 - cover_agent.UnitTestValidator - INFO - Test passed and coverage increased. Current coverage: 95.65%
2025-02-18 23:25:56,020 - cover_agent.UnitTestValidator - INFO - Running build/test command to generate coverage report: "pytest --cov=. --cov-report=xml --cov-report=term"
2025-02-18 23:25:59,429 - cover_agent.UnitTestValidator - INFO - Initial coverage: 95.65%
2025-02-18 23:25:59,429 - cover_agent.CoverAgent - INFO - Current Coverage: 95.65%
2025-02-18 23:25:59,429 - cover_agent.CoverAgent - INFO - Desired Coverage: 100%
Streaming results from LLM model…

Coverage XML:

<?xml version="1.0" ?>
<coverage version="7.6.10" timestamp="1739949972053" lines-valid="128" lines-covered="127" line-rate="0.9922" branches-covered="0" branches-valid="0" branch-rate="0" complexity="0">
	<!-- Generated by coverage.py: https://coverage.readthedocs.io/en/7.6.10 -->
	<!-- Based on https://raw.githubusercontent.com/cobertura/web/master/htdocs/xml/coverage-04.dtd -->
	<sources>
		<source>/Users/braxton/Documents/Code/consulting/cover-agent/templated_tests/python_fastapi</source>
	</sources>
	<packages>
		<package name="." line-rate="1" branch-rate="0" complexity="0">
			<classes>
				<class name="test_app1.py" filename="test_app1.py" complexity="0" line-rate="1" branch-rate="0">
					<methods/>
					<lines>
						<line number="1" hits="1"/>
						<line number="2" hits="1"/>
						<line number="3" hits="1"/>
						<line number="4" hits="1"/>
						<line number="6" hits="1"/>
						<line number="9" hits="1"/>
						<line number="13" hits="1"/>
						<line number="14" hits="1"/>
						<line number="15" hits="1"/>
					</lines>
				</class>
				<class name="test_app2.py" filename="test_app2.py" complexity="0" line-rate="1" branch-rate="0">
					<methods/>
					<lines>
						<line number="1" hits="1"/>
						<line number="2" hits="1"/>
						<line number="3" hits="1"/>
						<line number="4" hits="1"/>
						<line number="6" hits="1"/>
						<line number="9" hits="1"/>
						<line number="13" hits="1"/>
						<line number="14" hits="1"/>
						<line number="15" hits="1"/>
						<line number="18" hits="1"/>
						<line number="22" hits="1"/>
						<line number="23" hits="1"/>
						<line number="24" hits="1"/>
						<line number="27" hits="1"/>
						<line number="28" hits="1"/>
						<line number="29" hits="1"/>
						<line number="30" hits="1"/>
						<line number="33" hits="1"/>
						<line number="34" hits="1"/>
						<line number="35" hits="1"/>
						<line number="36" hits="1"/>
						<line number="39" hits="1"/>
						<line number="40" hits="1"/>
						<line number="41" hits="1"/>
						<line number="42" hits="1"/>
						<line number="43" hits="1"/>
						<line number="46" hits="1"/>
						<line number="47" hits="1"/>
						<line number="48" hits="1"/>
						<line number="49" hits="1"/>
						<line number="52" hits="1"/>
						<line number="53" hits="1"/>
						<line number="54" hits="1"/>
						<line number="57" hits="1"/>
						<line number="58" hits="1"/>
						<line number="59" hits="1"/>
						<line number="60" hits="1"/>
						<line number="63" hits="1"/>
						<line number="64" hits="1"/>
						<line number="65" hits="1"/>
						<line number="66" hits="1"/>
						<line number="69" hits="1"/>
						<line number="70" hits="1"/>
						<line number="71" hits="1"/>
						<line number="72" hits="1"/>
						<line number="75" hits="1"/>
						<line number="76" hits="1"/>
						<line number="77" hits="1"/>
						<line number="78" hits="1"/>
						<line number="81" hits="1"/>
						<line number="82" hits="1"/>
						<line number="83" hits="1"/>
						<line number="84" hits="1"/>
						<line number="87" hits="1"/>
						<line number="88" hits="1"/>
						<line number="89" hits="1"/>
						<line number="90" hits="1"/>
						<line number="93" hits="1"/>
						<line number="94" hits="1"/>
						<line number="95" hits="1"/>
						<line number="96" hits="1"/>
						<line number="99" hits="1"/>
						<line number="100" hits="1"/>
						<line number="101" hits="1"/>
						<line number="102" hits="1"/>
					</lines>
				</class>
			</classes>
		</package>
		<package name="app.models" line-rate="1" branch-rate="0" complexity="0">
			<classes>
				<class name="report_invite.py" filename="app/models/report_invite.py" complexity="0" line-rate="1" branch-rate="0">
					<methods/>
					<lines>
						<line number="1" hits="1"/>
						<line number="2" hits="1"/>
						<line number="3" hits="1"/>
						<line number="5" hits="1"/>
						<line number="8" hits="1"/>
						<line number="9" hits="1"/>
						<line number="15" hits="1"/>
						<line number="18" hits="1"/>
						<line number="19" hits="1"/>
						<line number="23" hits="1"/>
						<line number="26" hits="1"/>
						<line number="27" hits="1"/>
						<line number="31" hits="1"/>
						<line number="34" hits="1"/>
						<line number="35" hits="1"/>
						<line number="46" hits="1"/>
						<line number="49" hits="1"/>
						<line number="50" hits="1"/>
						<line number="61" hits="1"/>
						<line number="64" hits="1"/>
						<line number="65" hits="1"/>
						<line number="76" hits="1"/>
						<line number="77" hits="1"/>
						<line number="78" hits="1"/>
						<line number="81" hits="1"/>
						<line number="82" hits="1"/>
						<line number="86" hits="1"/>
						<line number="89" hits="1"/>
						<line number="90" hits="1"/>
						<line number="94" hits="1"/>
						<line number="95" hits="1"/>
						<line number="98" hits="1"/>
						<line number="101" hits="1"/>
						<line number="102" hits="1"/>
						<line number="106" hits="1"/>
						<line number="109" hits="1"/>
						<line number="110" hits="1"/>
						<line number="114" hits="1"/>
						<line number="115" hits="1"/>
						<line number="116" hits="1"/>
						<line number="117" hits="1"/>
						<line number="120" hits="1"/>
						<line number="121" hits="1"/>
						<line number="125" hits="1"/>
					</lines>
				</class>
			</classes>
		</package>
		<package name="app.models.concerns" line-rate="0.9" branch-rate="0" complexity="0">
			<classes>
				<class name="has_report_invite.py" filename="app/models/concerns/has_report_invite.py" complexity="0" line-rate="0.9" branch-rate="0">
					<methods/>
					<lines>
						<line number="1" hits="1"/>
						<line number="2" hits="1"/>
						<line number="3" hits="1"/>
						<line number="5" hits="1"/>
						<line number="8" hits="1"/>
						<line number="9" hits="1"/>
						<line number="15" hits="1"/>
						<line number="17" hits="1"/>
						<line number="18" hits="1"/>
						<line number="22" hits="0"/>
					</lines>
				</class>
			</classes>
		</package>
	</packages>
</coverage>
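
The report above happens to contain the kind of overlap this issue is about: `app/models/report_invite.py` and `app/models/concerns/has_report_invite.py` both end in `report_invite.py`. Below is a minimal sketch of why matching on a name suffix is ambiguous here, compared with matching on the full relative path. The function names are hypothetical and this is not cover-agent's actual matching code, only an illustration of the failure mode.

```python
from pathlib import PurePosixPath

# The filename attributes of the <class> entries in the report above.
REPORT_FILENAMES = [
    "test_app1.py",
    "test_app2.py",
    "app/models/report_invite.py",
    "app/models/concerns/has_report_invite.py",
]


def suffix_matches(filenames, target):
    """Naive matching: keep every entry that ends with the target's base name."""
    base_name = PurePosixPath(target).name
    return [f for f in filenames if f.endswith(base_name)]


def exact_matches(filenames, target):
    """Stricter matching: keep only entries whose full relative path is equal."""
    wanted = PurePosixPath(target)
    return [f for f in filenames if PurePosixPath(f) == wanted]


target = "app/models/report_invite.py"
print(suffix_matches(REPORT_FILENAMES, target))
# ['app/models/report_invite.py', 'app/models/concerns/has_report_invite.py']
print(exact_matches(REPORT_FILENAMES, target))
# ['app/models/report_invite.py']
```

With suffix matching, which entry gets used depends on the order of the report, so the coverage read back may belong to the wrong file. Comparing full relative paths avoids that, assuming the report paths and the `--source-file-path` value are relative to the same root.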

@j-ro
Author

j-ro commented Feb 19, 2025

@kelper-hub Huh, interesting! Thanks for helping me dig in here.

Can you say more about what you mean by coverage version? This is a Ruby on Rails app. The coverage.xml file is generated by the simplecov-cobertura gem, version 1.4.2 (https://github.com/dashingrocket/simplecov-cobertura).

It doesn't seem to have many options, as far as I can tell.

In terms of your log, mine looks similar. Here's an example of a run that I stopped manually: the target file had reached 100% coverage, but the run didn't stop on its own.

My command was this:

poetry --directory ~/CAN/site/qodo-cover run cover-agent \
  --source-file-path "app/workers/ladder_stats_worker.rb" \
  --test-file-path "spec/workers/ladder_stats_worker_spec.rb" \
  --code-coverage-report-path "coverage/coverage.xml" \
  --test-command "docker compose exec -e RAILS_ENV=test -e SPEC_RUN_OK=true  web rspec spec/workers/ladder_stats_worker_spec.rb" \
  --additional-instructions "act as an expert ruby programmer. focus on increasing test coverage of parts of the code that have the least. if a test you write errors, try to fix the error. use factories in the factories directory where useful. insert tests before the last end statement. IF A COMMENT IS PRESENT TELLING YOU TO #insert tests here INSERT TESTS AFTER THIS LINE! add in let statements as necessary for test setup. avoid manually creating IDs for objects, rather, use factories to create objects and use the objects returned in your tests. When using factories, use FactoryGirl, NOT FactoryBot syntax. Please mock external services like redis, elasticsearch, etc... especially if they error when you try and write a test. Be creative when trying to expand coverage, look at the source file for methods that haven't been tested or lines of code not exercised, rather than expanding only on tests that already exist. If the test file is basically blank, think about using factories and lets to set up things from scratch.  Use a dummy class and other mocks to get concerns working without a controller. all methods from the support folder outside of the it context. For basically blank test files, be careful and think step by step, building up context with mocks and lets needed to test methods. If a task is supposed to run in a neverending loop during normal operation, make sure to clean up after a test by interrupting the loop and stopping the task to avoid timeouts. for share option factories, make sure you create an action (event, petition, form, etc...) via factories to use with the share option factory. You are already within an RSpec.describe block, so DO NOT use RSpec.describe in your tests. Instead, simply use describe to start test blocks. DO NOT overwrite $redis, $redis_stats, or $redis_worker with a double, rather, use them as is and allow them to receive what you need.  
PREFIX EACH AND EVERY TEST WITH THIS COMMENT: '# generated with AI by cover-agent (https://github.com/Codium-ai/cover-agent)'" \
  --included-files spec/support/* spec/factories/ladders.rb \
  --desired-coverage 100 \
  --model "o3-mini" \
  --use-report-coverage-feature-flag

My starting coverage.xml file (with just the Rails test scaffolds, no actual tests) was this:

https://drive.google.com/file/d/11_wjLskwoCRwSGkCyUQahl7rzu9mBd_y/view?usp=sharing

My reading of that file is that my target file starts at a 67% coverage rate (it's a very small file with a lot of boilerplate, so this is expected). The overall coverage rate across every file touched by this spec run is 14%. Rails pulls in lots of extra files, so again, that's not unexpected.
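
To make the difference between those two numbers concrete, here is a rough sketch that computes both a per-file rate and an overall rate from a Cobertura-style report. The XML is a made-up miniature (file names and hit counts are illustrative, not taken from the linked report), and the helper names are hypothetical rather than anything from cover-agent.

```python
import xml.etree.ElementTree as ET

# Made-up miniature report: one small target file plus one untouched file.
SAMPLE_REPORT = """
<coverage>
  <packages>
    <package name="app">
      <classes>
        <class name="example_worker" filename="app/workers/example_worker.rb">
          <methods/>
          <lines>
            <line number="1" hits="1"/>
            <line number="2" hits="1"/>
            <line number="3" hits="0"/>
          </lines>
        </class>
        <class name="other_model" filename="app/models/other_model.rb">
          <methods/>
          <lines>
            <line number="1" hits="0"/>
            <line number="2" hits="0"/>
            <line number="3" hits="0"/>
            <line number="4" hits="0"/>
            <line number="5" hits="0"/>
            <line number="6" hits="0"/>
            <line number="7" hits="0"/>
          </lines>
        </class>
      </classes>
    </package>
  </packages>
</coverage>
"""


def line_stats(lines):
    """Return (covered, total) for an iterable of <line> elements."""
    covered = total = 0
    for line in lines:
        total += 1
        if int(line.get("hits", "0")) > 0:
            covered += 1
    return covered, total


def overall_rate(root):
    """Line rate across every file in the report.

    Assumes <methods/> is empty, as in the reports in this thread; otherwise
    lines nested under methods would be counted twice.
    """
    covered, total = line_stats(root.iter("line"))
    return covered / total if total else 0.0


def file_rate(root, filename):
    """Line rate for the one <class> whose filename matches exactly, else None."""
    for cls in root.iter("class"):
        if cls.get("filename") == filename:
            covered, total = line_stats(cls.iter("line"))
            return covered / total if total else 0.0
    return None


root = ET.fromstring(SAMPLE_REPORT)
print(f"{file_rate(root, 'app/workers/example_worker.rb'):.0%}")  # 67%
print(f"{overall_rate(root)}")  # 0.2
```

In this toy report the target file sits at 67% while the report as a whole sits at 20%, which is the same shape as a small, mostly covered file inside a large Rails report.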

Here's the log of my run:

/Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.12/lib/python3.12/site-packages/pydantic/_internal/_config.py:345: UserWarning: Valid config keys have changed in V2:
* 'fields' has been removed
  warnings.warn(message, UserWarning)
Error reading file spec/support/files: [Errno 21] Is a directory: 'spec/support/files'
Error reading file spec/support/shared_contexts: [Errno 21] Is a directory: 'spec/support/shared_contexts'
Error reading file spec/support/shared_examples: [Errno 21] Is a directory: 'spec/support/shared_examples'
Error reading file spec/support/files: [Errno 21] Is a directory: 'spec/support/files'
Error reading file spec/support/shared_contexts: [Errno 21] Is a directory: 'spec/support/shared_contexts'
Error reading file spec/support/shared_examples: [Errno 21] Is a directory: 'spec/support/shared_examples'
Error reading file spec/support/files: [Errno 21] Is a directory: 'spec/support/files'
Error reading file spec/support/shared_contexts: [Errno 21] Is a directory: 'spec/support/shared_contexts'
Error reading file spec/support/shared_examples: [Errno 21] Is a directory: 'spec/support/shared_examples'
Printing results from LLM model...
language: Ruby
testing_framework: RSpec
number_of_tests: 0
test_headers_indentation: 0

Printing results from LLM model...

language: Ruby
testing_framework: RSpec
number_of_tests: 0
relevant_line_number_to_insert_tests_after: 4
relevant_line_number_to_insert_imports_after: 1

2025-02-19 06:40:22,610 - cover_agent.UnitTestValidator - INFO - Running build/test command to generate coverage report: "docker compose exec -e RAILS_ENV=test -e SPEC_RUN_OK=true web rspec spec/workers/ladder_stats_worker_spec.rb"
2025-02-19 06:40:39,985 - cover_agent.UnitTestValidator - INFO - Using the report coverage feature flag to process the coverage report
2025-02-19 06:40:40,027 - cover_agent.UnitTestValidator - INFO - Total lines covered: 1156, Total lines missed: 6582, Total lines: 7738
2025-02-19 06:40:40,027 - cover_agent.UnitTestValidator - INFO - coverage: Percentage 14.94%
2025-02-19 06:40:40,027 - cover_agent.UnitTestValidator - INFO - Initial coverage: 14.94%
2025-02-19 06:40:40,027 - cover_agent.CoverAgent - INFO - Current Coverage: 14.94%
2025-02-19 06:40:40,027 - cover_agent.CoverAgent - INFO - Desired Coverage: 100%
Printing results from LLM model...

language: ruby
existing_test_function_signature: |
  RSpec.describe LadderStatsWorker, type: :worker do
new_tests:
- test_behavior: |
    Ensure that LadderStatsWorker logs the correct message and enqueues a job to Redis with the proper JSON payload.
  test_name: |
    test_enqueue_job_to_redis
  test_code: |
    # generated with AI by cover-agent (https://github.com/Codium-ai/cover-agent)
    describe '#perform' do
      let(:redis_double) { double('Redis') }
      let(:ladder_id) { 42 }
      let(:uuid) { 'user-uuid-123' }
      let(:stat) { 'exit_count' }
      let(:amount) { 10 }
      
      before do
        $redis_stats = redis_double
        allow(Rails.logger).to receive(:info)
        allow(SecureRandom).to receive(:uuid).and_return('static-generated-uuid')
      end

      it 'logs the correct message and calls Redis sadd with proper JSON payload' do
        expected_payload = { ladder_id: ladder_id, uuid: uuid, stat: stat, amount: amount, identifier: 'static-generated-uuid' }.to_json
        expect(redis_double).to receive(:sadd).with('ladder_stats', expected_payload)
        LadderStatsWorker.new.perform(ladder_id, uuid, stat, amount)
        expect(Rails.logger).to have_received(:info).with("LadderStatsWorker stat: #{ladder_id},#{uuid},#{stat},#{amount}")
      end
    end
  new_imports_code: |
    ""
  test_tags: |
    happy path
- test_behavior: |
    Ensure that LadderStatsWorker operates correctly when given a negative amount, still logging and enqueuing the job.
  test_name: |
    test_with_negative_amount
  test_code: |
    # generated with AI by cover-agent (https://github.com/Codium-ai/cover-agent)
    describe '#perform with negative amount' do
      let(:redis_double) { double('Redis') }
      let(:ladder_id) { 100 }
      let(:uuid) { 'negative-test-uuid' }
      let(:stat) { 'score' }
      let(:amount) { -5 }
      
      before do
        $redis_stats = redis_double
        allow(Rails.logger).to receive(:info)
        allow(SecureRandom).to receive(:uuid).and_return('neg-static-uuid')
      end

      it 'logs the message and enqueues the job to Redis even with negative amount' do
        expected_payload = { ladder_id: ladder_id, uuid: uuid, stat: stat, amount: amount, identifier: 'neg-static-uuid' }.to_json
        expect(redis_double).to receive(:sadd).with('ladder_stats', expected_payload)
        LadderStatsWorker.new.perform(ladder_id, uuid, stat, amount)
        expect(Rails.logger).to have_received(:info).with("LadderStatsWorker stat: #{ladder_id},#{uuid},#{stat},#{amount}")
      end
    end
  new_imports_code: |
    ""
  test_tags: |
    edge case

2025-02-19 06:40:55,401 - cover_agent.UnitTestValidator - INFO - Running test with the following command: "docker compose exec -e RAILS_ENV=test -e SPEC_RUN_OK=true web rspec spec/workers/ladder_stats_worker_spec.rb"
2025-02-19 06:41:12,185 - cover_agent.UnitTestValidator - INFO - Using the report coverage feature flag to process the coverage report
2025-02-19 06:41:12,233 - cover_agent.UnitTestValidator - INFO - Total lines covered: 1247, Total lines missed: 7010, Total lines: 8257
2025-02-19 06:41:12,233 - cover_agent.UnitTestValidator - INFO - coverage: Percentage 15.1%
2025-02-19 06:41:12,234 - cover_agent.UnitTestValidator - ERROR - Error validating test: 'app/models/archived_subscription.rb'
2025-02-19 06:41:12,255 - cover_agent.UnitTestValidator - INFO - Running test with the following command: "docker compose exec -e RAILS_ENV=test -e SPEC_RUN_OK=true web rspec spec/workers/ladder_stats_worker_spec.rb"
2025-02-19 06:41:27,497 - cover_agent.UnitTestValidator - INFO - Using the report coverage feature flag to process the coverage report
2025-02-19 06:41:27,531 - cover_agent.UnitTestValidator - INFO - Total lines covered: 1247, Total lines missed: 7010, Total lines: 8257
2025-02-19 06:41:27,531 - cover_agent.UnitTestValidator - INFO - coverage: Percentage 15.1%
2025-02-19 06:41:27,532 - cover_agent.UnitTestValidator - ERROR - Error validating test: 'app/models/archived_subscription.rb'
2025-02-19 06:41:27,535 - cover_agent.UnitTestValidator - INFO - Running build/test command to generate coverage report: "docker compose exec -e RAILS_ENV=test -e SPEC_RUN_OK=true web rspec spec/workers/ladder_stats_worker_spec.rb"
2025-02-19 06:41:46,306 - cover_agent.UnitTestValidator - INFO - Using the report coverage feature flag to process the coverage report
2025-02-19 06:41:46,340 - cover_agent.UnitTestValidator - INFO - Total lines covered: 1247, Total lines missed: 7010, Total lines: 8257
2025-02-19 06:41:46,340 - cover_agent.UnitTestValidator - INFO - coverage: Percentage 15.1%
2025-02-19 06:41:46,341 - cover_agent.UnitTestValidator - INFO - Initial coverage: 15.1%
2025-02-19 06:41:46,341 - cover_agent.CoverAgent - INFO - Current Coverage: 15.1%
2025-02-19 06:41:46,341 - cover_agent.CoverAgent - INFO - Desired Coverage: 100%
Printing results from LLM model...

language: ruby
existing_test_function_signature: |
  RSpec.describe LadderStatsWorker, type: :worker do
new_tests:
- test_behavior: |
    Tests that perform method logs the correct message using Rails.logger.info.
  test_name: |
    logs_correct_message
  test_code: |
    # generated with AI by cover-agent (https://github.com/Codium-ai/cover-agent)
    describe '#perform logging' do
      let(:ladder_id) { 1 }
      let(:uuid) { 'test-uuid' }
      let(:stat) { 'win_count' }
      let(:amount) { 3 }
      
      it 'logs the correct message' do
        expect(Rails.logger).to receive(:info).with("LadderStatsWorker stat: #{ladder_id},#{uuid},#{stat},#{amount}")
        allow($redis_stats).to receive(:sadd)  # Stub redis to avoid actual call.
        LadderStatsWorker.new.perform(ladder_id, uuid, stat, amount)
      end
    end
  new_imports_code: |
    ""
  test_tags: happy path
- test_behavior: |
    Tests that the perform method enqueues a correctly formatted JSON payload to $redis_stats using sadd.
  test_name: |
    enqueues_correct_json_payload
  test_code: |
    # generated with AI by cover-agent (https://github.com/Codium-ai/cover-agent)
    describe '#perform redis enqueue' do
      let(:ladder_id) { 2 }
      let(:uuid) { 'user-123' }
      let(:stat) { 'exit_count' }
      let(:amount) { 5 }
      let(:fake_identifier) { 'fixed-uuid' }
  
      it 'calls $redis_stats.sadd with the correct JSON payload' do
        allow(SecureRandom).to receive(:uuid).and_return(fake_identifier)
        expected_payload = { ladder_id: ladder_id, uuid: uuid, stat: stat, amount: amount, identifier: fake_identifier }.to_json
        expect($redis_stats).to receive(:sadd).with("ladder_stats", expected_payload)
        LadderStatsWorker.new.perform(ladder_id, uuid, stat, amount)
      end
    end
  new_imports_code: |
    ""
  test_tags: happy path
- test_behavior: |
    Tests that the Sidekiq worker queue option is set to ladder_stats.
  test_name: |
    worker_queue_option_set
  test_code: |
    # generated with AI by cover-agent (https://github.com/Codium-ai/cover-agent)
    describe 'Sidekiq configuration' do
      it 'has the correct queue option set' do
        # Access the sidekiq options via get_sidekiq_options hash.
        expect(LadderStatsWorker.get_sidekiq_options['queue']).to eq('ladder_stats')
      end
    end
  new_imports_code: |
    ""
  test_tags: happy path

2025-02-19 06:42:01,957 - cover_agent.UnitTestValidator - INFO - Running test with the following command: "docker compose exec -e RAILS_ENV=test -e SPEC_RUN_OK=true web rspec spec/workers/ladder_stats_worker_spec.rb"
2025-02-19 06:42:18,687 - cover_agent.UnitTestValidator - INFO - Using the report coverage feature flag to process the coverage report
2025-02-19 06:42:18,711 - cover_agent.UnitTestValidator - INFO - Total lines covered: 1247, Total lines missed: 7010, Total lines: 8257
2025-02-19 06:42:18,711 - cover_agent.UnitTestValidator - INFO - coverage: Percentage 15.1%
2025-02-19 06:42:18,711 - cover_agent.UnitTestValidator - INFO - Test did not increase coverage. Rolling back.
2025-02-19 06:42:18,718 - cover_agent.UnitTestValidator - INFO - Running test with the following command: "docker compose exec -e RAILS_ENV=test -e SPEC_RUN_OK=true web rspec spec/workers/ladder_stats_worker_spec.rb"
2025-02-19 06:42:35,069 - cover_agent.UnitTestValidator - INFO - Using the report coverage feature flag to process the coverage report
2025-02-19 06:42:35,094 - cover_agent.UnitTestValidator - INFO - Total lines covered: 1247, Total lines missed: 7010, Total lines: 8257
2025-02-19 06:42:35,094 - cover_agent.UnitTestValidator - INFO - coverage: Percentage 15.1%
2025-02-19 06:42:35,094 - cover_agent.UnitTestValidator - INFO - Test did not increase coverage. Rolling back.
2025-02-19 06:42:35,099 - cover_agent.UnitTestValidator - INFO - Running test with the following command: "docker compose exec -e RAILS_ENV=test -e SPEC_RUN_OK=true web rspec spec/workers/ladder_stats_worker_spec.rb"
2025-02-19 06:42:51,055 - cover_agent.UnitTestValidator - INFO - Skipping a generated test that failed
Printing results from LLM model...
The test failed because the expected Sidekiq queue option value is a string ("ladder_stats") while the worker sets it as a symbol (:ladder_stats).

Recommended Fixes:
• Update the test expectation to compare with :ladder_stats instead of "ladder_stats".
• Alternatively, modify the worker’s sidekiq_options to set the queue as "ladder_stats" (a string) instead of a symbol.
ERROR:root:Error message summary:
The test failed because the expected Sidekiq queue option value is a string ("ladder_stats") while the worker sets it as a symbol (:ladder_stats).

Recommended Fixes:
• Update the test expectation to compare with :ladder_stats instead of "ladder_stats".
• Alternatively, modify the worker’s sidekiq_options to set the queue as "ladder_stats" (a string) instead of a symbol.
2025-02-19 06:42:54,004 - cover_agent.UnitTestValidator - INFO - Running build/test command to generate coverage report: "docker compose exec -e RAILS_ENV=test -e SPEC_RUN_OK=true web rspec spec/workers/ladder_stats_worker_spec.rb"
2025-02-19 06:43:10,445 - cover_agent.UnitTestValidator - INFO - Using the report coverage feature flag to process the coverage report
2025-02-19 06:43:10,469 - cover_agent.UnitTestValidator - INFO - Total lines covered: 1247, Total lines missed: 7010, Total lines: 8257
2025-02-19 06:43:10,469 - cover_agent.UnitTestValidator - INFO - coverage: Percentage 15.1%
2025-02-19 06:43:10,469 - cover_agent.UnitTestValidator - INFO - Initial coverage: 15.1%
2025-02-19 06:43:10,469 - cover_agent.CoverAgent - INFO - Current Coverage: 15.1%
2025-02-19 06:43:10,469 - cover_agent.CoverAgent - INFO - Desired Coverage: 100%
^CTraceback (most recent call last):
File "<string>", line 1, in <module>
File "/Users/jasonrosenbaum/CAN/site/qodo-cover/cover_agent/main.py", line 139, in main
agent.run()
File "/Users/jasonrosenbaum/CAN/site/qodo-cover/cover_agent/CoverAgent.py", line 301, in run
self.run_test_gen(failed_test_runs, language, test_framework, coverage_report)
File "/Users/jasonrosenbaum/CAN/site/qodo-cover/cover_agent/CoverAgent.py", line 229, in run_test_gen
generated_tests_dict = self.test_gen.generate_tests(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jasonrosenbaum/CAN/site/qodo-cover/cover_agent/UnitTestGenerator.py", line 219, in generate_tests
self.agent_completion.generate_tests(
File "/Users/jasonrosenbaum/CAN/site/qodo-cover/cover_agent/DefaultAgentCompletion.py", line 22, in generate_tests
response, prompt_tokens, completion_tokens = self.caller.call_model(prompt)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jasonrosenbaum/CAN/site/qodo-cover/cover_agent/AICaller.py", line 29, in wrapper
return retry_wrapper()
^^^^^^^^^^^^^^^
File "/Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.12/lib/python3.12/site-packages/tenacity/__init__.py", line 336, in wrapped_f
return copy(f, *args, **kw)
^^^^^^^^^^^^^^^^^^^^
File "/Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.12/lib/python3.12/site-packages/tenacity/__init__.py", line 475, in __call__
do = self.iter(retry_state=retry_state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.12/lib/python3.12/site-packages/tenacity/__init__.py", line 376, in iter
result = action(retry_state)
^^^^^^^^^^^^^^^^^^^
File "/Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.12/lib/python3.12/site-packages/tenacity/__init__.py", line 398, in <lambda>
self._add_action_func(lambda rs: rs.outcome.result())
^^^^^^^^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.12/3.12.8/Frameworks/Python.framework/Versions/3.12/lib/python3.12/concurrent/futures/_base.py", line 449, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.12/3.12.8/Frameworks/Python.framework/Versions/3.12/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.12/lib/python3.12/site-packages/tenacity/__init__.py", line 478, in __call__
result = fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/Users/jasonrosenbaum/CAN/site/qodo-cover/cover_agent/AICaller.py", line 27, in retry_wrapper
return func(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jasonrosenbaum/CAN/site/qodo-cover/cover_agent/AICaller.py", line 108, in call_model
response = litellm.completion(**completion_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.12/lib/python3.12/site-packages/litellm/utils.py", line 1032, in wrapper
result = original_function(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.12/lib/python3.12/site-packages/litellm/main.py", line 1636, in completion
response = openai_chat_completions.completion(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.12/lib/python3.12/site-packages/litellm/llms/openai/openai.py", line 635, in completion
self.make_sync_openai_chat_completion_request(
File "/Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.12/lib/python3.12/site-packages/litellm/litellm_core_utils/logging_utils.py", line 145, in sync_wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.12/lib/python3.12/site-packages/litellm/llms/openai/openai.py", line 436, in make_sync_openai_chat_completion_request
raw_response = openai_client.chat.completions.with_raw_response.create(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.12/lib/python3.12/site-packages/openai/_legacy_response.py", line 364, in wrapped
return cast(LegacyAPIResponse[R], func(*args, **kwargs))
^^^^^^^^^^^^^^^^^^^^^
File "/Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.12/lib/python3.12/site-packages/openai/_utils/_utils.py", line 279, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.12/lib/python3.12/site-packages/openai/resources/chat/completions.py", line 863, in create
return self._post(
^^^^^^^^^^^
File "/Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.12/lib/python3.12/site-packages/openai/_base_client.py", line 1283, in post
return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.12/lib/python3.12/site-packages/openai/_base_client.py", line 960, in request
return self._request(
^^^^^^^^^^^^^^
File "/Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.12/lib/python3.12/site-packages/openai/_base_client.py", line 996, in _request
response = self._client.send(
^^^^^^^^^^^^^^^^^^
File "/Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.12/lib/python3.12/site-packages/httpx/_client.py", line 926, in send
response = self._send_handling_auth(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.12/lib/python3.12/site-packages/httpx/_client.py", line 954, in _send_handling_auth
response = self._send_handling_redirects(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.12/lib/python3.12/site-packages/httpx/_client.py", line 991, in _send_handling_redirects
response = self._send_single_request(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.12/lib/python3.12/site-packages/httpx/_client.py", line 1027, in _send_single_request
response = transport.handle_request(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.12/lib/python3.12/site-packages/httpx/_transports/default.py", line 236, in handle_request
resp = self._pool.handle_request(req)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.12/lib/python3.12/site-packages/httpcore/_sync/connection_pool.py", line 256, in handle_request
raise exc from None
File "/Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.12/lib/python3.12/site-packages/httpcore/_sync/connection_pool.py", line 236, in handle_request
response = connection.handle_request(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.12/lib/python3.12/site-packages/httpcore/_sync/connection.py", line 103, in handle_request
return self._connection.handle_request(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.12/lib/python3.12/site-packages/httpcore/_sync/http11.py", line 136, in handle_request
raise exc
File "/Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.12/lib/python3.12/site-packages/httpcore/_sync/http11.py", line 106, in handle_request
) = self._receive_response_headers(**kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.12/lib/python3.12/site-packages/httpcore/_sync/http11.py", line 177, in _receive_response_headers
event = self._receive_event(timeout=timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.12/lib/python3.12/site-packages/httpcore/_sync/http11.py", line 217, in _receive_event
data = self._network_stream.read(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jasonrosenbaum/Library/Caches/pypoetry/virtualenvs/cover-agent-QB1wIuOP-py3.12/lib/python3.12/site-packages/httpcore/_backends/sync.py", line 128, in read
return self._sock.recv(max_bytes)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.12/3.12.8/Frameworks/Python.framework/Versions/3.12/lib/python3.12/ssl.py", line 1232, in recv
return self.read(buflen)
^^^^^^^^^^^^^^^^^
File "/usr/local/Cellar/python@3.12/3.12.8/Frameworks/Python.framework/Versions/3.12/lib/python3.12/ssl.py", line 1105, in read
return self._sslobj.read(len)
^^^^^^^^^^^^^^^^^^^^^^
KeyboardInterrupt

My reading of this log is:

  • It runs the tests to get an initial coverage of 14.94%, read from the top-level coverage number in the coverage XML file. So far, as expected.
  • It gets some tests from the LLM and tries them.
  • It immediately finds a working test that increases overall coverage: the percentage rises to 15.1% and the total lines covered rises too.
  • Note that it doesn't call out that the source file specifically increased, but I'm watching coverage.xml and the source file is now 100% covered after one test.
  • It keeps the one test it found that increased coverage (good).
  • It proceeds to the other test it got from the LLM. The second one also appears to increase coverage (though seemingly in the same way as the first, since neither the lines covered nor the percentage changes), but the test is kept for some reason.
  • It now runs the entire test file to check overall coverage and finds 15.1%, which is accurate, but the key file is now 100% covered. It doesn't see that, so it keeps going.
  • It gets another round of tests from the LLM; most of them don't increase overall coverage and are thrown out, and one errors.
  • It keeps going for a third round (this is where I manually cut it off).

The coverage.xml file at the end looks like this:

https://drive.google.com/file/d/1fchHl53a9_3FxEBmkVXOC_gsYOzTAjOD/view?usp=sharing

There you can see full coverage of my target file.
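
The totals printed in the log are enough to reproduce the percentages and to show why a stop check against the overall figure never fires in this run. This is a small sketch, assuming the percentage is simply covered lines over total lines rounded to two decimals (an assumption based on the log output, not on cover-agent's source). The 100% target-file figure is taken from the reading above, and the function names are hypothetical.

```python
def percent(covered: int, total: int) -> float:
    """Line coverage as a percentage, rounded to two decimals."""
    return round(100 * covered / total, 2)


def reached_goal(coverage_percent: float, desired_percent: float) -> bool:
    """True once the measured coverage meets the desired coverage."""
    return coverage_percent >= desired_percent


DESIRED = 100.0  # from --desired-coverage 100

# "Total lines covered" and "Total lines" as printed in the log.
overall_initial = percent(1156, 7738)
overall_after = percent(1247, 8257)

# Target file coverage after the first kept test, per the final coverage.xml.
target_file_after = 100.0

print(overall_initial, overall_after)            # 14.94 15.1
print(reached_goal(overall_after, DESIRED))      # False
print(reached_goal(target_file_after, DESIRED))  # True
```

Checked against the whole report, the goal is effectively unreachable for a single spec file in a Rails app; checked against the target file alone, the run would have stopped after the first round.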
