[WIP] Add make job to refresh integration tests #1728

Draft · wants to merge 1 commit into base: master
Conversation

fmdkdd
Member

fmdkdd commented Apr 4, 2020

This is a proof of concept. Running make integ-regen will run all the integration tests, but instead of checking flycheck-current-errors against the expected form passed to flycheck-ert-should-syntax-check, it replaces that form in flycheck-test.el directly with the value of -current-errors.

That way, a subsequent make integ will pass all tests. This greatly simplifies the workflow of updating the integration tests' expected output: just run make integ-regen after a CI failure, check the git diff for anything suspicious, then commit and push.

The current PoC is very hackish and probably inefficient, but it's a good starting point for discussion. The two main ingredients are:

  • make integ-regen adds a load of setup-regen.el, so that we can redefine -ert-should-syntax-check to call -ert-regen-errors instead of -ert-should-errors (and skip the length check on errors)
  • -ert-regen-errors locates the expected errors in flycheck-test.el (using a ghastly regexp) and replaces that text with -current-errors turned into a string
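For the sake of discussion, here is a hedged Emacs Lisp sketch of what the two ingredients above could look like. This is not the actual patch: the regexp is a hypothetical placeholder for the "ghastly" one in the PoC, and the advice-based redefinition is just one way to swap the assertion out.

(require 'flycheck-ert)

(defconst flycheck-regen--expected-form-regexp
  "(flycheck-ert-should-syntax-check[^)]*"  ; placeholder, not the real regexp
  "Regexp locating the expected-errors form for the current test.")

(defun flycheck-ert-regen-errors (&rest _expected)
  "Ignore the expected errors and overwrite them in flycheck-test.el
with what the checker actually reported (`flycheck-current-errors')."
  (with-current-buffer (find-file-noselect "flycheck-test.el")
    (goto-char (point-min))
    (when (re-search-forward flycheck-regen--expected-form-regexp nil t)
      (replace-match (prin1-to-string flycheck-current-errors) t t))
    (save-buffer)))

;; Swap the assertion out, so running the suite rewrites expectations
;; instead of comparing against them.
(advice-add 'flycheck-ert-should-errors :override
            #'flycheck-ert-regen-errors)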

First: is this sound? And do we want to go that route?

(Note that I have only tested this on the Rust tests, so it may well fail on other tests that construct errors a bit differently.)

@cpitclaudel
Member

That's a very good step. I think it's reasonable, though I'd been thinking of something slightly different: I think things would work better if we separated out the parsing tests. Basically, we'd have two things: checker inputs and configuration (local variables, etc.), and checker output and expected errors.

Concretely, for the following sample:

main() {
    return false;
}

We'd save the checker output into a file:

<stdin>:1:1: warning: return type defaults to ‘int’ [-Wimplicit-int]
<stdin>: In function ‘main’:
<stdin>:2:12: error: ‘false’ undeclared (first use in this function)
<stdin>:2:12: note: each undeclared identifier is reported only once for each function it appears in

As well as the expected errors:

(#s(flycheck-error #<buffer tmp.c> c/c++-gcc "/home/clement/.emacs.d/lisp/flycheck/tmp.c" 1 1 "return type defaults to ‘int’ " warning "-Wimplicit-int" nil nil nil)
 #s(flycheck-error #<buffer tmp.c> c/c++-gcc "/home/clement/.emacs.d/lisp/flycheck/tmp.c" 2 12 "‘false’ undeclared (first use in this function)" error nil nil nil nil)
 #s(flycheck-error #<buffer tmp.c> c/c++-gcc "/home/clement/.emacs.d/lisp/flycheck/tmp.c" 2 12 "each undeclared identifier is reported only once for each function it appears in" info nil nil nil nil))

Checking that the checkers parse errors properly is then just a matter of calling flycheck-parse-with-patterns on the saved output (with a bit of post-processing of the errors to remove the buffers and strip the file names), and regeneration is just a matter of saving the output of the checker.
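The replay step described above could look roughly like this. flycheck-parse-with-patterns is flycheck's pattern-based parser; the file name and the exact post-processing are assumptions for illustration, not a settled design.

;; Hedged sketch: replay a recorded transcript through flycheck's parser,
;; then strip the buffer and filename fields so the result compares equal
;; to the saved expected errors.
(with-temp-buffer
  (insert-file-contents "c-gcc.recorded-output")  ; assumed file layout
  (let ((errors (flycheck-parse-with-patterns
                 (buffer-string) 'c/c++-gcc (current-buffer))))
    (dolist (err errors)
      (setf (flycheck-error-buffer err) nil
            (flycheck-error-filename err) nil))
    errors))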

For configuration and checker parameters we can use buffer-local or dir-local variables, and with that scheme we can even run tests in parallel.
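As a sketch of the dir-local idea, a test-input directory could carry something like the following .dir-locals.el; the variables are real flycheck options, but the values are examples only, not part of this proposal.

;; Hypothetical .dir-locals.el for one directory of checker inputs.
((c-mode . ((flycheck-gcc-language-standard . "c99")
            (flycheck-disabled-checkers . (c/c++-clang)))))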

WDYT?

@fmdkdd
Member Author

fmdkdd commented Apr 6, 2020

I think splitting the integration tests into two phases could be a better thing in the long run, yes. Somehow it may also make regenerating the checker output more straightforward, since that could be made into a separate job/script.

Though I do like that our integration tests run through the same gears as a user-triggered check would. That's the point behind integration tests after all. If there is a way to maintain this guarantee, I'd sleep more soundly.

@cpitclaudel
Member

> Somehow it may also make regenerating the checker output more straightforward, since that could be made into a separate job/script.

Yup, that's the hope.

> Though I do like that our integration tests run through the same gears as a user-triggered check would.

That's fair. With the scheme I propose, the way to do this would be to run a mock binary instead of the real one, which would return the recorded output instead of calling the checker.
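A minimal sketch of that mock binary, assuming a stand-in script that ignores its arguments and replays a recorded transcript on stderr, the way gcc would. All file names here are made up for illustration; the PR does not fix a layout.

```shell
# Record one line of "checker output" as the transcript to replay.
printf '%s\n' '<stdin>:2:12: error: test error' > recorded-output.txt

mock_gcc() {
  cat recorded-output.txt >&2  # replay the recorded checker output
  return 1                     # gcc exits non-zero when it finds errors
}

# The integration test would invoke this instead of the real gcc,
# still exercising flycheck's full command pipeline.
mock_gcc tmp.c 2> errors.txt
echo "mock exit status: $?"
```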
