Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: allow package-loading prior to lint hook #440

Open
russHyde opened this issue Aug 12, 2022 · 18 comments
Open

Feature: allow package-loading prior to lint hook #440

russHyde opened this issue Aug 12, 2022 · 18 comments

Comments

@russHyde
Copy link

russHyde commented Aug 12, 2022

Is your feature request related to a problem? Please describe.
When running the lintr hook on a package, it flags many object_usage_linter errors.
This happens because, unless the package is loaded, lintr is unable to distinguish when a referenced object is completely undefined from when it is defined within the package.
Typically, we would lint the whole package in CI but use pre-commit to only lint those files in a commit set.
We could turn off the object_usage_linter in the .lintr config, but that config affects both precommit and CI runs of lintr, so would mean that this useful linter would be unavailable in CI.

Describe the solution you'd like
Would it be possible to add a command line option to the pre-commit linting script that determines whether the package in a repo should be loaded prior to running lintr. For example, Rscript lintr.R <files> would run as currently implemented. But Rscript lintr.R --load-package <files> would run pkgload::load_all() before running lintr on the files.

@lorenzwalthert
Copy link
Owner

Thanks @russHyde. What about the option to lint the whole package instead of the staged files? Is that also an option? I prefer to not use pkgload::load_all() because that requires all dependencies of the package under linting to be in the same {renv} where {precommit} is installed...

@russHyde
Copy link
Author

That's interesting, I hadn't considered the environment where precommit/lintr is running. This might be something better handled on our (lintr's) end - for example, having a way to selectively deactivate linters based on environment variables, so that some linters are only active in CI.

@lorenzwalthert
Copy link
Owner

Is there smart caching built in in {lintr}, or is running it on all files significantly slower?

@lorenzwalthert
Copy link
Owner

lorenzwalthert commented Aug 13, 2022

any thoughts @MichaelChirico?

@MichaelChirico
Copy link

lintr is typically taking O(10 seconds) on full packages, but yes there is an expression-level caching mechanism

@lorenzwalthert
Copy link
Owner

having a way to selectively deactivate linters based on environment variables, so that some linters are only active in CI.

I think this sounds like introducing intransparency. Another option that we considered in #393 is to use language: script and run the hook in the global R environment. However, this also means that the check can't be run in pre-commit.ci (which is fine for that specific hook, but not for lintr hooks). Alternatively, we could have a hook running locally (and skipped on ci) and one running only on CI (but skipped locally). However, these are all workarounds for me.

lintr is typically taking O(10 seconds) on full packages, but yes there is an expression-level caching mechanism

If you cache files, would it get significantly faster? Then that would be the solution in my eyes (offloading my own work to others has always been my easy way out 😄).

@lorenzwalthert
Copy link
Owner

Also note that theoretically, {lintr} does not have to offer the functionality to deactivate linters based on environment variables, even if we wanted to use env variables (or pre-commit arguments in .pre-commit-config.yaml). Because in our hook script, we could also load the {lintr} config, modify it and then call {lintr} with passing the modified contents explicitly as config (I guess this should work). Or just make behaviour different for local vs cloud execution (but language is fix), but that also is not very transparent to me.

@russHyde
Copy link
Author

I agree that the env-variables make it less transparent. I might experiment with writing a lintr hook that allows the user to specify a lintr config, so that I can have a .lintr.pre-commit (for use in local pre-commit) and a .lintr config (for use interactively and in CI)

@lorenzwalthert
Copy link
Owner

Yeah ok. Here's how I'd do it.

  • For local, you could also use language: script to use the global R library (so forget about additional_dependencies:). devtool::load_all() and then lint only files to be committed.
  • For remote, use language: r that uses another config file and runs lintr on the whole package. Since it's impossible to skip hooks locally and only run remotely IIRC, you could at the top of this script check if CI env variable is set (I believe it's set in pre-commit.ci) and skip the remainder.

@joelnitta
Copy link

joelnitta commented Nov 16, 2022

For local, you could also use language: script to use the global R library (so forget about additional_dependencies:). devtool::load_all() and then lint only files to be committed.

@lorenzwalthert I'm not familiar enough with the setup of .pre-commit-config.yaml to figure out how to apply the solution you propose. Do you mind clarifying that a bit? Below is my current .pre-commit-config.yaml; how should I change it to use the global R library as you suggest? (I'm only concerned with doing this locally, not on remote)

# All available hooks: https://pre-commit.com/hooks.html
# R specific hooks: https://github.com/lorenzwalthert/precommit
repos:
-   repo: https://github.com/lorenzwalthert/precommit
    rev: v0.3.2.9003
    hooks: 
    -   id: style-files
        args: [--style_pkg=styler, --style_fun=tidyverse_style]    
    -   id: roxygenize
        # roxygen requires loading pkg -> add dependencies from DESCRIPTION
        additional_dependencies:
        -    assertr
        -    assertthat
        -    digest
        -    dplyr
        -    glue
        -    methods
        -    purrr
        -    stringr
        -    tibble
        -    tidyr
        -    utils
        -    anthonynorth/roxyglobals
    # codemeta must be above use-tidy-description when both are used
    -   id: codemeta-description-updated
    -   id: use-tidy-description
    -   id: spell-check
        exclude: >
          (?x)^(
          .*\.[rR]|
          .*\.feather|
          .*\.jpeg|
          .*\.pdf|
          .*\.png|
          .*\.py|
          .*\.RData|
          .*\.rds|
          .*\.Rds|
          .*\.Rproj|
          .*\.sh|
          (.*/|)\.gitignore|
          (.*/|)\.gitlab-ci\.yml|
          (.*/|)\.lintr|
          (.*/|)\.pre-commit-.*|
          (.*/|)\.Rbuildignore|
          (.*/|)\.Renviron|
          (.*/|)\.Rprofile|
          (.*/|)\.travis\.yml|
          (.*/|)appveyor\.yml|
          (.*/|)NAMESPACE|
          (.*/|)renv/settings\.dcf|
          (.*/|)renv\.lock|
          (.*/|)WORDLIST|
          \.github/workflows/.*|
          data/.*|
          )$
    -   id: lintr
    -   id: readme-rmd-rendered
    -   id: parsable-R
    -   id: no-browser-statement
    -   id: no-debug-statement
    -   id: deps-in-desc
-   repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.3.0
    hooks: 
    -   id: check-added-large-files
        args: ['--maxkb=200']
    -   id: file-contents-sorter
        files: '^\.Rbuildignore$'
    -   id: end-of-file-fixer
        exclude: >
          (?x)^(
          .*\.Rd|
          tests/testthat/_snaps/.*\.md|
          )$
-   repo: https://github.com/pre-commit-ci/pre-commit-ci-config
    rev: v1.5.1
    hooks:
    # Only reuiqred when https://pre-commit.ci is used for config validation
    -   id: check-pre-commit-ci-config
-   repo: local
    hooks:
    -   id: forbid-to-commit
        name: Don't commit common R artifacts
        entry: Cannot commit .Rhistory, .RData, .Rds or .rds.
        language: fail
        files: '\.(Rhistory|RData|Rds|rds)$'
        # `exclude: <regex>` to allow committing specific files

ci:
    autoupdate_schedule: monthly

@lorenzwalthert
Copy link
Owner

lorenzwalthert commented Nov 16, 2022

Like this:

-   repo: local
    hooks:
    -   id: local-lintr
        name: Run lintr with global R environment
        entry: Rscript -e "devtools::load_all(); output <- lintr::lint(commandArgs(trailingOnly=TRUE)); print(output); if (length(output) > 0) stop('Files not lint free')"
        language: system
        files: '(\.[rR]profile|\.R|\.Rmd|\.Rnw|\.r|\.rmd|\.rnw)$'
  

You could also move the R code in entry to a file and use language: script with entry: path/to/script.R where the script must be executable by the user.

@lorenzwalthert
Copy link
Owner

Pre-commit passes the file names to entry as unnamed arguments.

@joelnitta
Copy link

@lorenzwalthert thanks!

However, I noticed this only works if a single R file is being committed, since lintr::lint() only works on a single file.

Below I have modified it to work on one or more R files:

-   repo: local
    hooks:
    -   id: local-lintr
        name: Run lintr with global R environment
        entry: > 
          Rscript -e "devtools::load_all();
          output <- commandArgs(trailingOnly=TRUE) |> strsplit(' ') |>
          unlist() |> lapply(lintr::lint) |> purrr::compact();
          if (length(output) > 0) {print(output); stop('Files not lint free')}"
        language: system
        files: '(\.[rR]profile|\.R|\.Rmd|\.Rnw|\.r|\.rmd|\.rnw)$'

@joelnitta
Copy link

(BTW I would be interested in running this with CI also if you could provide more details on that as well)

@lorenzwalthert
Copy link
Owner

Thanks for your snipped. Also, you may want to print lilntr::lint() output, so in case the hook does not pass, the problems are displayed.

For CI, this does not work (system hooks are not supported with pre-commit.ci). Potentially later with GitHub Actions and #450, but for now, maybe you could just use the lintr action from r-lib/actions?

@joelnitta
Copy link

It does print the output (I just ran a test to verify): {print(output); stop('Files not lint free')}

So for CI, there is no need to modify .pre-commit-config.yaml right? The CI will just ignore - repo: local?

@lorenzwalthert
Copy link
Owner

Not sure it skips the local hook by default, you could also skip it explicitly as mentioned in the pre-commit.ci docs:

ci:
    skip: [pylint]

repos:
-   repo: local
    hooks:
    -   id: pylint
        # ...

@lorenzwalthert
Copy link
Owner

Great you achieved what you wanted. A solution to make this work withouth local hook would still be to

  • add a flag --load-package and call pkgload::load_all() if the flag is set (no need for it to be installed in the renv if the flag is not set
  • and to require the user to list all dependencies plus {pkgload} in additional_dependencies.

Or to use language: system for the hook and let useres use pre-commit CI lite to manage dependencies (instead of renv).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants