Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

glob patterns should not be overriden by specific --per-file-ignores rules #670

Open
asottile opened this issue Apr 3, 2021 · 18 comments
Open

Comments

@asottile
Copy link
Member

asottile commented Apr 3, 2021

In GitLab by @qxv0 on Jan 30, 2019, 06:59

flake8 --bug-report:

{
  "dependencies": [
    {
      "dependency": "entrypoints",
      "version": "0.3"
    }
  ],
  "platform": {
    "python_implementation": "CPython",
    "python_version": "3.7.2",
    "system": "Windows"
  },
  "plugins": [
    {
      "is_local": false,
      "plugin": "ProxyChecker",
      "version": "0.0.1"
    },
    {
      "is_local": false,
      "plugin": "flake8-comprehensions",
      "version": "1.4.1"
    },
    {
      "is_local": false,
      "plugin": "flake8-docstrings",
      "version": "1.3.0, pydocstyle: 3.0.0"
    },
    {
      "is_local": false,
      "plugin": "flake8-future-import",
      "version": "0.4.5"
    },
    {
      "is_local": false,
      "plugin": "flake8-mock",
      "version": "0.3"
    },
    {
      "is_local": false,
      "plugin": "flake8-no-u-prefixed-strings",
      "version": "0.2"
    },
    {
      "is_local": false,
      "plugin": "flake8-print",
      "version": "3.1.0"
    },
    {
      "is_local": false,
      "plugin": "flake8-string-format",
      "version": "0.2.3"
    },
    {
      "is_local": false,
      "plugin": "flake8-tuple",
      "version": "0.2.13"
    },
    {
      "is_local": false,
      "plugin": "flake8_coding",
      "version": "1.3.1"
    },
    {
      "is_local": false,
      "plugin": "flake8_quotes",
      "version": "1.0.0"
    },
    {
      "is_local": false,
      "plugin": "hacking.core",
      "version": "0.0.1"
    },
    {
      "is_local": false,
      "plugin": "mccabe",
      "version": "0.6.1"
    },
    {
      "is_local": false,
      "plugin": "naming",
      "version": "0.8.1"
    },
    {
      "is_local": false,
      "plugin": "pycodestyle",
      "version": "2.5.0"
    },
    {
      "is_local": false,
      "plugin": "pyflakes",
      "version": "2.1.0"
    }
  ],
  "version": "3.7.1"
}

Say we want to ignore D102 errors in all files within the tests directory. We can add a rule for each file individually, but it would be more efficient if we could do that just by adding tests/* : N813 to the config file (similar to the way flake8-per-file-ignores plugin works on flake8<3.7.0). see the comments below

@asottile
Copy link
Member Author

asottile commented Apr 3, 2021

In GitLab by @asottile on Jan 30, 2019, 07:30

It already works for me (?) do you have a particular repository this isn't working in?

$ git diff
diff --git a/tests/unit/test_git.py b/tests/unit/test_git.py
index 24d4b5b..b78f9fb 100644
--- a/tests/unit/test_git.py
+++ b/tests/unit/test_git.py
@@ -1,4 +1,5 @@
 """Tests around functionality in the git integration."""
+import os
 import mock
 import pytest
 
diff --git a/tox.ini b/tox.ini
index 5d94e5c..111f704 100644
--- a/tox.ini
+++ b/tox.ini
@@ -172,3 +172,5 @@ max-complexity = 10
 import-order-style = google
 application-import-names = flake8
 format = ${cyan}%(path)s${reset}:${yellow_bold}%(row)d${reset}:${green_bold}%(col)d${reset}: ${red_bold}%(code)s${reset} %(text)s
+per-file-ignores:
+    tests/*: F401
$ flake8 tests
$ flake8  --version
3.7.0 (mccabe: 0.6.1, pycodestyle: 2.5.0, pyflakes: 2.1.0) CPython 3.6.7 on Linux

@asottile
Copy link
Member Author

asottile commented Apr 3, 2021

In GitLab by @qxv0 on Jan 30, 2019, 08:05

I checked and it seems to me that the problem occurs only in those files which also have another specific ignore rule.

For example if I have both tests/*: X1 and tests/a.py:Y2, then only Y2 errors are ignored on tests/a.py, but X1 errors show up which is unexpected.

@asottile
Copy link
Member Author

asottile commented Apr 3, 2021

In GitLab by @qxv0 on Jan 30, 2019, 08:08

changed title from {-Allow glob patterns in paths of --per-file-ignor-}es to {+glob patterns should not be overriden by specific --per-file-ignores rul+}es

@asottile
Copy link
Member Author

asottile commented Apr 3, 2021

In GitLab by @qxv0 on Jan 30, 2019, 08:08

changed the description

@asottile
Copy link
Member Author

asottile commented Apr 3, 2021

In GitLab by @asottile on Jan 30, 2019, 08:16

hmmm so the way it's implemented is the first rule that matches a filename is used. In your case you can write:

per-file-ignores =
    tests/a.py: Y2, X1
    tests/*: X1

@asottile
Copy link
Member Author

asottile commented Apr 3, 2021

In GitLab by @oseiberts11 on Feb 6, 2019, 02:12

I seem to have a similar problem (3.7.5). In my per-file-ignores I have (among other things but in this order):

per-file-ignores =
    machinedb/**/*.py: D101, D102, D103
    machinedb/models/inventory.py: W503

First of all, this particular glob pattern doesn't seem to work any more.

Even if I expand this to

    machinedb/*:D101,D102,D103
    machinedb/*/*:D101,D102,D103
    machinedb/models/sshca.py: W503

then I still get D* errors for sshca.py:

machinedb/models/sshca.py:108:1: D102 Missing docstring in public method

Copying D101, D102, D103 to all the more specific patterns is of course not very nice.
Additionally the glob machinedb/**/*.py is no longer working.

@asottile
Copy link
Member Author

asottile commented Apr 3, 2021

In GitLab by @sigmavirus24 on Feb 6, 2019, 04:57

@oseiberts11 so to be clear, it sounds like what you expect to happen is that

machinedb/**/*.py: D101, D102, D103

To ignore those three errors in every file matching that glob and then you seem to want to extend the list of ignores for those files for specific files, e.g., machinedb/models/inventory with the errors specified for each of those files. Am I understanding you correctly?

@asottile
Copy link
Member Author

asottile commented Apr 3, 2021

In GitLab by @oseiberts11 on Feb 6, 2019, 07:26

Exactly. That is how it seemed to work with the plugin. (I didn't write the configuration myself, I just noticed that behaviour seems to have changed).

@asottile
Copy link
Member Author

asottile commented Apr 3, 2021

In GitLab by @sigmavirus24 on Feb 6, 2019, 17:26

Right, the plugin took a different tact, honestly, that would be harder to replicate here. Basically, it iterated through everything for matches and applied all of them. We could do that, but then we're favoring exact backwards compatibility (which wasn't ever the goal) over simplicity. It would also mean that we're iterative over a list every time rather than thinking through things more simply, and it makes debugging the option harder in the future, i.e., we always prefer the most specific rule there even if multiple might match.

@asottile
Copy link
Member Author

asottile commented Apr 3, 2021

In GitLab by @oseiberts11 on Feb 7, 2019, 01:11

I suspected something like that.

I see a potential solution for people who do want exact backwards compatibility. Maybe you could make it so that if the flake8-per-file-ignores plugin is selected, then the built-in functionality is disabled, and the plugin is used instead. Probably with a warning of some sort which points out that the plugin has mostly but not exactly the same functionality. Then people who are upgrading automatically and unknowingly depend on some old behaviour (I was in that situation) are not affected by any change. When they are actively changing the config for the new situation, they will notice the change, but that is a much better time for it because they are actively fixing things anyway.

I realise that this is pretty much the opposite of how the plugin is handled now, but does that sound doable?

@asottile
Copy link
Member Author

asottile commented Apr 3, 2021

In GitLab by @sigmavirus24 on Feb 9, 2019, 05:46

The plugin's page says it's deprecated, I'd rather not encourage folks to use bit-rotting software. Is it too arduous to

    machinedb/**/*.py: D101, D102, D103
    machinedb/models/inventory.py: W503, D101, D102, D103

@asottile
Copy link
Member Author

asottile commented Apr 3, 2021

In GitLab by @oseiberts11 on Feb 15, 2019, 02:57

For this example with only 2 lines, that would work well enough. But in practice I have several patterns, and several files with additional ignores.

From what I've seen above, I understand that the way you want it, is to select a single line, without scanning several or all of them. I fear that in the general case this isn't possible. If there are several wildard patterns that match a particular file name, which one are you going to choose? Why?

And in the general case, you cannot just point to "the best pattern" without examining several of them. To give a simple example, with patterns like

a/b/c/*
a/b/*/d
a/*/c/d
*/b/c/d
a/b/*/*
a/*/c/*
*/*/*/d

and file names like a/b/c/d, a/x/c/d, a/b/x/d, the names will match multiple patterns, and/or they will have different "best" patterns. You cannot (in an obviously understandable way) say ahead of time which pattern is "best". Even if you manage to create a total ordering, for many potential file names you will need to try several patterns anyway until you find a match.

Since trying to explain to a user how this works is too complicated (and I'm not even talking about implementing it), just scanning all patterns and applying all matches is much easer to understand, and much easier to implement.

(There seems to be a bug in the implementation already, viz. that the machinedb/**/*.py pattern didn't seem to match any files at all for me. So I'm pessimistic about more complex cases.)

@asottile
Copy link
Member Author

asottile commented Apr 3, 2021

In GitLab by @asottile on Feb 15, 2019, 03:48

I don't think it's hard to explain, the first matching pattern wins right?

@asottile
Copy link
Member Author

asottile commented Apr 3, 2021

In GitLab by @asottile on Feb 15, 2019, 03:57

Also regarding the ** glob -- that's because flake8 uses fnmatch globs and not glob globs, so you'd want just machinedb/*.py

@asottile
Copy link
Member Author

asottile commented Apr 3, 2021

In GitLab by @sigmavirus24 on Feb 15, 2019, 04:15

It's the last match sorted alphabetically

https://gitlab.com/pycqa/flake8/blob/dc7b082b96751c8ef7bfc616f8701d432afa7ef7/src/flake8/style_guide.py#L369-374

So it's always consistent and reproducible.

If we wanted to throw people under the bus and give them a foot-gun at the same time, we could pretty easily just apply every thing, but then things they didn't intend to match end up hiding problems from them they didn't want. This is precisely why I had no original intention to support globbing. It introduces far too much unnecessary complexity and no one will be happy no matter how it's implemented.

@asottile
Copy link
Member Author

asottile commented Apr 3, 2021

In GitLab by @bob.hyman on May 2, 2020, 16:34

First let's document this, to help save the next developer a day lost to learning what to expect. Here's a draft, please comment:

(In --per-file-ignores):

If multiple patterns match a given file, a single pattern is chosen and the ignores from just that rule are applied to the file. The pattern chosen is the alphabetically last pattern which matches the file. This is often the most specific pattern that matched.

Is this correct?

ansible-zuul bot pushed a commit to ansible/ansible-navigator that referenced this issue Jan 25, 2022
…as reported with `lint-vetting` (#792)

Disable warnings related to the use of assert in the tests directory as reported with `lint-vetting`

This adds a few additional files to the per-file-ignore list within the flake8 configuration with an exemption for S101 which is reported by flake8-bandit as part of the wemake style guide which is being vetted.
The list was also alphabetized to ensure new entries were placed in a sane location.
A note was added to the top of the flake8 per-file-ignores indicating that the glob for the tests folder ignoring s101 can be added later. Currently there is a conflict between globs and per-file entries.  See PyCQA/flake8#670
This eliminates the S101 entry from the output of tox -e lint-vetting so users of it can see messages only related to the changes they are making and not for files that were previously added to the repository:
1     C812 missing trailing comma
1     RST304 Unknown interpreted text role "mod".
- 13    S101 Use of assert detected. The enclosed code will be removed when compiling to optimised byte code.
1     S404 Consider possible security implications associated with the subprocess module.
1     S603 subprocess call - check for execution of untrusted input.
2     S604 Function call with shell=True parameter identified, possible security issue.
2     WPS111 Found too short name: p < 2
2     WPS115 Found upper-case constant in a class: KEGEX
3     WPS201 Found module with too many imports: 13 > 12
6     WPS210 Found too many local variables: 6 > 5
3     WPS220 Found too deep nesting: 24 > 20
2     WPS221 Found line with high Jones Complexity: 16 > 14
3     WPS226 Found string literal over-use: test_option > 3
9     WPS305 Found `f` string
1     WPS316 Found context manager with too many assignments
6     WPS317 Found incorrect multi-line parameters
2     WPS323 Found `%` string formatting
4     WPS324 Found inconsistent `return` statement
2     WPS414 Found incorrect unpacking target
1     WPS430 Found nested function: getcwd
14    WPS436 Found protected module import: _curses
3     WPS450 Found protected object import: _CursesWindow
1     WPS602 Found using `@staticmethod`
Related #713

Reviewed-by: Sviatoslav Sydorenko <webknjaz+github/[email protected]>
Reviewed-by: Bradley A. Thornton <[email protected]>
Reviewed-by: None <None>
ssbarnea pushed a commit to ssbarnea/ansible-navigator that referenced this issue Feb 12, 2022
…as reported with `lint-vetting` (ansible#792)

Disable warnings related to the use of assert in the tests directory as reported with `lint-vetting`

This adds a few additional files to the per-file-ignore list within the flake8 configuration with an exemption for S101 which is reported by flake8-bandit as part of the wemake style guide which is being vetted.
The list was also alphabetized to ensure new entries were placed in a sane location.
A note was added to the top of the flake8 per-file-ignores indicating that the glob for the tests folder ignoring s101 can be added later. Currently there is a conflict between globs and per-file entries.  See PyCQA/flake8#670
This eliminates the S101 entry from the output of tox -e lint-vetting so users of it can see messages only related to the changes they are making and not for files that were previously added to the repository:
1     C812 missing trailing comma
1     RST304 Unknown interpreted text role "mod".
- 13    S101 Use of assert detected. The enclosed code will be removed when compiling to optimised byte code.
1     S404 Consider possible security implications associated with the subprocess module.
1     S603 subprocess call - check for execution of untrusted input.
2     S604 Function call with shell=True parameter identified, possible security issue.
2     WPS111 Found too short name: p < 2
2     WPS115 Found upper-case constant in a class: KEGEX
3     WPS201 Found module with too many imports: 13 > 12
6     WPS210 Found too many local variables: 6 > 5
3     WPS220 Found too deep nesting: 24 > 20
2     WPS221 Found line with high Jones Complexity: 16 > 14
3     WPS226 Found string literal over-use: test_option > 3
9     WPS305 Found `f` string
1     WPS316 Found context manager with too many assignments
6     WPS317 Found incorrect multi-line parameters
2     WPS323 Found `%` string formatting
4     WPS324 Found inconsistent `return` statement
2     WPS414 Found incorrect unpacking target
1     WPS430 Found nested function: getcwd
14    WPS436 Found protected module import: _curses
3     WPS450 Found protected object import: _CursesWindow
1     WPS602 Found using `@staticmethod`
Related ansible#713

Reviewed-by: Sviatoslav Sydorenko <webknjaz+github/[email protected]>
Reviewed-by: Bradley A. Thornton <[email protected]>
Reviewed-by: None <None>
@ThiefMaster
Copy link

I'd really like to see merging of the excludes from multiple patterns that match.

Currently I have this:

per-file-ignores =
    # allow nicely aligned parametrizations
    indico/*_test.py:E241
    # models with hybrids
    indico/*/models/*.py:N805

so in any unit test files I want to disable E241, and in any models I want to disable N805 (due to sqlalchemy hybrid property expression classmethods). Guess where one of the tests that need E241 disabled is...

@stdedos
Copy link
Contributor

stdedos commented Aug 23, 2023

Quoting from #670 (comment):

(In --per-file-ignores):

If multiple patterns match a given file, a single pattern is chosen and the ignores from just that rule are applied to the file. The pattern chosen is the alphabetically last pattern which matches the file. This is often the most specific pattern that matched.

I'd also like to see this #670 (comment) too in the documentation:

Also regarding the ** glob -- that's because flake8 uses fnmatch globs and not glob globs, so you'd want just machinedb/*.py

I didn't loose a day, but a good two hours for sure. Especially given that the first result in per-file-ignores is https://pypi.org/project/flake8-per-file-ignores/ (which paints a different image).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants