Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the files option #7496

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Conversation

DanielNoord
Copy link
Collaborator

@DanielNoord DanielNoord commented Sep 19, 2022

  • Write a good description on what the PR does.
  • Create a news fragment with towncrier create <IssueNumber>.<type> which will be
    included in the changelog. <type> can be one of: new_check, removed_check, extension,
    false_positive, false_negative, bugfix, other, internal. If necessary you can write
    details or offer examples on how the new change is supposed to work.
  • If you used multiple emails or multiple names when contributing, add your mails
    and preferred name in script/.contributors_aliases.json

Type of Changes

Type
✨ New feature

Description

EDIT: Doesn't close ... #5701, see discussion down below.

The change itself turned out to be very minimal. We probably need some testing of this though as I might have disregarded edge-cases. A beta for 2.16 would be good.

@DanielNoord DanielNoord added the Enhancement ✨ Improvement to a component label Sep 19, 2022
@DanielNoord DanielNoord added this to the 2.16.0 milestone Sep 19, 2022
@coveralls
Copy link

coveralls commented Sep 19, 2022

Pull Request Test Coverage Report for Build 3787353582

  • 6 of 6 (100.0%) changed or added relevant lines in 2 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.0002%) to 95.447%

Totals Coverage Status
Change from base Build 3787323093: 0.0002%
Covered Lines: 17673
Relevant Lines: 18516

💛 - Coveralls

@github-actions

This comment has been minimized.

Copy link
Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like this, we're adding complexity and yet another option on top of something that should be simple with no option. We have to think of ease of use for new users too

@DanielNoord
Copy link
Collaborator Author

Hm, I don't think this adds too much complexity? Other tools work similarly. For example, mypy and pytest.

For normal users nothing changes as they can keep using the CLI like they used to. We're just exposing the internal aggregation of all left over arguments into a files option to the users to set directly.

@Pierre-Sassoulas
Copy link
Member

We saw when migrating the configuration to pyproject.toml that we have a lot more verbosity than other tool. Being configurable is a strength. But sometime there's a clear choice that we need to make to make the tool better especially when we choose what the default should be. The result of the vote on #5701 is currently more than 5 against one in favor of having a default value that permit to do pylint without adding a . (and probably also without specifying a new option in the pylintrc which is harder than adding a dot).

@DanielNoord
Copy link
Collaborator Author

DanielNoord commented Sep 22, 2022

I understand that the vote is in favour of the other option, but in this case I think as maintainers we should make a different choice than our users:

  1. This is the only way to do this in a backwards compatible way
  2. Other tools in de Python ecosystem behave similarly
  3. It offers additional benefits over making pylint > pylint .. You can easily exclude documentation folders, your setup.py and other configuration files. If we automatically lint the source directory you would need to add additional ignores to counteract that.
  4. Edit and added later: See Should pylint be equivalent to pylint . ? #5701 (comment). We also don't need to worry about excluding certain directories. That's just up to the user and their files option.

The option (imo) is not new, it was only unexposed and for the cost of having to set it once users can get all the benefits of the original proposal but sooner and without having to add additional ignores.

@Pierre-Sassoulas
Copy link
Member

This is the only way to do this in a backwards compatible way

There's breaking change and "breaking change". Changing the default value for pylint only remove the help message when calling pylint. This is not going to break millions of pipelines. It's a usability issue. We're doing it as a breaking change in 3.0 because we're doing things properly but in reality it's not that much of an issue.

Other tools in de Python ecosystem behave similarly

Some other use the current directory as a sensible default, this is an argument we had in #5701 and the issue was here to settle it. Beside "some tool does it" is not an argument in itself, what's the underlying reason to do it ? I think if a sensible default is possible then the sensible default should be used.

It offers additional benefits over making pylint > pylint .. You can easily exclude documentation folders, your setup.py and other configuration files. If we automatically lint the source directory you would need to add additional ignores to counteract that.

There's a million way to do that if you're the kind of person that have a pylintrc and that actually read the doc to find this option. (pre-commit exclude, not using the recursive option, find piped in pylint, git ls specifying each files, using ignore/ignore pattern... etc). We could even add the files option to do it as long as it's not required so that pylint lint everything recursively. But the default value need to be sensible, having a wall of text covering multiple screens when you launch the software for the first time and everyone copy pasting pylint $(git ls-files '*.py') or find . -type f -name "*.py" | xargs pylint everywhere is not a good user experience at all, and users actually agree with this 5 to 1.

Edit and added later: See
#5701 (comment). We also don't need to worry about excluding certain directories. That's just up to the user and their files option.

#6471 was fixed, it's really easy to ignore venvs now.

@DanielNoord
Copy link
Collaborator Author

DanielNoord commented Sep 23, 2022

There's breaking change and "breaking change". Changing the default value for pylint only remove the help message when calling pylint. This is not going to break millions of pipelines. It's a usability issue. We're doing it as a breaking change in 3.0 because we're doing things properly but in reality it's not that much of an issue.

It will break any documentation/guide that says you can use pylint to get all options. We have always considered documentation in our backwards incompatible changes as well.

Some other use the current directory as a sensible default, this is an argument we had in #5701 and the issue was here to settle it. Beside "some tool does it" is not an argument in itself, what's the underlying reason to do it ? I think if a sensible default is possible then the sensible default should be used.

Yeah, but that issue has given almost no arguments for doing so in response to my arguments against it. The only one who actually engaged with me showed how easy it would be to use this proposal. Another comment showed how the user is linting 'backend/*.py' which won't be solved by linting . but could be solved by files=. And the other comments are about other issues or ways around current behaviour.

Nobody has responded to my argument that pylint shouldn't try to guess which files it should and should not lint and instead expose that internal decision very easily to its users. (Imo) the size of this PR shows that we don't add any new complexity: the complexity is already there but we didn't allow the user to interact with it. If anything it might also be easier to debug #7003 and similar issues as we would have an actual config option to debug, instead of a list of strings that is getting passed around by various functions.

But the default value need to be sensible

And I believe it is not sensible to have pylint guess which files it should and should not lint. For example, I don't think linting setup.py or doc/conf.py is sensible, but I also don't think we should exclude them by default. The tool shouldn't try and make such decisions, that's up to the user.

Some other use the current directory as a sensible default,

Which tools do? I checked in our own pre-commit and autoflake, pyupgrade, isort, black, flake8, rstcheck, mypy and pydocstringformatter all require a files option. There is no tool in our pre-commit that doesn't.
Why would we break with that pattern? They probably had similar reasons to have this be their default behaviour.

#6471 was fixed, it's really easy to ignore venvs now.

Yeah, but my argument is that doing 1) pylint + having to set your ignores or 2) setting files has the same effect in terms of effort for the user. Making us default to the current directory doesn't solve anything for that use case.

@Pierre-Sassoulas
Copy link
Member

  1. pylint + having to set your ignores or 2) setting files has the same effect in terms of effort for the user. Making us default to the current directory doesn't solve anything for that use case.

I'm not against having a files option, it would be a good addition. What I'm against not setting the default value to current directory in 3.0 and reverting a vote with a clear result done over a long period of time and 40+ vote.

it will break any documentation/guide that says you can use pylint to get all options.

Using --help or -h is pretty ubiquitous in every CLI tool under the sun, this is not such a problematic breaking change.

Yeah, but that issue has given almost no arguments for doing so in response to my arguments against it.

You can't expect everyone to argue about it, they took the time to vote though. I did not want to argue about it for hours which is why I opened #5701 in the first place. I don't think what we should go against the result when what people want is that clear. (even if I wasn't strongly in favor of defaulting to the current directory myself).

If you want I can answer in the original issue for more visibility but:

  1. It's not that large a breaking change, we're talking about the help being displayed. A massive help, nothing like the small "let's get you started" help of other tool:
Usage: black [OPTIONS] SRC ...

One of 'SRC' or 'code' is required.
  1. We can make the assumption that the user want to lint the current directory without being too wrong (as per popular demand)
  2. Settings the "files" option or . --recursive y is not done once, it's done for every invocation of pylint in a project or directory that is not already configured, every time someone want to use pylint somewhere else than the root directory on project that are already configured (until we release the MR searching for configuration file in the root directory).

And I believe it is not sensible to have pylint guess which files it should and should not lint. For example, I don't think linting setup.py or doc/conf.py is sensible, but I also don't think we should exclude them by default. The tool shouldn't try and make such decisions, that's up to the user.

I don't think we should guess. If the user do not specify anything it means they want to lint everything recursively in the current directory. This is the overwhelming result of the vote on the issue when we ask about it. This is apparent in the maybe 20+ duplicate issues that complain that pylint "misses" files because there's no __init__.py in a module or whatever (recursive search was one of the most up-voted issue for years). If users want to ignore setup.py or doc/conf.py they can ignore it there's multiple option for that.

Which tools do? I checked in our own pre-commit and autoflake, pyupgrade, isort, black, flake8, rstcheck, mypy and pydocstringformatter all require a files option. There is no tool in our pre-commit that doesn't.
Why would we break with that pattern? They probably had similar reasons to have this be their default behaviour.

flake8 (the tool the closest to pylint), pytest, pyright, ... I'm not going to search for all the tools. Small tool like pwd or ls take the current directory as default as I said in the original issue.

@DanielNoord
Copy link
Collaborator Author

You can't expect everyone to argue about it, they took the time to vote though.

I would expect at least somebody to respond.

  1. We can make the assumption that the user want to lint the current directory without being too wrong (as per popular demand)

I understand it gives some ease of use to users, but I think we can also agree that a passerby user doesn't consider all the edge cases that could be introduced with behaviour that works fine for them. That's where we as maintainers have a role to (sometimes) deviate from popular demand.

  1. Settings the "files" option or . --recursive y is not done once, it's done for every invocation of pylint in a project or directory that is not already configured, every time someone want to use pylint somewhere else than the root directory on project that are already configured (until we release the MR searching for configuration file in the root directory).

I don't understand your point here. Are you suggesting pylint shouldn't be pylint . but pylint . --recursive=y? If so, my original point still stands: why guess for users what they want and force them to put doc/conf.py and setup.py in their ignores list if we could also give them one setting that takes care of it all?

I don't think we should guess. If the user do not specify anything it means they want to lint everything recursively in the current directory. This is the overwhelming result of the vote on the issue when we ask about it. This is apparent in the maybe 20+ duplicate issues that complain that pylint "misses" files because there's no __init__.py in a module or whatever (recursive search was one of the most up-voted issue for years). If users want to ignore setup.py or doc/conf.py they can ignore it there's multiple option for that.

I made this point in a previous comment/paragraph as well: the fact that some users express their desire for something isn't a 100% justification to do something. If it was, we would integrate pylint-django, pylint-pydantic and every other plugin into our core library? I mean if I were to ask: "Would you prefer not having to install any plugins but have them run automatically?" why would anybody vote no. The point I'm making is that in this case we should consider the potential edge cases and frustration the popular demand could cause and if there is a way to satisfy the popular demand without too much hassle. I believe this option does that.

flake8 (the tool the closest to pylint), pytest, pyright, ... I'm not going to search for all the tools. Small tool like pwd or ls take the current directory as default as I said in the original issue.

As far as I can see running flake8 directly in our source directory just goes into an indefinite hang but that could be my own environment messing something up. pytest is indeed the only tool that does this consistently.
Then again, all recently developed tools don't. I think it is a good thing to follow patterns established by tools that are widely used. I'm not sure it is a good metric but both black and mypy have more stars than pylint and flake8 combined. It doesn't seem weird to follow the patterns they establish.

@Pierre-Sassoulas
Copy link
Member

I don't understand your point here. Are you suggesting pylint shouldn't be pylint . but pylint . --recursive=y?

Yes, because most of the time the current directory is not a package and we would have an error otherwise.

As far as I can see running flake8 directly in our source directory just goes into an indefinite hang but that could be my own environment messing something up.

You probably have a venv with a lot of code in it, switch to a directory with less code.

I'm not sure it is a good metric but both black and mypy have more stars than pylint and flake8 combined. It doesn't seem weird to follow the patterns they establish.

black is not even comparable, I wouldn't even think of adding pylint or flake8 without the code being first auto-formatted with isort/black.

mypy being used more means it does something better than pylint and flake8. But it does not mean they do everything better in every possible aspect. What makes mypy more successful is not "having a files options", but if I had to guess proper control flow, being maintained by Guido himself, having 100 times less false positives as a result, fixing a fundamental need in a dynamic language, and being less opinionated as its warning are about error only not about warning convention, and refactor.

passerby user doesn't consider all the edge cases that could be introduced with behaviour that works fine for them That's where we as maintainers have a role to (sometimes) deviate from popular demand.

I'm a maintainer and I don't see any worrying edge cases. I also don't like having to type . --recursive y or create a conf every time I want to casually launch pylint. I don't think we should be dogmatic about "breaking changes" to the detriment of usability.

If you have a full packaging environment with setup.py doc/conf.py and a virtualenv it means you're a power user and you're going to be able to use ignore/files or a thousand other way to specify what should be linted. You're also probably using a script to launch pylint in continuous integration or whatever and never typing anything thus you're not affected by the default value. Defaulting to the current directory help when you're a beginner trying pylint on your small scripts or someone who want to lint a particular directory inside a git repository where the pylintrc is not available. pylint adoption is hurt by the fact that it is a pain in the ass to use needs a full configuration before being even usable and being so configurable you don't even know where to start (which is a critic that is everywhere when talking about pylint). Having sensible default and usability improvement is a way to be used more.

Users expect to have this value as default, we already disagreed about this previously so we launched a vote and now we have a result. It's also not that big of a deal, can we move on and focus on something impactful please ?

@DanielNoord
Copy link
Collaborator Author

Do we still want to add this option irrespective of #5701? For power users it might make sense?

@Pierre-Sassoulas Pierre-Sassoulas added the Needs review 🔍 Needs to be reviewed by one or multiple more persons label Sep 29, 2022
Copy link
Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure ! The change itself look almost "too simple" 😄

doc/whatsnew/fragments/5701.other Outdated Show resolved Hide resolved
@DanielNoord
Copy link
Collaborator Author

I think that was actually the only change necessary. Rest seems like it can be merged as is?

@github-actions

This comment has been minimized.

Copy link
Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I keep that in mind but I think there should be a lot more tests (probably functional config tests too). Current code looks good but I have to think about what could go wrong and use cases which would take time that I don't have right now.

@github-actions
Copy link
Contributor

🤖 According to the primer, this change has no effect on the checked open source code. 🤖🎉

This comment was generated for commit 75fdce0

@Pierre-Sassoulas Pierre-Sassoulas modified the milestones: 2.16.0, 2.17.0 Jan 8, 2023
@Pierre-Sassoulas
Copy link
Member

I've let this go stale, I'm very sorry. I wonder how we can make a simple coherent system alongside the globbing added in #8290 / #8312. I have the feeling this is all other the place and designing is necessary (but I haven't thought about the problem properly.) What do you think ?

@DanielNoord
Copy link
Collaborator Author

I feel like this is worth it either way? I don't really see how this is connected although I could change this config to use a glob instead of csv. Do you want me to do that?

Copy link
Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think glob is nicer than csv. But mainly I think we have a ton of option to select/unselect and I basically agree with #8312 (comment), we should refactor / simplify this part of the documentation and make sure options are consistent together.

@DanielNoord
Copy link
Collaborator Author

I think glob is nicer than csv. But mainly I think we have a ton of option to select/unselect and I basically agree with #8312 (comment), we should refactor / simplify this part of the documentation and make sure options are consistent together.

Hmm, I just remembered that this can't use glob as this is also being used for all current usages. Basically, this PR allows us to fix longer standing issues that all derive from the fact that we don't have a real files option.

Perhaps it is indeed better to fix this in 3.0 with a more general refactor of the way to supply file names.
I'll have a look if I can make a proposal for it.

@Pierre-Sassoulas
Copy link
Member

Yeah, a lot of problem are due to the unclear way to select file for pylint. Things like #8319 ? pylint $(git ls-files '*.py') should not be different than pylint .--recursive=y, right ? I agree that making the minimal options that handle all reasonable use cases the same way and remove all the concurrent options that accumulated over the years sounds really helpful in the long term. Maybe we can take exactly what ruff flake8 black or mypy do ? mypy is MIT and python we could even take their actual code directly 😅

@Pierre-Sassoulas Pierre-Sassoulas modified the milestones: 2.17.0, 3.0.0 Mar 7, 2023
@DanielNoord
Copy link
Collaborator Author

@Pierre-Sassoulas This recently came up again: we don't have a files option and thus don't display any error or help message about it.

Should this go in 3.0? And if so, how do we want to approach this issue? It's quite hard to figure out a good definition of "files" that is backwards compatible with how users can do it until now.

@Pierre-Sassoulas
Copy link
Member

I think having a default to the current directory will fix the issue of the wall of help text when giving no arguments at all. Regarding the files options itself we could add it, but imo we need to look at how it's done in other tools (Ruff in particular), and do something consistent if we break or add options to discover files.

@Pierre-Sassoulas
Copy link
Member

I want to unblock this but I don't know what's blocking it anymore ?

@DanielNoord
Copy link
Collaborator Author

I'm not sure what you meant with the last comment. Is this approach correct or are you looking to do something different?

Copy link
Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we definitely need to have an explicit destination for the currently nebulous/positional file args (right now it's hard to even know what is parsed and where to add a default value). I think linter.config.files always containing the files to process is the main benefit here. And being able to define the files in the conf is an added feature on top of it.

@DanielNoord
Copy link
Collaborator Author

Yes, we definitely need to have an explicit destination for the currently nebulous/positional file args (right now it's hard to even know what is parsed and where to add a default value). I think linter.config.files always containing the files to process is the main benefit here. And being able to define the files in the conf is an added feature on top of it.

Okay, but the tests passed previously. Is there anything missing from this other than a rebase?

@Pierre-Sassoulas
Copy link
Member

Okay, but the tests passed previously. Is there anything missing from this other than a rebase?

I didn't see any blocker thus my searching for what's blocking 😄 Let's rebase first.

Reading the previous review (#7496 (review)), I don't think we had a reflection concerning csv vs globbing vs regex and what we have in pylint currently to make it consistent internally and also taking inspiration from how other tools like mypy/ruff are doing it afaik. What's the rational for the current design ? To be fair I think those options (regex/glob/csv) have grown in pylint each time someone asked to be able to do something -- organically. I wouldn't be able to tell you what is pylint doctrine for discovering files or filtering/affecting files with options. Should we offer all three methods in every options ? What type of file discovery should files implement ? In CLI I think it's intuitive that it's bash regexes that we need, but not so much in a configuration file. Should we make those consistent in 4.0 ? I'm making those reviews at the end of long days and really rarely have the time to do in depth / consistency design research. (I've not even re-read the whole thread to get back into the context this time).

@jacobtylerwalls jacobtylerwalls modified the milestones: 3.1.0, 3.2.0 Feb 24, 2024
@jacobtylerwalls jacobtylerwalls modified the milestones: 3.2.0, 3.3.0 May 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement ✨ Improvement to a component Needs review 🔍 Needs to be reviewed by one or multiple more persons
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants