-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a section for Team Standards in the team compass #454
Comments
Adopting auto-format PRs is often very disruptive to open PRs and ~never a pressing issue, so I think it's a good idea to be circumspect any time we adopt a new tool, and try to prioritize open PR review whenever such a PR is likely to cause lots of conflicts I like both having standards and consistency and being informal and letting subprojects organically develop their own practices. Most jupyterhub repos are on a pretty thin maintenance resources right now, and 'just do what we do everywhere else' is often the lightest burden. I like having a documented central place for what we've generally agreed upon makes sense, and the proposed collection sounds good. I like the 'trial on one repo, then spread if we like it' approach. I think that matches positive experiences we've had so far with e.g. adopting pre-commit itself. A question, though: is it better to have a 'default' pre-commit-config.yaml that we ship with the docs, or should we have e.g. the file in jupyterhub/jupyterhub be the 'common' starting point? It might be nice to not maintain a duplicate, but not everyone has developed the knowledge to necessarily recognize what parts of jupyterhub/jupyterhub's config are generic and which are peculiar. One more thing to add here might be publishing releases from GHA. |
I like @minrk's general approach of "both having standards and consistency and being informal and letting subprojects organically develop their own practices". I'd prefer pointing to example pre-commit configs in one or more existing repos as we can guarantee they're regularly tested, whereas anything embedded ina doc is liable to go out of date. I'd still prefer the configs to be suggestions rather than mandatory though. If you're converting an existing large repo you may want to be more lenient with things like a flake8 config due to the amount of manual work invovled in fixing all warnings, whereas if it's a new repo or you've already invested time in applying strict coding standards it'd be a shame to apply a lax config and for the code quality to drift, or to throw away other pre-commit hooks such as mypy. From a developers point of view I don't think it makes much difference if different repos have different pre-commit configs. They all use a common tool so you just blindly run
There's are two related issues: |
I agree w/ both of your takes that we don't want a heavy hand here - my goal is to create guidelines and make it clear when we don't need to have lots of discussion before moving forward. Also to provide some general encouragement for these practices by documenting them. (I also think it probably matters more for the "core repositories" than for the ones that are not as heavily used) I'm thinking of scenarios like:
And scenarios like:
|
Yeah, I think that's a good level to do it at, and write down "we're pretty sold on pre-commit, black, correctness-level flake8, etc." and "we're still trying out pyupgrade" |
Ok! How does this sound: JupyterHub projects should take advantage of pre-commit to run autoformatters and code linters. This helps to maintain code standards, and improves maintainability. The following pre-commit hooks can be added to any repo when convenient, though please communicate with anyone who has an open pull request if it will lead to major conflicts:
Use your judgement as to whether to apply a strict or lenient configuration for linting code. See REPO-A and REPO-B for example configurations. Other linters, autoformatters and tools can be added to other repos on an ad-hoc basis if it's not too disruptive- this is a good way to try out new tools. In general big changes should not be made to high profile repos without prior discussion. If the new tool is useful across the organisation please propose it in a new team-compass GitHub issue, outlining the advantages and disadvantages. |
That looks nice to me, thanks @manics for writing it up :-) |
❤️ thanks for processing this! I'm very happy about a standards section in the team compass to suggest additions to. Prettier: markdown, yaml, json, js, html - it does a lot!Note that Also note one may consider running prettier specifically for YAML and then once agains specifically for JS etc, because perhaps YAML indentation should be 2 spaces, while JS should be 4. Where to locate .pre-commit-config.yaml examples?I'm positive to @manics idea about referencing a repo, but I note that there will be a need to reference multiple repo's as no single repo contains all relevant hooks for all code bases. Due to that, I'm overall also supporting the idea of having a team-compass maintained config file that can be copy-pasted, even though it absolutely will add some maintenance burden. I consider it to be a quite even trade off, so I'm fine with either referencing 2-3 repo's for example configs or adding a dedicated example config in team-compass. Misc thingsI also like these misc hooks, of which end-of-file-fixer and requirements-txt-fixer are autoformatters. - repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.0.1
hooks:
- id: end-of-file-fixer
- id: check-executables-have-shebangs
- id: requirements-txt-fixer |
You can have different configs for different file extensions in your prettier config file, it's not necessary to run it multiple times: https://prettier.io/docs/en/configuration.html However I'd probably stick with the prettier defaults unless there's a strong reason not to, e.g. I find JavaScript tends to be more nested than some other languages so 2 spaces is better 😄. It looks like https://github.com/jupyterlab/extension-cookiecutter-ts uses 2 spaces. I don't think we need to get too proscriptive now about the configuration here though- we'll probably have to iterate as we modify more repos. |
I like consistent code and file formatting as it makes life easier when reading. I believe (:D) in the two lines about consistency from the "Zen of Python". I think the time to introduce and enforce consistency/tooling is when a project is young. Once a project gets older, has more users and more developers it becomes harder and more disruptive to do so. This means I value local consistency over global consistency. As an example: if all the code around where I am editing something uses single quotes for strings, use single quotes (even if I think double quotes are the right thing to do). The cost of reformatting all the quotes or having a mix is much higher than the benefit of improved readability. If a repo uses tabs for indentation, use tabs when sending PRs. Even if the rest of the GH organisation uses spaces. And so on. This means that I think it is useful to have a recommended standard set of things to add to new repositories, especially because it helps with questions like "so how do I setup pre-commit black again?". For projects with more users and developers I think retroactively introducing rules that are strictly enforced (by a tool) is rarely a good use of our small amount of developer time. These changes are often disruptive, tricky and need a lot of attention to detail to get right. In addition mature and popular projects have kinda proven that despite their inconsistent formatting/mix of code styles/etc they are useful to a large group of users. Otherwise they would not have become large and mature. For me this means more consistency/tooling slide towards a "nice to have" for more mature projects. TL;DR: having a shared, minimal set of tooling to help us quickly setup new projects and increase consistency across repos is a good thing. Applying them retroactively to existing, large projects is a much more difficult task, so I would not support automatically/by default/without discussion adding what we recommend for new repos to these repos. |
Can you go into more detail how it is a lot of work to apply standards via pre-commit to pre existing codebases ? The only thing I can think of is that it would introduce conflicts for pre-existing PRs (and I agree there we should go through extra efforts to get PRs merged or closed before we applied anything like this). Other than that, it'd just be a single PR that:
Then we merge quickly with minimal review before other PRs get opened. Personally, that doesn't seem too bad, as long as there isn't some long-standing PR that you really don't want to close, but that you don't expect to be merged soon. Note: I think that this does become more complex when the "standard libraries" you adopt are themselves more complex or niche. But if it were just "black, flake8, and isort" or something, I feel like it wouldn't be too bad. Pyupgrade feels more niche to me (though as others have noted it often results in a no-op diff) |
It is more work because in a new project it is nearly no work :) Looking at and deciding about open PRs, installing the extra CI step, maintaining the extra CI step, updating the docs to reflect a mixture of the possibly existing style/contribution guide and the new one that involves the pre-commit tools, deal with vendored code that should/shouldn't be reformatted, fixing things that the tools flag but can't automatically fix, adding exemptions from flake8 rules, "training" the contributors to write so that the tools are happy (I think it is a bug if the bot has to come and fix up stuff for you on every PR, would suggest to me that it is somehow "unnatural" to code the way the bot wants you to code or that the style changes aren't perceived as providing value), figure out why on different contributors computers the tools produce different results (yes I just spent far too much time with a team that chased prettier config settings ...), etc. Each one isn't a big step or a lot of work but they all need doing and they can be a bit tedious. Most of these don't apply if you are starting a new project because a lot of legacy stuff doesn't exist. Side-thought: I think you could deal with all the existing PRs by producing a short bit of instructions for them on how to run the pre-commit tooling in their PR. I think after you apply the pre-commit tools to "main" and then also in the PRs ,you should not have any more (formatting related) conflicts and only see the actual changes of that PR in the PR's diff. But that is just an idea, I've not tested it. Would be nice to save on effort dealing with existing PRs. |
Working autoformatters have only really come about in the past few years. Before that code style was enforced by whoever reviewed the PR, so some people are quite lax, others are strict, but since it was the only option everyone just dealt with it. You're right that adding pre-commit to existing repositories involves work on our part, and you've correctly pointed out several potential problems. However it's hopefully obvious that several of us are willing to do that work, and my suggestion above of Use your judgement as to whether to apply a strict or lenient configuration for linting code is specifically aimed at this, we can choose what level of strictness to enforce. In a commercial context there's a cost-benefit trade off, but that doesn't really apply here since everyone is pretty much free to choose what they want to work on, so I don't see this additional work as a problem. Once it's added to CI there should be no or minimal maintenance. The pre-commit config file pins exact versions of tools, so they should behave identically across all systems, if they don't it's a bug. We can choose to bump tool versions or not. That leaves the problem of ensuring contributors follow the style and lint rules. With an autoformatter it's easy: It's also worth bearing in mind that many developers use IDEs that automatically format code, I've seen a few JupyterHub PRs where a whole file has been autoformatted. As you expect in those cases we've had to ask for just the relevant changes to be pulled out, but I think it's fair to say autoformatters are pretty standard these days. |
Some additional wording added to @manics (thanks!) suggested text: JupyterHub projects should take advantage of pre-commit and tools to help maintain consistent formatting within a repo to improve overall code quality, review efficiency, and readability of code. Sensible use of pre-commit to run formatting tools and code linters can add consistency and improve maintainability. Preferred tools and pre-commit hooksThe team has found the following tools and their pre-commit hooks to be useful. The following pre-commit hooks can be added to any repo when convenient, though please communicate with anyone who has an open pull request if it will lead to major conflicts:
Applying to reposWhen creating a new repo, please use any pre-commit hooks and tools that are useful. When working with an existing repo, please balance the benefits of adding a tool or pre-commit hook with considerations such as 1) the amount of code churn, 2) how it will improve code maintainability, 3) the time it may add to CI runs. Configuration of a toolUse your judgment as to whether to apply a strict or lenient configuration for linting code. See REPO-A and REPO-B for example configurations. Other linters, autoformatters and tools can be added to other repos on an ad-hoc basis if it's not too disruptive- this is a good way to try out new tools. In general, big changes should not be made to high profile repos without prior discussion. Proposing organization-wide use of a toolIf the new tool is useful across the organisation please propose it in a new team-compass GitHub issue, outlining the advantages and disadvantages. |
That looks great, thanks @willingc!
I think it's a little more complicated than that, because it matters how/when you apply those changes. I think you'll still get conflicts if you try to rebase a PR, even if it's a single commit that's been autoformatted. However, applying autoformatting and then merging from main ought to have fewer issues. Probably not zero, though. The way to avoid conflicts, I think, is to 'rebase' manually, without the actual rebase or merge commands, but this is a pretty git-expert-level sequence of operations:
This will always reduce a PR to just one new commit, but it's impossible for there to be merge conflicts with this process, since git isn't making any decisions about how to reapply changes. I feel safe doing this kind of history rewriting because I can always dig through the |
Proposed change
Following a conversation in #453 - I think it'd be helpful if we defined some "official team standards" for the repositories under the
jupyterhub/
org. We've had a number of conversations about things like black, pyupgrade, pre-commit, documentation conventions, etc. Rather than discussing on a case-by-case and informal basis, how about we define a few basic things that we "approve of" as a community-wide practice.These standards should be relatively lightweight and applicable in most repositories. We could host them in a "Team Standards" section in the team guides part of our documentation. Anything that was listed there is "pre-approved" for implementation in one of the repositories. If something isn't there and a person wants to implement it (especially for a core repository like
jupyterhub/
, then they should open an issue to discuss.Alternative options
The main alternative is to keep our current informal process, which isn't working too badly, though I've noticed that it can lead to some confusion about what kinds of PRs will be "easy to merge" when it comes to things like adding new pre-commit hooks.
Who would use this feature?
The main beneficiary is developers in our repositories. I think it could have a few benefits:
(Optional): Suggest a solution
I think that we should try to keep the "community-wide standards" relatively lightweight and restrict it as much as possible to things that can be automated. So something like:
pre-commit
to enforce standards (to start, we can use the hooks in the jupyterhub repo minuxpyupgrade
as accepted standards)black
+flake8
+reorder-imports
for Python codeprettier
for JS/TS code(I might be missing something, this is just top-of-my-head)
And an example of an issue that wishes to extend these standards would be #453 or #379. So the conversations there might play out like "first try it out on a single repository (like how JupyterHub already uses
pyupgrade
, and then after 2 months decide if we want to adopt it broadly).The text was updated successfully, but these errors were encountered: