Consistent python style, linting and formatting for `scripts` folder #245

maltekuehl · 2021-12-18T13:40:28Z

Currently the python scripts in the ./scripts/ folder do not seem to strictly follow any linting, formatting, typing or style guide, docstrings are often missing and no configuration files to ensure code style consistency across IDEs seem to be present.

To enable consistent collaboration, it might perhaps be of interest to setup:

Linting, e.g. with flake8 or pylint
Formatting, e.g. wtih black
Import order sorting, e.g. with isort
Enforcing of docstrings, e.g. with flake8-docstrings and Google Style Python Docstrings
Enforcing consistent styles before commit with a pre-commit hook
Testing, for example with pytest
Linting as part of the CI/CD pipeline

The text was updated successfully, but these errors were encountered:

emmahodcroft · 2021-12-20T12:43:13Z

Hi @maltekuehl thank you for this suggestion! I'm afraid I'm not very strongly versed in linting. I generally code with Visual Studio Code, if there's something specific you'd recommend for helping to do this through that. Help would be welcomed to address this, as for me it is likely low on the priority list with the very limited resources we have! :)

maltekuehl · 2021-12-20T15:48:16Z

Thanks for the comment, @emmahodcroft! With this issue I just wanted to check if there's interest in this or if anyone's already working on it.
I originally took a look at the scripts folder in the context of #243 as I wanted to see what it would take to integrate the JHU/OWID prevalence data in the JSON data. Since I didn't want to randomly mess around with the code without a proper structure or introduce unwanted linting/formatting, I thought creating this issue might be a good first step :)

Now if you say that help would be welcomed, I might take a look at this and implement some basic linting and formatting over the Christmas days if I manage to have some time off and hopefully this will facilitate future additions to the python code.

ivan-aksamentov · 2021-12-20T16:57:22Z

Hi @maltekuehl,

Thanks for the suggestion. As you probably know, due to huge workload, academic researchers are not typically willing to spend time on learning and using the best practices, even for tools they use routinely, because there are more pressing activities they need to pursue. As an engineer, I'd be happy if the code was formatted and linted, but the most important thing is to not make obstacles for Emma to produce the results quickly, because they are so much needed during pandemic. In other words, linting is okay, as long as things gets done. Docs are probably is a complete waste of time, because this is not a library, the code changes very fast, and nobody ever reads it except Emma. Tests will be almost impossible, because Emma will not write them for the new code and the old ones will get obsolete the moment they are written.

There's been attempts previously, like this:
#131

But important things to note is that

the checks should not be too noisy, i.e. we cannot force Emma's text editor to become permanently decorated with thousands of red and yellow squiggles. This means that the checks will need to be run separately from the development process. And there will likely be need to make these checks somewhat relaxed in the beginning.
the checks should be optional and to not prevent from running and releasing anything
I can imagine a periodic process where an engineer comes, runs lints and applies a few fixes/refactorings. Not too many at a time to ensure it does not break science
Python code runs offline and only the output JSON results of these runs (which we call "web data") are used in the web app (which is a static React app hosted on S3 and does not have any programmable back-end server). In other words, this code does not run in production, so it does not make sense to block the build and deployment like it is done in the linked PR.
The Python code is hard to run, because it requires large amounts of private data (which cannot be redistributed due to GISAID terms of service). That is, validating the result of refactoring is near impossible for a 3rdparty contributor.
It might be more useful to perform validation of the JSON results instead
Formatting will be super nice, but Emma will need to remember to use it.
Pre-commit hook will probably end in a disaster. Emma runs Windows and is not on very good terms with git :)
Not Python, but related: The web app is linted on build, on push and on deploy. There isn't much to test, because the web app is somewhat dummy - it mostly renders markdown and some plots from JSON. But it wouldn't hurt. Emma does not deal with web app and I will likely not have time to work on this.

Happy to hear your thoughts and to consider contributions!

…odcroftlab#245.

maltekuehl added enhancement New feature or request good first issue Good for newcomers help wanted Extra attention is needed needs triage Pending maintainers' attention labels Dec 18, 2021

emmahodcroft removed the needs triage Pending maintainers' attention label Dec 20, 2021

maltekuehl added a commit to maltekuehl/covariants that referenced this issue Dec 30, 2021

Provide linting and formatting setup with black and flake8. Resolves h…

54519d5

…odcroftlab#245.

maltekuehl linked a pull request Dec 30, 2021 that will close this issue

Provide (more) consistent linting and formatting with black and flake8 #251

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Consistent python style, linting and formatting for `scripts` folder #245

Consistent python style, linting and formatting for `scripts` folder #245

maltekuehl commented Dec 18, 2021

emmahodcroft commented Dec 20, 2021

maltekuehl commented Dec 20, 2021

ivan-aksamentov commented Dec 20, 2021 •

edited

Consistent python style, linting and formatting for scripts folder #245

Consistent python style, linting and formatting for scripts folder #245

Comments

maltekuehl commented Dec 18, 2021

emmahodcroft commented Dec 20, 2021

maltekuehl commented Dec 20, 2021

ivan-aksamentov commented Dec 20, 2021 • edited

Consistent python style, linting and formatting for `scripts` folder #245

Consistent python style, linting and formatting for `scripts` folder #245

ivan-aksamentov commented Dec 20, 2021 •

edited