Thanks for considering contributing to Ploomber!
For general information, see Ploomber's contribution guidelines.
Issues tagged with good first issue are great options to start contributing.
If you get stuck, open an issue or reach out to us on Slack and we'll happily help you.
If you're contributing to the documentation, go to doc/CONTRIBUTING.md.
The easiest way to set up the development environment is via the setup command; you must have miniconda installed. If you don't want to use conda, skip to the next section.
See the miniconda documentation for installation details.
Make sure conda has conda-forge as a channel by running the following:
conda config --add channels conda-forge
Once you have conda ready:
Fork the repository to your account by clicking the Fork button.
Now you're ready to clone the repository and set up the environment:
# get the code
git clone https://github.com/ploomber/ploomber
# invoke is a library we use to manage one-off commands
pip install invoke
# move into ploomber directory
cd ploomber
# setup development environment
invoke setup
Note: If you're using Linux, you may encounter issues with the psycopg2 package when running invoke setup. If that's the case, remove psycopg2 from the setup.py file and try again.
Then activate the environment:
conda activate ploomber
Ploomber has optional features that depend on packages that aren't straightforward to install, so we use conda for quickly setting up the development environment. But you can still get a pretty good development environment using pip alone.
Note: we highly recommend installing ploomber in a virtual environment (the most straightforward alternative is the built-in venv module):
# create virtual env
python -m venv ploomber-venv
# activate virtual env (linux/macOS)
source ploomber-venv/bin/activate
# activate virtual env (windows)
.\ploomber-venv\Scripts\activate
Note: Check venv docs to find the appropriate command if you're using Windows.
# required to run the next command
pip install invoke
# install dependencies with pip
invoke setup-pip
Note: If you're using Linux, you may encounter issues with the psycopg2 package when running invoke setup-pip. If that's the case, remove psycopg2 from the setup.py file and try again.
Conda takes care of installing all dependencies required to run the full test suite. However, we skip a few of them when installing with pip, because either the library isn't pip-installable or some of its dependencies aren't. So if you use invoke setup-pip to configure your environment, some tests will fail. This usually isn't a problem if you're developing a specific feature; you can run a subset of the test suite and let GitHub run the entire suite when you push your code.
However, if you wish to have a full setup, you must install the following dependencies:
- pygraphviz (depends on graphviz, which cannot be installed with pip)
- IRKernel (requires an R installation); one way to install both is sketched below
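For example, here is a minimal sketch of one way to install both, assuming you use conda with the conda-forge channel (the package names and the kernel registration step may vary with your setup):

# install pygraphviz (pulls in graphviz) and IRKernel from conda-forge
conda install -c conda-forge pygraphviz r-irkernel
# if the R kernel isn't registered with Jupyter automatically, register it
Rscript -e 'IRkernel::installspec()'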
Make sure everything is working correctly:
# import ploomber
python -c 'import ploomber; print(ploomber)'
Note: the output of the previous command should be the directory where you ran git clone; if it's not, try re-activating your conda environment (i.e., if using conda: conda activate base, then conda activate ploomber). If this doesn't work, open an issue or reach out to us on Slack.
Run some tests:
pytest tests/util
To prevent the same CI pipelines from running twice, we limit the GitHub push event: only pushes to certain branches trigger the pipelines. This means that if you enable GitHub Actions on your fork and want to run the workflows there, you need to either push directly to your master branch or use branch names that strictly follow the convention dev/{your-branch-name}.
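For example, to get the workflows running on your fork for a feature branch, name the branch following that convention (dev/my-feature below is just a placeholder):

# create a branch whose name matches the dev/{your-branch-name} convention
git checkout -b dev/my-feature
# push it to your fork so the workflows trigger on the push event
git push -u origin dev/my-feature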
On the other hand, if you don't enable GitHub Actions on your fork and simply run tests locally, you can disregard this: a pull request from your fork to the ploomber/ploomber repo will always trigger the pipelines.
Note: ploomber/ploomber is the only project where we use yapf; other projects have moved to black.
We use yapf for formatting code. Please run yapf on your code before submitting:
yapf --in-place path/to/file.py
If you want git to automatically check your code with flake8 before you push to your fork, you can install a pre-push hook locally:
# to install pre-push git hook
invoke install-git-hook
# to uninstall pre-push git hook
invoke uninstall-git-hook
The installed hook only takes effect in your current repository.
- Ploomber loads the user's code dynamically via dotted paths (e.g., my_module.my_function is similar to doing from my_module import my_function). Hence, some of our tests do this as well. Dynamic imports can become a problem if tests create and import modules (i.e., create a new .py file and import it). To prevent temporary modules from polluting other tests, use the tmp_imports pytest fixture, which deletes all packages imported inside a test.
- Some tests make calls to a PostgreSQL database. When running on GitHub Actions, a database is automatically provisioned, but those tests will fail locally.
- If you're checking error messages and they include absolute paths to files, you may run into issues on the Windows CI because the GitHub Actions VM has some symlinks. If the code being tested calls pathlib.Path.resolve() (which resolves symlinks), call it in the test as well; if it doesn't, use os.path.abspath() (which does not resolve symlinks).
Debugging GitHub Actions by committing, pushing, and then waiting for GitHub to run them can be inconvenient because of the clunky workflow and the inability to use debugging tools other than printing to the console. We can use the tool act to run GitHub Actions locally in Docker containers.
Install act, then run it in the root directory. On the first invocation, it will ask for an image size; select Medium. act will then run the actions from the .github/workflows directory.
If the tests fail, act will leave the Docker containers around after the action finishes. These can be inspected by running docker container list and then docker exec -it CONTAINER_ID bash, where CONTAINER_ID is a container ID from docker container list.
To install packages in the container, first run apt-get update; packages can then be installed normally with apt.
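Putting the steps together, a debugging session might look like the following, where CONTAINER_ID is a placeholder for an ID printed by docker container list:

# run the workflows from .github/workflows locally (choose the Medium image when prompted)
act
# after a failed run, list the containers act left behind
docker container list
# open a shell inside one of them
docker exec -it CONTAINER_ID bash
# inside the container: refresh the package index, then install what you need with apt
apt-get update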