Contributing to Ploomber

Thanks for considering contributing to Ploomber!

For general information, see Ploomber's contribution guidelines.

Issues tagged with good first issue are great options to start contributing.

If you get stuck, open an issue or reach out to us on Slack and we'll happily help you.

If you're contributing to the documentation, go to doc/CONTRIBUTING.md.

Setup with conda

The easiest way to set up the development environment is via the setup command; you must have miniconda installed. If you don't want to use conda, skip to the next section.

Click here for miniconda installation details.

Make sure conda has conda-forge as a channel by running the following:

conda config --add channels conda-forge

Once you have conda ready:

Fork the repository to your account by clicking the Fork button.

You are now ready to clone the repository and set up the environment:

# get the code
git clone https://github.com/ploomber/ploomber

# invoke is a library we use to manage one-off commands
pip install invoke

# move into ploomber directory
cd ploomber

# set up development environment
invoke setup

Note: If you're using Linux, you may encounter issues running invoke setup regarding the psycopg2 package. If that's the case, remove psycopg2 from the setup.py file and try again.

Then activate the environment:

conda activate ploomber

Setup with pip

Ploomber has optional features that depend on packages that aren't straightforward to install, so we use conda for quickly setting up the development environment. But you can still get a pretty good development environment using pip alone.

[Optional] Create virtual environment

Note: we highly recommend installing ploomber in a virtual environment (the most straightforward alternative is the built-in venv module):

# create virtual env
python -m venv ploomber-venv

# activate virtual env (linux/macOS)
source ploomber-venv/bin/activate

# activate virtual env (windows)
.\ploomber-venv\Scripts\activate

Note: Check venv docs to find the appropriate command if you're using Windows.

Install dependencies

# required to run the next command
pip install invoke

# install dependencies with pip
invoke setup-pip

Note: If you're using Linux, you may encounter issues running invoke setup-pip regarding the psycopg2 package. If that's the case, remove psycopg2 from the setup.py file and try again.

Caveats of installing with pip

Conda takes care of installing all dependencies required to run all tests. However, we need to skip a few of them when installing with pip because either the library itself or one of its dependencies is not pip-installable. So if you use invoke setup-pip to configure your environment, some tests will fail. This isn't usually a problem if you're developing a specific feature; you can run a subset of the test suite locally and let GitHub run the entire suite when you push your code.

However, if you wish to have a full setup, you must install the following dependencies:

  1. pygraphviz (note that this depends on graphviz, which can't be installed with pip)
  2. IRKernel (note that this requires an R installation)

Checking setup

Make sure everything is working correctly:

# import ploomber
python -c 'import ploomber; print(ploomber)'

Note: the output of the previous command should point to the directory where you ran git clone; if it doesn't, try re-activating your environment (e.g., if using conda: conda activate base, then conda activate ploomber). If this doesn't work, open an issue or reach out to us on Slack.
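If you prefer a more explicit check than reading the printed module path, the following is a minimal sketch. The helper names package_location and is_inside are hypothetical and not part of Ploomber, and the script assumes you run it from the directory that contains your clone.

```python
import importlib.util
from pathlib import Path


def package_location(name):
    """Return the directory `name` would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).resolve().parent


def is_inside(path, root):
    """Return True if `path` is `root` or lives somewhere below it."""
    try:
        Path(path).resolve().relative_to(Path(root).resolve())
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    location = package_location("ploomber")
    if location is None:
        print("ploomber is not importable from this environment")
    else:
        # assumption: the current directory is (or contains) your clone
        print(location, "inside current directory:", is_inside(location, Path.cwd()))
```

If the last line reports False, the package is probably being imported from a different installation than your clone.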

Run some tests:

pytest tests/util

Branch name requirement

To prevent double execution of the same CI pipelines, we restrict which GitHub push events trigger them: only pushes to certain branches do. This means that if you have enabled GitHub Actions in your fork and want the workflows to run there, you must push either directly to your master branch or to a branch whose name strictly follows this convention: dev/{your-branch-name}.

On the other hand, if you choose not to enable GitHub Actions in your fork and simply run the tests locally, you can disregard this information, since a pull request from your fork to ploomber/ploomber will always trigger the pipelines.
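As a quick illustration of the naming rule, here is a small sketch that mirrors the convention described above. The function name triggers_push_pipeline is hypothetical; the authoritative filter is whatever the workflow files in .github/workflows define, so treat this only as a reading aid.

```python
def triggers_push_pipeline(branch):
    """Return True if a push to `branch` matches the documented convention.

    Assumption: only `master` and `dev/<name>` branches qualify, as the
    text above states; the real rule is defined by the workflow files.
    """
    if branch == "master":
        return True
    prefix = "dev/"
    # require a non-empty name after the prefix
    return branch.startswith(prefix) and len(branch) > len(prefix)


for name in ["master", "dev/fix-typo", "fix-typo", "dev/"]:
    print(name, triggers_push_pipeline(name))
```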

Linting

Note: ploomber/ploomber is the only project where we use yapf; other projects have moved to black.

We use yapf for formatting code. Please run yapf on your code before submitting:

yapf --in-place path/to/file.py

If you want git to automatically check your code with flake8 before you push to your fork, you can install a pre-push hook locally:

# to install pre-push git hook
invoke install-git-hook

# to uninstall pre-push git hook
invoke uninstall-git-hook

The installed hook only takes effect in your current repository.

Testing

  • Ploomber loads users' code dynamically via dotted paths (e.g., my_module.my_function is similar to doing from my_module import my_function). Hence, some of our tests do this as well. Dynamic imports can become a problem if tests create and import modules (i.e., create a new .py file and import it). To prevent temporary modules from polluting other tests, use the tmp_imports pytest fixture, which deletes all packages imported inside a test.
  • Some tests make calls to a PostgreSQL database. When running on GitHub Actions, a database is automatically provisioned, but these tests will fail locally.
  • If you're checking error messages and they include absolute paths to files, you may encounter some issues when running the Windows CI, since the GitHub Actions VM has some symlinks. If the code under test calls pathlib.Path.resolve() (resolves symlinks), call it in the test as well; if it doesn't, use os.path.abspath() (does not resolve symlinks).
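The difference between the two path functions in the last point can be seen with a small, self-contained sketch. The directory and file names (real, link, file.txt) are made up for illustration, and creating symlinks may require extra privileges on Windows.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # resolve the temporary directory first so the only symlink left is ours
    root = Path(tmp).resolve()

    target = root / "real"
    target.mkdir()
    (target / "file.txt").touch()

    # link -> real
    link = root / "link"
    os.symlink(target, link, target_is_directory=True)

    path = link / "file.txt"
    absolute = os.path.abspath(path)   # keeps the symlink in the path
    resolved = str(path.resolve())     # follows the symlink to the real location

print(absolute)
print(resolved)
```

An expected error message built from one form will not match a message built from the other, which is why the test should use the same function as the code it exercises.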

Locally running GitHub actions

Debugging GitHub Actions by committing, pushing, and then waiting for GitHub to run them can be inconvenient: the workflow is clunky, and the only debugging tool available is printing to the console.

We can use the tool act to run GitHub Actions locally in Docker containers.

Install act, then run it in the repository's root directory. On the first invocation, it will ask for an image size; select medium. act will then run the actions from the .github/workflows directory.

Working with containers

If the tests fail, act will leave the Docker containers in place after the action finishes. You can inspect them by running docker container list and then docker exec -it CONTAINER_ID bash, where CONTAINER_ID is a container ID shown by docker container list.

To install packages in the container, first run apt-get update. After that, packages can be installed normally with apt.