- To install the package locally for development, clone this repo and run
pip install --editable .
from the repo's root directory. Changes to the code are reflected automatically in the CLI. The Makefile
contains sample commands for quick installation and testing, though you will have to specify filepaths and API keys manually.
- Run
export CLEANLAB_API_BASE_URL="http://localhost:8500/api"
so that API requests are sent to the API server running on your local machine.
To run all tests locally:
export CLEANLAB_API_BASE_URL=<CLEANLAB_API_BASE_URL>
cleanlab login --key <CLEANLAB-API-KEY>
pytest --verbose
When making changes to any code that touches the TLM, the TLM tests must pass before the PR can be merged. To launch the TLM tests, comment /test-tlm
in the PR.
Cleanlab CLI uses Black to standardize code formatting. Black is configured in pyproject.toml.
Developers should also set up pre-commit hooks to re-format any changed code prior to a commit. The configuration for pre-commit is in pre-commit-config.yaml and pre-commit-hooks.yaml.
On every push to the repo, GitHub Actions also checks for formatting issues using Black. The configuration options are in .github/workflows/format.yml.
To set up pre-commit:
python -m pip install -r requirements_dev.txt
pre-commit install
To run the formatter manually:
black .
- Update the version number in the repo (see Incrementing the package version number)
- Commit the changes in a commit titled with the new version number v0.[x].[y] (substitute x and y).
- Tag the commit with the new version number.
- Push the commit to main.
Tagging the commit should trigger the GitHub Actions workflow that builds and releases the package.
References:
- https://packaging.python.org/en/latest/tutorials/packaging-projects/
- https://twine.readthedocs.io/en/stable/
- python3 -m pip install --upgrade build
- python3 -m pip install --upgrade twine
- rm -rf dist (if present)
- python3 -m build
To upload to TestPyPI:
- twine upload -r testpypi dist/*
- pip install -i https://test.pypi.org/simple/ cleanlab-studio
This last step may fail if test versions of some required packages are not available.
To upload to PyPI:
- twine upload dist/*
For users, there is only one version number to keep track of: the CLI package version.
For developers, there are four version numbers to keep track of:
1. The CLI package version
2. Cleanlab Studio CLI API version (currently v0)
3. The schema version number
4. The CLI settings version number
The latest version numbers for (1), (2), and (4) are stored in version.py.
Each version of the CLI supports some:
- MIN_SCHEMA_VERSION: Minimum schema version number
- MIN_SETTINGS_VERSION: Minimum CLI settings version number
If a user provides a schema with version number < MIN_SCHEMA_VERSION, it cannot be used. A new one must be generated.
If a CLI settings file has a version number < MIN_SETTINGS_VERSION, it cannot be used as is. The CLI will attempt to migrate it (with the user's permission).
Similarly, each version of the Cleanlab Studio CLI API supports some:
- MIN_CLI_VERSION: Minimum CLI version number
The CLI, upon initializing, pings the CLI API with its version number to check if it is compatible. If the CLI version < MIN_CLI_VERSION, then the user is prompted to upgrade their cleanlab-studio package.
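The check described above can be sketched in a few lines of Python. This is a minimal illustration rather than the CLI's actual code: the helpers parse_version and needs_upgrade are hypothetical, and the sketch assumes plain dotted numeric version strings (such as 1.2.3) with the same number of components on both sides.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '1.2.3' (or 'v1.2.3') into (1, 2, 3) so versions compare numerically.

    Hypothetical helper: assumes every dot-separated component is an integer.
    """
    return tuple(int(part) for part in version.strip().lstrip("v").split("."))


def needs_upgrade(cli_version: str, min_cli_version: str) -> bool:
    """Return True when the CLI is older than the API's minimum supported version.

    Hypothetical helper: both arguments are assumed to have the same number
    of components, since (1, 2) compares lower than (1, 2, 0).
    """
    return parse_version(cli_version) < parse_version(min_cli_version)
```

Comparing parsed tuples instead of raw strings matters once a component reaches two digits: as strings, "1.9.0" sorts after "1.10.0", while as tuples (1, 9, 0) is correctly lower than (1, 10, 0).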
Each version of the CLI also supports some:
- MAX_SCHEMA_VERSION: Maximum schema version number
- MAX_SETTINGS_VERSION: Maximum CLI settings version number
These are the maximum versions for the schema / settings that the CLI is able to handle. Every time the schema / settings version is incremented, MAX_SCHEMA_VERSION / MAX_SETTINGS_VERSION should be updated as well.
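Taken together, the minimum and maximum versions define a supported range. A rough sketch of such a range check follows. The names Compatibility and check_supported are hypothetical and not taken from the codebase, and versions are again assumed to be dotted numeric strings with the same number of components.

```python
from enum import Enum
from typing import Tuple


class Compatibility(Enum):
    """Hypothetical result type for a version range check."""

    OK = "ok"
    TOO_OLD = "too_old"  # below the minimum supported version
    TOO_NEW = "too_new"  # above the maximum supported version


def _as_tuple(version: str) -> Tuple[int, ...]:
    # Assumption: every dot-separated component is an integer, e.g. '0.2.0'.
    return tuple(int(part) for part in version.split("."))


def check_supported(version: str, min_version: str, max_version: str) -> Compatibility:
    """Classify a schema or settings version against the supported range."""
    parsed = _as_tuple(version)
    if parsed < _as_tuple(min_version):
        return Compatibility.TOO_OLD
    if parsed > _as_tuple(max_version):
        return Compatibility.TOO_NEW
    return Compatibility.OK
```

Under this sketch, a schema reported as TOO_OLD would have to be regenerated, while a settings file reported as TOO_OLD would be offered for migration. A TOO_NEW result presumably means the CLI itself should be upgraded, although the text above does not state this explicitly.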
For each release of the CLI, update the minimum supported versions whenever there is a change in how the CLI interfaces with the settings or schema.
Example of a change that requires an update:
- The CLI now expects the settings / schema to have a new key, so older settings / schemas would be incompatible with the new CLI.
Examples of changes that do not require an update:
- The CLI does additional checks on the schema that it did not do before, but the schema format is unchanged.
- The CLI adds new functionality for interfacing with the Cleanlab Studio CLI API, but no new behavior is introduced for interfacing with settings / schema.
Whenever the CLI API is updated, update the minimum supported CLI version if the change alters the interface between the API and the CLI in a way that breaks compatibility. Every endpoint in the CLI API is used by the CLI in some way, so every change should prompt the question: does this change in the API break the oldest supported versions of the CLI?
Example of a breaking change:
- An API endpoint returns different values from before, e.g. int vs string, tuple vs dict.
Examples of non-breaking changes:
- Refactoring the internal implementation while the endpoints and their returned values do not change.
- The API supports additional endpoints compared to before.
- An API endpoint's return value changes, but not in a compatibility-breaking way, e.g. it returns a dict with an additional key. Depending on how the CLI uses the dict (checking the dict length vs fetching expected keys), this may be fine!
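The last case, a dict that gains a key, is easiest to see with a small example. The sketch below is purely illustrative: the response shapes and the two parser functions are hypothetical and do not correspond to real CLI API endpoints.

```python
from typing import Tuple

# Hypothetical API responses: the newer API adds a "created_at" key.
old_response = {"id": "abc", "status": "ready"}
new_response = {"id": "abc", "status": "ready", "created_at": "2024-01-01"}


def parse_by_key(response: dict) -> Tuple[str, str]:
    """Tolerant client: reads only the keys it needs and ignores extras."""
    return response["id"], response["status"]


def parse_by_length(response: dict) -> Tuple[str, str]:
    """Fragile client: assumes the response has exactly two keys."""
    if len(response) != 2:
        raise ValueError(f"expected 2 keys, got {len(response)}")
    return response["id"], response["status"]
```

With these definitions, parse_by_key accepts both responses, whereas parse_by_length raises ValueError on new_response. An old CLI written in the second style would break on the new API even though no existing key changed.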
Every time the version number is incremented, these parts of the codebase need to be updated:
- cleanlab_studio/version.py
- README.md
We autogenerate documentation for the studio module from docstrings. For the generated docs to be formatted correctly, use Google style docstrings.
If you are deprecating a method, please add the method name to deprecated_methods.yaml so that it is properly hidden from the docs page.
- To link to other functions in the studio module within a docstring, use the following format: [link text](#method-function_name)
- To link to another page in Cleanlab Studio docs, use [link text](/{path within cleanlab-studio-docs/docs})
If there are any function/class docstrings that you would like to hide from the API documentation page, include lazydocs: ignore in the docstring of that function/class so that our docs build skips it.