The Flock LLM Validator script is a modular system for doing validation for Flock AI Arena validation assignments. It is designed to manage an isolated conda environment for each task type, fetch validation assignments from the Flock API, execute the validation, and submit results back to the API. The system is extensible, allowing new validation modules or task types to be added easily.
The project is split into two layers:
- The outer layer that runs outside conda and is responsible for managing and running conda environments for each module.
- The inner layer that runs inside conda and is responsible for all validation and assignment orchestration logic.
- Install miniconda (e.g. miniconda)
- Install dependencies:
pip install -r requirements.txt
Use the CLI to start the validation process. The CLI settings are defined in validator/entrypoint.py
.
python run.py \
<module> \
--task_ids <task_id1,task_id2,...> \
--flock-api-key <your_flock_api_key> \
--hf-token <your_hf_token>
<module>
: Name of the validation module to use (e.g.,lora
).--task_ids
: Comma-separated list of task IDs to validate.--flock-api-key
: Flock API key (can also be set via theFLOCK_API_KEY
environment variable).--hf-token
: HuggingFace token (can also be set via theHF_TOKEN
environment variable).--time-sleep
: (Optional) Time to sleep between retries in seconds (default: 180).--assignment-lookup-interval
: (Optional) Assignment lookup interval in seconds (default: 180).--debug
: (Optional) Enable debug mode.
python run.py lora --task_ids 123,456 --flock-api-key $FLOCK_API_KEY --hf-token $HF_TOKEN
You can set the following environment variables instead of passing them as CLI options:
FLOCK_API_KEY
HF_TOKEN
TIME_SLEEP
ASSIGNMENT_LOOKUP_INTERVAL
- Create a new subdirectory or file in
validator/modules/
. - Implement a class that inherits from
BaseValidationModule
and define the required schemas. - Expose your module class as
MODULE
in the module's__init__.py
. - The system will dynamically load your module when specified via the CLI.
- Module and task-specific configuration files should be placed in the
configs/
directory. - Per-task-type config:
configs/<module>.json
- Per-task-id config (which overrides the per-task-type config):
configs/tasks/<task_id>.json
- Recoverable errors (subclass
RecoverableException
) will not mark the assignment as failed and will log the error. - Other exceptions will mark the assignment as failed.