pyro-eval

Library to evaluate Pyronear ML models 🔥

Context

This module aims at providing an evaluation pipeline to measure and compare the performance of Pyronear algorithms. It is split into two parts:

  • Dataset management
  • Metrics computation

Installation

Python dependencies

Make sure you have uv installed, then clone this repo and install dependencies:

git clone git@github.com:earthtoolsmaker/pyro-eval.git
uv sync

Note:

This repo uses the pyro-engine repo as a dependency: make sure to run uv sync to retrieve changes made to that repo.

Data dependencies

To get the data dependencies, one can use DVC. To fully use this repository, you need access to our DVC remote storage, which is currently reserved for Pyronear members. On request, you will be provided with AWS credentials to access our remote storage.

Pull all the data files tracked by DVC using this command:

dvc pull

Scaffolding

One can use the default ./data folder to store datasets and models to run evaluation on:

  • Models: One can use the ./data/models/ folder to store models to evaluate.
  • Datasets: One can use the ./data/datasets/ folder to store the datasets.
  • Evaluation Results: By default, the results of the evaluation runs are stored under ./data/evaluation/.

Example of files under ./data/:

$ tree -L 3
.
├── datasets
│   ├── gitkeep
│   ├── wildfire_test
│   │   ├── data.yaml
│   │   ├── images
│   │   ├── labels
│   │   └── wildfire_test.csv
│   └── wildfire_test_temporal
│       ├── data.yaml
│       ├── images
│       ├── labels
│       └── wildfire_test_temporal.csv
├── evaluation
│   ├── gitkeep
│   └── runs
│       └── run-20250522-1457-7552
└── models
    ├── artistic-alpaca_v1.1.0_fe129f2.onnx
    ├── artistic-alpaca_v1.1.0_fe129f2.pt
    └── gitkeep

Usage

run_evaluation.py

This script runs the evaluation of the models on the provided test dataset.

uv run python ./scripts/run_evaluation.py \
  --dir-models ./data/models/ \
  --dir-dataset ./data/datasets/wildfire_test/ \
  --dir-temporal-dataset ./data/datasets/wildfire_test_temporal/ \
  --dir-save ./data/evaluation/runs/ \
  --device cuda \
  --loglevel info

Evaluation Pipeline Design

The evaluation pipeline is composed of two steps: data preparation and metrics computation, respectively managed by the EvaluationDataset and EvaluationPipeline classes.

EvaluationDataset

The EvaluationDataset class helps create a custom dataset object suited for metric computation.

The object is instantiated from an existing image folder or a Hugging Face repo. A dataset ID can be passed as input; by default, the ID is computed from the current date and a custom hash of the dataset. When instantiating from a local folder, the following rules must be followed to ensure the class works properly:

  • The root folder must contain one subfolder named images and one named labels
  • The images folder must contain the image files, named with the following convention: *_Y-m-dTH-M-S.jpg, for example: seq_44_sdis-07_brison-200_2024-02-16T16-38-22.jpg (see the timestamp-parsing sketch below)
  • The labels folder must contain a label .txt file in the YOLOv8 TXT format for each image, with the coordinates of the ground-truth bounding box
dataset
├── images
│   ├── image1.jpg
│   └── image2.jpg
├── labels
│   ├── image1.txt
│   └── image2.txt
datapath = "path/to/dataset"
dataset_ID = "dataset_v0"
dataset = EvaluationDataset(datapath, dataset_ID=dataset_ID)
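
The timestamp embedded in each file name follows the strftime pattern %Y-%m-%dT%H-%M-%S. Below is a minimal sketch of how such a name can be parsed with the Python standard library; it only illustrates the naming convention and is not the parser used internally by the class.

from datetime import datetime
from pathlib import Path

# Example file name following the *_Y-m-dTH-M-S.jpg convention.
path = Path("seq_44_sdis-07_brison-200_2024-02-16T16-38-22.jpg")

# The timestamp is the last underscore-separated token of the file stem.
timestamp_str = path.stem.rsplit("_", 1)[-1]
timestamp = datetime.strptime(timestamp_str, "%Y-%m-%dT%H-%M-%S")
print(timestamp)  # 2024-02-16 16:38:22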

--dir-dataset is used to evaluate the model, while --dir-temporal-dataset is used to evaluate the engine on sequences of images.

EvaluationPipeline

The EvaluationPipeline class helps launch the evaluation on a given dataset. The evaluation is launched as follows:

evaluation = EvaluationPipeline(dataset=dataset)
evaluation.run()
evaluation.save_metrics()

The complete evaluation is composed of two parts: ModelEvaluator, which provides metrics on the model performance alone, and EngineEvaluator, which provides metrics on the whole detection pipeline in the PyroEngine.

The object can be instantiated with the following parameters as input (a minimal instantiation sketch follows the list):

  • self.dataset: EvaluationDataset object
  • self.config: config dictionary as described below
  • self.run_id: ID of the run, generated if not specified
  • self.use_existing_predictions: if True, we check for existing model predictions in the prediction folder; each prediction is saved in a JSON file named after the model hash. The model hash is also saved in a hash file next to the weights file.
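
A minimal instantiation sketch, assuming the keyword names for run_id and use_existing_predictions mirror the attribute names listed above (they may differ in the actual constructor); the device argument follows the launcher example further down.

# Illustrative sketch only: the run_id / use_existing_predictions keyword
# names are assumed to match the attribute names above.
evaluation = EvaluationPipeline(
    dataset=dataset,                # EvaluationDataset object
    config=config,                  # run configuration dictionary (see below)
    run_id="my-custom-run",         # optional: generated if not specified
    use_existing_predictions=True,  # reuse cached predictions when available
    device="cuda",
)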

config is a dictionary that describes the run configuration. If a parameter is not in the dictionary, it takes the default value from the Engine and Classifier classes in pyro-engine (a minimal example relying on defaults is shown after the key list below).

{
    "model_path" : "path/to/model.pt",
    "model" : {
        "iou" : 0,
        "conf" : 0.15,
        "imgsz" : 1024,
    },
    "engine" : {
        "conf_thresh" : 0.15,
        "max_bbox_size" : 0.4,
        "nb_consecutive_frames" : 8,
    },
    "eval" : ["model", "engine"]
}

With the following keys:

  • nb_consecutive_frames (int): Number of consecutive frames taken into account in the Engine
  • conf_thresh (float in [0., 1.]): Confidence threshold used in the Engine, below which detections are filtered out
  • conf (float in [0., 1.]): Confidence threshold used in the Classifier, below which detections are filtered out
  • max_bbox_size (float in [0., 1.]): Bounding box size above which detections are filtered out
  • iou (float in [0., 1.]): IoU threshold to compute matches between detected bboxes
  • eval (array of strs): Parts of the evaluation pipeline to run ("model", "engine", or both)
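
Since unspecified keys fall back to the pyro-engine defaults, a configuration can be as small as a model path. The dictionary below is an illustrative sketch relying entirely on those defaults, not an exhaustive schema.

# Minimal illustrative config: only the model path is set explicitly; every
# model and engine parameter falls back to the defaults of the Engine and
# Classifier classes in pyro-engine.
minimal_config = {
    "model_path": "path/to/model.pt",
    "eval": ["model", "engine"],
}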

Launcher configuration

The evaluation can be launched on several configurations at once. launcher.py is used to configure the runs:

configs = [
    {
        "model_path": "path/to/model_1.pt",
        "engine": {
            "conf_thresh": 0.1,
        },
    },
    {
        "model_path": "path/to/model_2.onnx",
        "engine": {
            "max_bbox_size": 0.12,
        },
        "eval": ["engine"],
    },
    {
        "model_path": "path/to/model_3.pt",
        "model": {
            "iou": 0,
        },
        "eval": ["engine"],
    },
]

for config in configs:
    evaluation = EvaluationPipeline(dataset=dataset, config=config, device="mps")
    evaluation.run()
    evaluation.save_metrics()

Results

Metrics are saved in the results folder, in a subdirectory named after the run_id. The data is stored in a JSON file with the following content (a loading sketch follows the key list).

The file contains:

  • model_metrics: result of ModelEvaluator
  • engine_metrics: result of EngineEvaluator
  • config: run configuration
  • dataset: dataset information
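
The results file can be inspected directly with the standard library. The sketch below assumes a metrics JSON file inside the run directory from the scaffolding example above; the exact file name is an assumption and may differ.

import json
from pathlib import Path

# Hypothetical path: the run directory is named after the run_id and the
# metrics file name may differ in your checkout.
results_path = Path("./data/evaluation/runs/run-20250522-1457-7552/metrics.json")

with results_path.open() as f:
    results = json.load(f)

print(results["config"])          # run configuration
print(results["dataset"])         # dataset information
print(results["model_metrics"])   # ModelEvaluator output
print(results["engine_metrics"])  # EngineEvaluator output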

Useful definitions

EvaluationDataset()

dataset = EvaluationDataset(datapath):

  • dataset.sequences: list of image Sequence objects within the dataset
  • dataset.hash: hash of the dataset
  • dataset.dataframe: pandas DataFrame describing the dataset

Sequence()

Sequence: object that represents a sequence of images.

  • sequence.images: list of CustomImage objects, corresponding to the images belonging to a single sequence
  • sequence.id: name of the sequence (name of the first image without extension)
  • sequence.sequence_start: timestamp of the first image of the sequence

CustomImage()

CustomImage: object describing an image (a traversal sketch follows the attribute list):

  • image.path: file path
  • image.sequence_id: name of the sequence the image belongs to
  • image.timedelta: time elapsed between the start of the sequence and this image
  • image.boxes: ground truth coordinates
  • image.prediction: placeholder to store a prediction
  • image.timestamp: capture date of the image
  • image.hash: image hash
  • image.label: boolean label, True if a wildfire is present, False otherwise
  • image.name: image name
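
These objects can be traversed together. The sketch below is only an illustration, assuming dataset is an EvaluationDataset instance built as shown earlier.

# Illustrative traversal of the dataset structure described above.
for sequence in dataset.sequences:
    print(sequence.id, sequence.sequence_start, len(sequence.images))
    for image in sequence.images:
        # Each CustomImage exposes its path, timestamp, ground truth and label.
        print(image.name, image.timestamp, image.label, image.boxes)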
