Releases: plantnet/malpolon

v2.1.2

22 Nov 14:22

Main changes

  • Updated all examples to fix an inference path issue. Previously, PyTorch Lightning overwrote checkpoint_paths with the values stored in the saved checkpoint files. However, the checkpoints provided by the Malpolon team for pure inference contained absolute paths that were incompatible with other people's machines. Now, only relative paths are stored.
  • Updated all examples to prevent downloading the model weights twice when running models in inference mode
  • Updated URL and md5 checksum signature to download glc24_pre_extracted pre-trained weights
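
The two download-related changes above (skipping a second download and checking an md5 checksum) can be pictured with the minimal sketch below. The function names `md5_of_file` and `needs_download` are hypothetical and written for this note; they are not Malpolon's API, and the actual download logic may differ.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_download(path, expected_md5):
    """Hypothetical helper: True if the file is missing or its checksum differs."""
    path = Path(path)
    return (not path.is_file()) or md5_of_file(path) != expected_md5
```

A caller would fetch the weights only when `needs_download(...)` returns True, which avoids downloading a file that is already on disk and intact.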

Other changes

  • Added Malpolon QR code in project resources

v2.1.1

14 Nov 18:23
4687bf1

What's changed

Main changes

  • Updated link to glc24_pre_extracted model weights, addressing pos_weight loading issue following previous Malpolon updates.

v2.1.0

06 Nov 19:48

What's changed

Main changes

  • Added possibility for users to choose their optimizer and scheduler via their config file:
    • malpolon.models.utils: Changed behavior of check_optimizer() and added check_scheduler() to allow users to input one or several optimizers (and optionally 1 scheduler per optimizer, possibly with a lr_scheduler_config descriptor) via their config files.
    • malpolon.models.standard_prediction_systems: changed instantiation of optimizer(s) and scheduler(s) in class GenericPredictionSystem. The class attributes are now lists of instantiated optimizers (respectively, of lr_scheduler_config dictionaries). Updated behavior of method configure_optimizers() to return a dictionary containing all the optimizers and schedulers (cf. https://lightning.ai/docs/pytorch/stable/api/lightning.pytorch.core.LightningModule.html#lightning.pytorch.core.LightningModule.configure_optimizers).
    • Updated all examples and added all corresponding unit tests, testing both valid scenarios and edge cases of incorrect user inputs in the config file.
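
As a rough illustration of the kind of input validation described above, the sketch below normalizes a config section holding one or several optimizers, each with at most one scheduler. The config layout, the function name `normalize_optimizer_section` and the error messages are assumptions made for this example; this is not the code of check_optimizer() or check_scheduler(). Only the lr_scheduler_config key names are taken from the PyTorch Lightning documentation.

```python
from typing import Any, Dict, List, Mapping, Optional

# Optional keys PyTorch Lightning accepts in an lr_scheduler_config dictionary
# (the "scheduler" key itself holds the instantiated scheduler, so it is not
# expected in the user's config in this sketch).
LR_SCHEDULER_CONFIG_KEYS = {"interval", "frequency", "monitor", "strict", "name"}


def normalize_optimizer_section(
    section: Optional[Mapping[str, Any]],
) -> List[Dict[str, Any]]:
    """Flatten an assumed {optimizer_name: {kwargs, scheduler}} mapping.

    Returns one dictionary per optimizer; raises ValueError on inputs that
    this sketch treats as invalid.
    """
    if not section:
        raise ValueError("at least one optimizer must be declared")
    entries = []
    for opt_name, opt_cfg in section.items():
        opt_cfg = opt_cfg or {}  # an empty entry means "use default kwargs"
        schedulers = opt_cfg.get("scheduler") or {}
        if len(schedulers) > 1:
            raise ValueError(f"{opt_name}: at most one scheduler per optimizer")
        entry = {
            "optimizer": opt_name,
            "kwargs": dict(opt_cfg.get("kwargs") or {}),
            "scheduler": None,
        }
        for sch_name, sch_cfg in schedulers.items():
            sch_cfg = sch_cfg or {}
            lr_cfg = dict(sch_cfg.get("lr_scheduler_config") or {})
            unknown = set(lr_cfg) - LR_SCHEDULER_CONFIG_KEYS
            if unknown:
                raise ValueError(f"{sch_name}: unknown keys {sorted(unknown)}")
            entry["scheduler"] = {
                "name": sch_name,
                "kwargs": dict(sch_cfg.get("kwargs") or {}),
                "lr_scheduler_config": lr_cfg,
            }
        entries.append(entry)
    return entries
```

Each returned entry could then be turned into an instantiated optimizer and, when present, an lr_scheduler_config dictionary before being handed to configure_optimizers().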

Others

  • In glc24_pre_extracted example: added habitat version dataset which consists of symbolic links to the species version. Running the habitat's main script will trigger the download of the data predictors (rasters, satellite, time-series).
  • Updated split_obs_per_species_frequency() to include more input arguments

v2.0.0

08 Sep 23:50

What's Changed

Main changes

  • Added GLC24 pre_extracted habitat dataset and example (see PR 58 in the Links section)

  • Changed the way checkpoints are loaded: instead of loading the state_dict of the model object, Malpolon now loads the state_dict of the LightningModule. This is a breaking change: examples had to be updated to remove the replacement of the "model." string in the loaded state_dict.

  • Added possibility to download model weights for any Malpolon model given a URL and a few file paths

  • Updated the way checkpoint_path is passed on to models. Added an attribute checkpoint_path for all Malpolon models

    • Updated every example accordingly
  • Added Malpolon as (local) model provider.

    • Created new module malpolon.models.custom_models which will host custom models proposed by Malpolon
    • Split classes from geolifeclef2024_multimodal_ensemble.py to glc2024_multimodal_ensemble_model.py and glc2024_pre_extracted_prediction_system.py in custom_models to prevent circular import from malpolon.models.model_builder after adding Malpolon as (local) provider
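
The breaking change to checkpoint loading in this release concerns key prefixes: when a network is stored as an attribute named `model` on a wrapping module, its parameters appear in the wrapper's state_dict under keys starting with "model.". The sketch below mimics, with plain dictionaries, the kind of prefix replacement that examples no longer need. The helper `strip_prefix` and the key names are illustrative assumptions, not code taken from Malpolon.

```python
def strip_prefix(state_dict, prefix="model."):
    """Return a copy of state_dict with `prefix` removed from keys that start with it."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


# Keys as they could appear in a checkpoint saved from the wrapping module.
lightning_style = {"model.conv1.weight": [0.1], "model.fc.bias": [0.0]}

# Old approach: rewrite the keys so that they match the bare model object.
bare_style = strip_prefix(lightning_style)
print(sorted(bare_style))  # ['conv1.weight', 'fc.bias']
```

With the state_dict loaded at the LightningModule level, the keys already match and no rewriting step is required.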

Others

  • Updated malpolon.data.data_module.export_predict_csv to enable more flexibility when outputting the prediction CSV for a single data point.

  • Added GLC24 pre-extracted examples (habitat and species) using the MultiModalEnsemble (MME) model

    • Automatic download of the dataset from Kaggle (depending on the value of boolean config parameter data.download_data)
    • Automatic download of the model weights from Seafile if not already on disk, via a new model.model_kwargs.pretrained key in the config file. The weights enable users to directly run our MME model on our GLC24_pre_extracted test set and reach ~30% micro F1-score with ~26% micro precision and ~36% micro recall, as well as ~96% micro AUC.
  • Added and updated unit tests for GLC24 pre-extracted examples (habitat and species)

  • Added new content in online documentation and tutorial files
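
The micro-averaged scores quoted for the MME weights in this release are tied together by the usual relation F1 = 2PR / (P + R). The short sketch below, written for this note with made-up counts, shows how micro precision, recall and F1 are derived from per-class counts, and checks that 26% precision and 36% recall do give roughly 30% F1.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def micro_scores(true_pos, false_pos, false_neg):
    """Micro-averaging: sum the counts over all classes, then compute the scores."""
    tp, fp, fn = sum(true_pos), sum(false_pos), sum(false_neg)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_from_precision_recall(precision, recall)


# Consistency check on the rounded figures quoted in the release note.
print(round(f1_from_precision_recall(0.26, 0.36), 3))  # 0.302

# Toy per-class counts (illustrative only, not GLC24 results).
print(micro_scores([2, 1], [1, 0], [0, 3]))  # (0.75, 0.5, 0.6)
```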

Full Changelog: v1.3.0...v2.0.0

v1.3.0

13 Aug 09:56
d91e323

What's Changed

Main changes

  • Created new module malpolon.models.custom_models which will host custom models proposed by Malpolon

    • Split datamodule and model from geolifeclef2024_multimodal_ensemble.py to glc2024_multimodal_ensemble_model.py and glc2024_pre_extracted_prediction_system.py in custom_models.
  • Added malpolon as model provider. Currently we only provide the MultiModalEnsemble (MME) model, which can be selected through the "model_name" key of config files as: glc24_multimodal_ensemble (see repository examples/benchmark/geolifeclef/geolifeclef2024_pre_extracted/config/glc24_cnn_multimodal_ensemble.yaml)

  • Added possibility to download model weights for any Malpolon model given a URL and a few file paths via malpolon.standard_prediction_system.download_weights

    • Added model weight download info for the MME model. The example experiment file of MME now automatically downloads the weights from Seafile if not already on disk, via model.model_kwargs.pretrained key in the config file
  • Updated the way checkpoint_path is passed on to models. Added an attribute checkpoint_path for all Malpolon models

    • Updated every example accordingly

Others

  • MME: changed the way loss parameter loss.pos_weight is used in the model's _step() method so that its state_dict object stays the same before and after running the model in train mode.

  • GLC22 examples in benchmark and custom_train have been updated to include an inference run option. This led to changing the return values of the class getter for the test dataset. The class now always returns a {data, label} pair, with a label of value -1 for the test dataset (inference run)

    • Updated malpolon/tests/test_examples.py accordingly
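
The getter behavior described for the GLC22 examples, where every item is a data/label pair and unlabeled test items carry the label -1, can be sketched as follows. The class name `ObservationDataset` and its constructor are hypothetical and only serve to illustrate the convention; they do not reproduce the Malpolon dataset classes.

```python
class ObservationDataset:
    """Toy dataset that always returns a (data, label) pair."""

    def __init__(self, items, labels=None):
        self.items = list(items)
        # labels is None for a test set used in an inference run
        self.labels = None if labels is None else list(labels)

    def __len__(self):
        return len(self.items)

    def __getitem__(self, index):
        # -1 is a placeholder label meaning "no ground truth available"
        label = -1 if self.labels is None else self.labels[index]
        return self.items[index], label


train_set = ObservationDataset(["patch_a", "patch_b"], labels=[12, 7])
test_set = ObservationDataset(["patch_c"])
print(train_set[1])  # ('patch_b', 7)
print(test_set[0])   # ('patch_c', -1)
```

Returning the same pair shape in both cases lets training and inference code unpack items identically.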

v1.2.1

30 Jul 13:25

Changes

  • Fixed models import from malpolon.models

Other

  • Purged unwanted dev files from the PyPI package

v1.2.0

29 Jul 17:47

New features

  • Datasets

    • Added a new dataset geolifeclef2024_pre_extracted following 2024 edition of Kaggle challenge GeoLifeCLEF
      • Computed rolling mean and rolling std values of GeoLifeCLEF2024 dataset for each modality. These values are stored in this dataset's transform functions
  • Models

    • Added a new model "MultimodalEnsemble" in geolifeclef2024_multimodal_ensemble based on @picekl work on GeoLifeCLEF2024
  • Scripts

    • Added new scripts split_obs_spatially.py, sort_files_glc_fashion.sh
      • split_obs_spatially.py: splits a CSV observation dataset into a training and a val subset where val observation plots are spatially separated from training ones. This script uses the new verde package.
      • sort_files_glc_fashion.sh:

        This script re-organizes the files of one folder into folders and sub-folders in the same way as for the GeoLifeCLEF challenge.

        A file named 'ABCDWXYZ.pt' located at 'root_path/' will be moved to
        'root_path/YZ/WX/ABCDWXYZ.pt'.

        Each file name must be at least 3 characters long. For instance,
        a file named 'XYZ.pt' located at 'root_path/' will be moved to
        'root_path/YZ/X/XYZ.pt'.

      • split_obs_per_species_frequency: splits a CSV observation dataset into a training and a val subset based on species frequency
    • Added split_obs_spatially.py and split_obs_per_species_frequency.py scripts to Malpolon as modules in malpolon.data.utils
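
The folder layout produced by sort_files_glc_fashion.sh can be restated as a small path computation. The sketch below is a Python illustration of the rule given in the description, not a port of the shell script; the function name `glc_destination` is hypothetical, and it assumes that the 3-character minimum applies to the file name without its extension.

```python
from pathlib import PurePosixPath


def glc_destination(root, filename):
    """Return where `filename` would be moved under `root` in the GLC layout."""
    stem = PurePosixPath(filename).stem  # file name without its extension
    if len(stem) < 3:
        raise ValueError("file name must be at least 3 characters long")
    first_level = stem[-2:]        # last two characters, e.g. 'YZ'
    second_level = stem[:-2][-2:]  # up to two characters before them, e.g. 'WX'
    return str(PurePosixPath(root) / first_level / second_level / filename)


print(glc_destination("root_path/", "ABCDWXYZ.pt"))  # root_path/YZ/WX/ABCDWXYZ.pt
print(glc_destination("root_path/", "XYZ.pt"))       # root_path/YZ/X/XYZ.pt
```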

Changes

  • Renamed scripts folder to toolbox
  • Renamed scenarios from {"Ecologists", "Inference", "Kaggle"} to {"Custom_train", "Inference", "Benchmarks"} and re-organized experiments
  • Fixed examples-related bugs, file links, duplicate files and cleaned config files
  • Updated code documentation, repository READMEs and examples tutorial files

v1.1.0

04 Jun 09:30
43f07b8

New features

  • New dataset ConcatPatchRasterDataset to handle both satellite image patches and geolocalized rasters in the same model
    • Added example using this new dataset
  • Added standalone scripts
    • crop_rasters.py: This script crops a window from raster files based on coordinates and outputs it as a new file.
    • split_obs_per_species_frequency.py: This script splits an obs csv in val/train based on the frequency of occurrences in the whole dataset. It does NOT perform a spatial split.
    • split_obs_spatially.py: This script splits an obs csv in val/train based on the observations' geographic locations using the Verde package.
    • sort_files_glc_fashion.sh:
      This script re-organizes the files of one folder into folders and sub-folders in the same way as for the GeoLifeCLEF challenge.
      A file named 'ABCDWXYZ.pt' located at 'root_path/' will be moved to 'root_path/YZ/WX/ABCDWXYZ.pt'.
      Each file name must be at least 3 characters long. For instance, a file named 'XYZ.pt' located at 'root_path/' will be moved to 'root_path/YZ/X/XYZ.pt'.
  • Added CIFAR-10 example
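
To give an idea of what a frequency-based (non-spatial) train/val split such as the one performed by split_obs_per_species_frequency.py can look like, here is a minimal sketch. The rule used here (species with fewer than `min_count` observations stay in train, the others send a fixed share to val) and the names `split_by_species_frequency`, `val_ratio` and `min_count` are assumptions chosen for illustration; the script's actual criteria and arguments may differ.

```python
from collections import defaultdict


def split_by_species_frequency(species_ids, val_ratio=0.2, min_count=5):
    """Split observation indices into (train, val) using per-species counts."""
    by_species = defaultdict(list)
    for index, species in enumerate(species_ids):
        by_species[species].append(index)

    train, val = [], []
    for indices in by_species.values():
        if len(indices) < min_count:
            # Rare species: keep every observation for training.
            train.extend(indices)
            continue
        n_val = max(1, int(len(indices) * val_ratio))
        val.extend(indices[:n_val])
        train.extend(indices[n_val:])
    return sorted(train), sorted(val)


observations = ["oak"] * 10 + ["orchid"] * 2
train_idx, val_idx = split_by_species_frequency(observations)
print(val_idx)         # [0, 1]
print(len(train_idx))  # 10
```

This sketch takes the first observations of each species for val to stay deterministic; a real split would typically shuffle them first.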

Changes

  • Harmonized datasets class arguments and kwargs
  • Reduced examples config files values redundancy by using variable interpolation
  • Changed metric logging parameters for tensorboard logger to include more details
  • Fixed multilabel inference export for test_dataset

v1.0.3

04 Mar 11:15

First release of Malpolon's framework.

Try it out now!
https://pypi.org/project/malpolon/

(Versions 1.0.0 to 1.0.2 do not exist)