From 7070028a8585644fab75a83e49948bff5509641b Mon Sep 17 00:00:00 2001
From: Sungman Cho
Date: Tue, 26 Mar 2024 11:16:34 +0900
Subject: [PATCH] Fix incorrect information for the classification tasks (#3191)

* Fix incorrect information

* Reflect review
---
 .../classification/hierarhical_classification.rst |  4 ++--
 .../classification/multi_class_classification.rst |  9 ++-------
 .../classification/multi_label_classification.rst | 15 ++++-----------
 3 files changed, 8 insertions(+), 20 deletions(-)

diff --git a/docs/source/guide/explanation/algorithms/classification/hierarhical_classification.rst b/docs/source/guide/explanation/algorithms/classification/hierarhical_classification.rst
index f6c50a2297a..ecc32eb8a65 100644
--- a/docs/source/guide/explanation/algorithms/classification/hierarhical_classification.rst
+++ b/docs/source/guide/explanation/algorithms/classification/hierarhical_classification.rst
@@ -24,7 +24,7 @@ Assume, we have a label tree as below:
 
 The goal of our algorithm is to return the right branch of this tree. For example: ``Persian -> Cats -> Pets``
 
-At the inference stage, we traverse the tree from head to leaves and obtain labels predicted by the corresponding classifier.
+At the training / inference stage, we traverse the tree from head to leaves and obtain labels predicted by the corresponding classifier.
 Let's say, we forward an image with the label tree pictured above. On the first level, our corresponding classifier returns 3 predictions.
 
@@ -39,7 +39,7 @@ Dataset Format
 
 .. _hierarchical_dataset:
 
 For hierarchical image classification, we created our custom dataset format that is supported by `Datumaro `_.
-An example of the annotations format and dataset structure can be found in our `sample `_.
+An example of the annotations format and dataset structure can be found in our `sample `_.
 
 .. note::
diff --git a/docs/source/guide/explanation/algorithms/classification/multi_class_classification.rst b/docs/source/guide/explanation/algorithms/classification/multi_class_classification.rst
index 93ec1fbce23..e5ec6c7211d 100644
--- a/docs/source/guide/explanation/algorithms/classification/multi_class_classification.rst
+++ b/docs/source/guide/explanation/algorithms/classification/multi_class_classification.rst
@@ -6,18 +6,13 @@ For the supervised training we use the following algorithms components:
 
 .. _mcl_cls_supervised_pipeline:
 
-- ``Augmentations``: Besides basic augmentations like random flip and random rotate, we use `Augmix `_. This advanced type of augmentations helps to significantly expand the training distribution.
-
-- ``Optimizer``: `Sharpness Aware Minimization (SAM) `_. Wrapper upon the `SGD `_ optimizer that helps to achieve better generalization minimizing simultaneously loss value and loss sharpness.
-
-- ``Learning rate schedule``: `Cosine Annealing `_. It is a common learning rate scheduler that tends to work well on average for this task on a variety of different datasets.
+- ``Learning rate schedule``: `ReduceLROnPlateau `_. It is a common learning rate scheduler that tends to work well on average for this task on a variety of different datasets.
 
 - ``Loss function``: We use standard `Cross Entropy Loss `_ to train a model. However, for the class-incremental scenario we use `Influence-Balanced Loss `_. IB loss is a solution for the class imbalance, which avoids overfitting to the majority classes re-weighting the influential samples.
 
 - ``Additional training techniques``
-    - `No Bias Decay (NBD) `_: To add adaptability to the training pipeline and prevent overfitting.
     - ``Early stopping``: To add adaptability to the training pipeline and prevent overfitting.
-    - `Balanced Sampler `_: To create an efficient batch that consists of balanced samples over classes, reducing the iteration size as well.
+    - `Balanced Sampler `_: To create an efficient batch that consists of balanced samples over classes, reducing the iteration size as well.
 
 **************
 Dataset Format
diff --git a/docs/source/guide/explanation/algorithms/classification/multi_label_classification.rst b/docs/source/guide/explanation/algorithms/classification/multi_label_classification.rst
index 47f651492ad..0b32dbbaf1d 100644
--- a/docs/source/guide/explanation/algorithms/classification/multi_label_classification.rst
+++ b/docs/source/guide/explanation/algorithms/classification/multi_label_classification.rst
@@ -9,30 +9,23 @@ We solve this problem by optimizing small binary classification sub-tasks aimed
 
 For supervised learning we use the following algorithms components:
 
-- ``Augmentations``: Besides basic augmentations like random flip and random rotate, we use `Augmix `_. This advanced type of augmentation helps to significantly expand the training distribution.
-
-- ``Optimizer``: `Sharpness Aware Minimization (SAM) `_. Wrapper upon the `SGD `_ optimizer that helps to achieve better generalization minimizing simultaneously loss value and loss sharpness.
-
-- ``Learning rate schedule``: `One Cycle Learning Rate policy `_. It is the combination of gradually increasing the learning rate and gradually decreasing the momentum during the first half of the cycle, then gradually decreasing the learning rate and increasing the momentum during the latter half of the cycle.
+- ``Learning rate schedule``: `ReduceLROnPlateau `_. It is a common learning rate scheduler that tends to work well on average for this task on a variety of different datasets.
 
 - ``Loss function``: We use **Asymmetric Angular Margin Loss**. We can formulate this loss as follows: :math:`L_j (cos\Theta_j,y) = \frac{k}{s}y p_-^{\gamma^-}\log{p_+} + \frac{1-k}{s}(1-y)p_+^{\gamma^+}\log{p_-}`, where :math:`s` is a scale parameter, :math:`m` is an angular margin, :math:`k` is negative-positive weighting coefficient, :math:`\gamma^+` and :math:`\gamma^-` are weighting parameters. For further information about loss function, ablation studies, and experiments, please refer to our dedicated `paper `_.
 
-- Additionally, we use the `No Bias Decay (NBD) `_ technique, **Exponential Moving Average (EMA)** for the model's weights and adaptive **early stopping** to add adaptability and prevent overfitting.
+- Additionally, we use **early stopping** to add adaptability and prevent overfitting.
 
 **************
 Dataset Format
 **************
 
-As it is a common practice to use object detection datasets in the academic area, we support the most popular object detection format: `COCO `_.
-Specifically, this format should be converted in our `internal representation `_.
+The format should be converted into our `internal representation `_.
 
 .. note::
 
-    Names of the annotations files and overall dataset structure should be the same as the original `COCO `_. You need to convert train and validation sets separately.
+    Names of the annotation files and overall dataset structure should be the same as in the example above. You need to convert train and validation sets separately.
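The **Asymmetric Angular Margin Loss** kept by the hunk above can be written out as a short sketch for readers of this patch. It only illustrates the formula as printed: the mapping :math:`p_+ = \sigma(s(cos\Theta - m))` with :math:`p_- = 1 - p_+`, the leading minus sign, and every default value below are assumptions made for readability, not the OTX implementation.

.. code-block:: python

    import torch

    def asymmetric_angular_margin_loss(cos_theta, targets, s=23.0, m=0.05,
                                       k=0.8, gamma_pos=0.0, gamma_neg=1.0):
        """Illustrative only: cos_theta is (N, C) cosine similarity, targets is (N, C) multi-hot."""
        p_plus = torch.sigmoid(s * (cos_theta - m))  # assumed positive-class probability
        p_minus = 1.0 - p_plus                       # complementary negative-class probability
        eps = 1e-8
        # (k/s) * y * p_-^{gamma^-} * log(p_+) : positive-label term of the formula above
        pos_term = (k / s) * targets * p_minus.pow(gamma_neg) * torch.log(p_plus + eps)
        # ((1-k)/s) * (1-y) * p_+^{gamma^+} * log(p_-) : negative-label term of the formula above
        neg_term = ((1.0 - k) / s) * (1.0 - targets) * p_plus.pow(gamma_pos) * torch.log(p_minus + eps)
        # minus sign so that minimizing pushes p_+ up for positive labels and down for negative ones
        return -(pos_term + neg_term).sum(dim=1).mean()

With multi-hot ``targets`` and per-class cosine similarities from a normalized classifier head, each class behaves as one of the small binary sub-tasks described in the section above.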
 Please, refer to our :doc:`dedicated tutorial <../../../tutorials/base/how_to_train/classification>` for more information how to train, validate and optimize classification models.
 
-.. note::
-    For now, "___" is a symbol to distinguish the multi-label format. So, it must be included at the front of the label name.
 
 ******
 Models
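Both rewritten ``Learning rate schedule`` bullets in this patch now point to ``ReduceLROnPlateau``. Below is a minimal PyTorch sketch of how such a scheduler is typically driven by a validation metric; the toy model, ``factor``, and ``patience`` values are placeholders rather than OTX recipe defaults.

.. code-block:: python

    import torch

    # Toy model/optimizer pair, only to show the scheduler wiring.
    model = torch.nn.Linear(16, 4)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    # ReduceLROnPlateau lowers the learning rate once the monitored metric stops improving.
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
        optimizer, mode="min", factor=0.1, patience=3
    )

    for epoch in range(15):
        val_loss = max(0.2, 1.0 - 0.1 * epoch)  # stand-in for a validation loss that plateaus
        scheduler.step(val_loss)                # step once per epoch with the metric, not per batch
        print(epoch, optimizer.param_groups[0]["lr"])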