Merge pull request opencv#95 from Satgoy152:adding-doc
Improved help messages for demo programs (opencv#95)
- Added Demo Documentation
- Updated help messages
- Changed exception link
Satgoy152 authored Nov 27, 2022
1 parent 0b263e4 commit f7c2881
Showing 26 changed files with 184 additions and 122 deletions.
20 changes: 12 additions & 8 deletions README.md
@@ -3,19 +3,20 @@
A zoo for models tuned for OpenCV DNN with benchmarks on different platforms.

Guidelines:

- Clone this repo to download all models and demo scripts:
```shell
# Install git-lfs from https://git-lfs.github.com/
git clone https://github.com/opencv/opencv_zoo && cd opencv_zoo
git lfs install
git lfs pull
```
- To run benchmarks on your hardware settings, please refer to [benchmark/README](./benchmark/README.md).

## Models & Benchmark Results

| Model | Task | Input Size | INTEL-CPU (ms) | RPI-CPU (ms) | JETSON-GPU (ms) | KV3-NPU (ms) | D1-CPU (ms) |
| ---------------------------------------------------- | ----------------------------- | ---------- | -------------- | ------------ | --------------- | ------------ | ----------- |
| [YuNet](./models/face_detection_yunet) | Face Detection | 160x120 | 1.45 | 6.22 | 12.18 | 4.04 | 86.69 |
| [SFace](./models/face_recognition_sface) | Face Recognition | 112x112 | 8.65 | 99.20 | 24.88 | 46.25 | --- |
| [LPD-YuNet](./models/license_plate_detection_yunet/) | License Plate Detection | 320x240 | --- | 168.03 | 56.12 | 154.20\* | --- |
@@ -36,13 +37,15 @@ Guidelines:
\*: Models are quantized in per-channel mode; per-channel quantized models run slower than per-tensor quantized models on NPU.

Hardware Setup:

- `INTEL-CPU`: [Intel Core i7-5930K](https://www.intel.com/content/www/us/en/products/sku/82931/intel-core-i75930k-processor-15m-cache-up-to-3-70-ghz/specifications.html) @ 3.50GHz, 6 cores, 12 threads.
- `RPI-CPU`: [Raspberry Pi 4B](https://www.raspberrypi.com/products/raspberry-pi-4-model-b/specifications/), Broadcom BCM2711, Quad core Cortex-A72 (ARM v8) 64-bit SoC @ 1.5GHz.
- `JETSON-GPU`: [NVIDIA Jetson Nano B01](https://developer.nvidia.com/embedded/jetson-nano-developer-kit), 128-core NVIDIA Maxwell GPU.
- `KV3-NPU`: [Khadas VIM3](https://www.khadas.com/vim3), 5 TOPS NPU performance. Benchmarks are done using **quantized** models. You will need to compile OpenCV with TIM-VX following [this guide](https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU) to run the benchmarks. The reported results use the `per-tensor` quantized models by default.
- `D1-CPU`: [Allwinner D1](https://d1.docs.aw-ol.com/en), [Xuantie C906 CPU](https://www.t-head.cn/product/C906?spm=a2ouz.12986968.0.0.7bfc1384auGNPZ) (RISC-V, RVV 0.7.1) @ 1.0GHz, 1 core. YuNet is supported for now. Visit [here](https://github.com/fengyuentau/opencv_zoo_cpp) for more details.

***Important Notes***:

- The data under each hardware column in the table above is the elapsed time of a single inference (preprocessing, forward pass and postprocessing), in milliseconds.
- The reported time is the median of 10 runs taken after several warm-up runs (a minimal timing sketch follows this list). Different metrics may apply to some specific models.
- Batch size is 1 for all benchmark results.
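
As a rough illustration of the timing methodology above, the sketch below assumes a hypothetical `infer` callable that wraps preprocess, forward and postprocess; it is a minimal outline, not the code shipped in [benchmark](./benchmark):

```python
import statistics
import time

def median_inference_time(infer, image, warmup=3, runs=10):
    """Return the median latency of `infer(image)` in milliseconds.

    `infer` is assumed to cover preprocess + forward + postprocess,
    matching how the table above defines one inference.
    """
    for _ in range(warmup):  # warm-up runs are discarded
        infer(image)
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        infer(image)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)
```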
@@ -52,6 +55,7 @@ Hardware Setup:
## Some Examples

Some examples are listed below. You can find more in the directory of each model!

### Face Detection with [YuNet](./models/face_detection_yunet/)

![largest selfie](./models/face_detection_yunet/examples/largest_selfie.jpg)
9 changes: 7 additions & 2 deletions models/face_detection_yunet/README.md
@@ -3,14 +3,15 @@
YuNet is a light-weight, fast and accurate face detection model that achieves 0.834 (AP_easy), 0.824 (AP_medium) and 0.708 (AP_hard) on the WIDER Face validation set.

Notes:

- Model source: [here](https://github.com/ShiqiYu/libfacedetection.train/blob/a61a428929148171b488f024b5d6774f93cdbc13/tasks/task1/onnx/yunet.onnx).
- For details on training this model, please visit https://github.com/ShiqiYu/libfacedetection.train.
- This ONNX model has a fixed input shape, but OpenCV DNN infers on the exact shape of the input image. See https://github.com/opencv/opencv_zoo/issues/44 for more information.

Results of accuracy evaluation with [tools/eval](../../tools/eval).

| Models | Easy AP | Medium AP | Hard AP |
| ----------- | ------- | --------- | ------- |
| YuNet | 0.8498 | 0.8384 | 0.7357 |
| YuNet quant | 0.7751 | 0.8145 | 0.7312 |

@@ -19,11 +20,15 @@ Results of accuracy evaluation with [tools/eval](../../tools/eval).
## Demo

Run the following command to try the demo:

```shell
# detect on camera input
python demo.py
# detect on an image
python demo.py --input /path/to/image

# get help regarding various parameters
python demo.py --help
```
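
The demo builds on OpenCV's `FaceDetectorYN` API (OpenCV >= 4.5.4). Below is a minimal sketch of image-mode detection; it mirrors the demo's default model file and thresholds but is an assumed outline, not the demo script itself:

```python
import cv2 as cv

# Default model file shipped in this directory.
detector = cv.FaceDetectorYN.create(
    'face_detection_yunet_2022mar.onnx', '', (320, 320),
    score_threshold=0.9,  # role of --conf_threshold
    nms_threshold=0.3,    # role of --nms_threshold
    top_k=5000)           # role of --top_k

image = cv.imread('/path/to/image')
# YuNet infers on the exact input size, so set it from the image.
detector.setInputSize((image.shape[1], image.shape[0]))
_, faces = detector.detect(image)  # faces: Nx15 array, or None if no face
if faces is not None:
    for face in faces:
        x, y, w, h = face[:4].astype(int)
        cv.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
```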

### Example outputs
18 changes: 9 additions & 9 deletions models/face_detection_yunet/demo.py
@@ -22,25 +22,25 @@ def str2bool(v):
backends = [cv.dnn.DNN_BACKEND_OPENCV, cv.dnn.DNN_BACKEND_CUDA]
targets = [cv.dnn.DNN_TARGET_CPU, cv.dnn.DNN_TARGET_CUDA, cv.dnn.DNN_TARGET_CUDA_FP16]
help_msg_backends = "Choose one of the computation backends: {:d}: OpenCV implementation (default); {:d}: CUDA"
help_msg_targets = "Chose one of the target computation devices: {:d}: CPU (default); {:d}: CUDA; {:d}: CUDA fp16"
help_msg_targets = "Choose one of the target computation devices: {:d}: CPU (default); {:d}: CUDA; {:d}: CUDA fp16"
try:
backends += [cv.dnn.DNN_BACKEND_TIMVX]
targets += [cv.dnn.DNN_TARGET_NPU]
help_msg_backends += "; {:d}: TIMVX"
help_msg_targets += "; {:d}: NPU"
except:
print('This version of OpenCV does not support TIM-VX and NPU. Visit https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more information.')

parser = argparse.ArgumentParser(description='YuNet: A Fast and Accurate CNN-based Face Detector (https://github.com/ShiqiYu/libfacedetection).')
parser.add_argument('--input', '-i', type=str, help='Usage: Set input to a certain image, omit if using camera.')
parser.add_argument('--model', '-m', type=str, default='face_detection_yunet_2022mar.onnx', help="Usage: Set model type, defaults to 'face_detection_yunet_2022mar.onnx'.")
parser.add_argument('--backend', '-b', type=int, default=backends[0], help=help_msg_backends.format(*backends))
parser.add_argument('--target', '-t', type=int, default=targets[0], help=help_msg_targets.format(*targets))
parser.add_argument('--conf_threshold', type=float, default=0.9, help='Usage: Set the minimum confidence for the model to identify a face, defaults to 0.9. Faces with confidence < conf_threshold are filtered out; smaller values keep more low-confidence detections.')
parser.add_argument('--nms_threshold', type=float, default=0.3, help='Usage: Suppress bounding boxes of iou >= nms_threshold. Default = 0.3.')
parser.add_argument('--top_k', type=int, default=5000, help='Usage: Keep top_k bounding boxes before NMS.')
parser.add_argument('--save', '-s', type=str, default=False, help='Usage: Set “True” to save file with results (i.e. bounding box, confidence level). Invalid in case of camera input. Default will be set to “False”.')
parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Usage: Default will be set to “True” and will open a new window to show results. Set to “False” to stop visualizations from being shown. Invalid in case of camera input.')
args = parser.parse_args()

def visualize(image, results, box_color=(0, 255, 0), text_color=(0, 0, 255), fps=None):
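
The hunk headers above reference a `str2bool` helper that the diff never shows. A plausible minimal implementation of such an argparse type (an assumption, not this repo's verbatim code) would be:

```python
import argparse

def str2bool(v):
    # Map common spellings of true/false onto bool for argparse flags.
    if v.lower() in ('on', 'yes', 'true', 'y', 't'):
        return True
    elif v.lower() in ('off', 'no', 'false', 'n', 'f'):
        return False
    raise argparse.ArgumentTypeError('Boolean value expected, got {!r}'.format(v))
```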
13 changes: 8 additions & 5 deletions models/face_recognition_sface/README.md
@@ -3,30 +3,33 @@
SFace: Sigmoid-Constrained Hypersphere Loss for Robust Face Recognition

Note:

- SFace is contributed by [Yaoyao Zhong](https://github.com/zhongyy/SFace).
- [face_recognition_sface_2021sep.onnx](./face_recognition_sface_2021sep.onnx) is converted from the model at https://github.com/zhongyy/SFace, thanks to [Chengrui Wang](https://github.com/crywang).
- Supports 5-landmark warping for now (2021sep).

Results of accuracy evaluation with [tools/eval](../../tools/eval).

| Models | Accuracy |
| ----------- | -------- |
| SFace | 0.9940 |
| SFace quant | 0.9932 |

\*: 'quant' stands for 'quantized'.


## Demo

***NOTE***: This demo uses [../face_detection_yunet](../face_detection_yunet) as the face detector, which supports 5-landmark detection for now (2021sep).

Run the following command to try the demo:

```shell
# recognize on images
python demo.py --input1 /path/to/image1 --input2 /path/to/image2

# get help regarding various parameters
python demo.py --help
```
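
Under the hood, feature extraction and matching can be done with OpenCV's `FaceRecognizerSF` API. The sketch below is an assumed outline: the model file name matches this directory's default, but the 0.363 cosine threshold comes from OpenCV's face-recognition tutorial, not from this repo:

```python
import cv2 as cv

recognizer = cv.FaceRecognizerSF.create('face_recognition_sface_2021dec.onnx', '')

def embed(image, face_box):
    # Warp the face using its 5 landmarks, then extract a 128-d feature.
    aligned = recognizer.alignCrop(image, face_box)
    return recognizer.feature(aligned)

# feat1, feat2 come from two images (face_box rows produced by YuNet):
# score = recognizer.match(feat1, feat2, cv.FaceRecognizerSF_FR_COSINE)
# same_identity = score >= 0.363  # assumed threshold from the OpenCV tutorial
```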

## License

@@ -35,4 +38,4 @@
All files in this directory are licensed under [Apache 2.0 License](./LICENSE).
## Reference

- https://ieeexplore.ieee.org/document/9318547
- https://github.com/zhongyy/SFace
16 changes: 8 additions & 8 deletions models/face_recognition_sface/demo.py
@@ -25,26 +25,26 @@ def str2bool(v):

backends = [cv.dnn.DNN_BACKEND_OPENCV, cv.dnn.DNN_BACKEND_CUDA]
targets = [cv.dnn.DNN_TARGET_CPU, cv.dnn.DNN_TARGET_CUDA, cv.dnn.DNN_TARGET_CUDA_FP16]
help_msg_backends = "Choose one of the computation backends: {:d}: OpenCV implementation (default); {:d}: CUDA. \n Usage: Set the DNN backend, defaults to cv.dnn.DNN_BACKEND_OPENCV (int = 0). Depending on your OpenCV version, cv.dnn.DNN_BACKEND_TIMVX may or may not be supported. More details: [https://gist.github.com/fengyuentau/5a7a5ba36328f2b763aea026c43fa45f]"
help_msg_targets = "Choose one of the target computation devices: {:d}: CPU (default); {:d}: CUDA; {:d}: CUDA fp16"
try:
backends += [cv.dnn.DNN_BACKEND_TIMVX]
targets += [cv.dnn.DNN_TARGET_NPU]
help_msg_backends += "; {:d}: TIMVX"
help_msg_targets += "; {:d}: NPU"
except:
print('This version of OpenCV does not support TIM-VX and NPU. Visit https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more information.')

parser = argparse.ArgumentParser(
description="SFace: Sigmoid-Constrained Hypersphere Loss for Robust Face Recognition (https://ieeexplore.ieee.org/document/9318547)")
parser.add_argument('--input1', '-i1', type=str, help='Usage: Set path to the input image 1 (original face).')
parser.add_argument('--input2', '-i2', type=str, help='Usage: Set path to the input image 2 (comparison face).')
parser.add_argument('--model', '-m', type=str, default='face_recognition_sface_2021dec.onnx', help='Usage: Set model path, defaults to face_recognition_sface_2021dec.onnx.')
parser.add_argument('--backend', '-b', type=int, default=backends[0], help=help_msg_backends.format(*backends))
parser.add_argument('--target', '-t', type=int, default=targets[0], help=help_msg_targets.format(*targets))
parser.add_argument('--dis_type', type=int, choices=[0, 1], default=0, help='Usage: Distance type. \'0\': cosine, \'1\': norm_l1. Defaults to \'0\'.')
parser.add_argument('--save', '-s', type=str, default=False, help='Usage: Set “True” to save file with results (i.e. bounding box, confidence level). Invalid in case of camera input. Default will be set to “False”.')
parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Usage: Default will be set to “True” and will open a new window to show results. Set to “False” to stop visualizations from being shown. Invalid in case of camera input.')
args = parser.parse_args()

if __name__ == '__main__':
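
For the `--dis_type` flag above, a hedged numpy sketch of the two comparisons follows; the decision thresholds are illustrative assumptions, not values from this repo:

```python
import numpy as np

def is_same_identity(feat1, feat2, dis_type=0):
    """dis_type 0: cosine similarity (higher = more similar);
    dis_type 1: norm-L1 distance (lower = more similar)."""
    feat1, feat2 = np.ravel(feat1), np.ravel(feat2)
    if dis_type == 0:
        score = feat1.dot(feat2) / (np.linalg.norm(feat1) * np.linalg.norm(feat2))
        return score >= 0.363  # assumed cosine threshold
    dist = np.abs(feat1 - feat2).sum()
    return dist <= 27.0        # assumed, purely illustrative norm-L1 threshold
```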
2 changes: 1 addition & 1 deletion models/handpose_estimation_mediapipe/demo.py
@@ -27,7 +27,7 @@ def str2bool(v):
help_msg_backends += "; {:d}: TIMVX"
help_msg_targets += "; {:d}: NPU"
except:
print('This version of OpenCV does not support TIM-VX and NPU. Visit https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more information.')

parser = argparse.ArgumentParser(description='Hand Pose Estimation from MediaPipe')
parser.add_argument('--input', '-i', type=str, help='Path to the input image. Omit for using default camera.')
8 changes: 6 additions & 2 deletions models/human_segmentation_pphumanseg/README.md
@@ -5,14 +5,18 @@ This model is ported from [PaddleHub](https://github.com/PaddlePaddle/PaddleHub)
## Demo

Run the following command to try the demo:

```shell
# detect on camera input
python demo.py
# detect on an image
python demo.py --input /path/to/image

# get help regarding various parameters
python demo.py --help
```

### Example outputs

![webcam demo](./examples/pphumanseg_demo.gif)

@@ -26,4 +30,4 @@

- https://arxiv.org/abs/1512.03385
- https://github.com/opencv/opencv/tree/master/samples/dnn/dnn_model_runner/dnn_conversion/paddlepaddle
- https://github.com/PaddlePaddle/PaddleHub
10 changes: 5 additions & 5 deletions models/human_segmentation_pphumanseg/demo.py
@@ -29,15 +29,15 @@ def str2bool(v):
help_msg_backends += "; {:d}: TIMVX"
help_msg_targets += "; {:d}: NPU"
except:
print('This version of OpenCV does not support TIM-VX and NPU. Visit https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more information.')

parser = argparse.ArgumentParser(description='PPHumanSeg (https://github.com/PaddlePaddle/PaddleSeg/tree/release/2.2/contrib/PP-HumanSeg)')
parser.add_argument('--input', '-i', type=str, help='Usage: Set input path to a certain image, omit if using camera.')
parser.add_argument('--model', '-m', type=str, default='human_segmentation_pphumanseg_2021oct.onnx', help='Usage: Set model path, defaults to human_segmentation_pphumanseg_2021oct.onnx.')
parser.add_argument('--backend', '-b', type=int, default=backends[0], help=help_msg_backends.format(*backends))
parser.add_argument('--target', '-t', type=int, default=targets[0], help=help_msg_targets.format(*targets))
parser.add_argument('--save', '-s', type=str, default=False, help='Usage: Set “True” to save a file with results. Invalid in case of camera input. Default will be set to “False”.')
parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Usage: Default will be set to “True” and will open a new window to show results. Set to “False” to stop visualizations from being shown. Invalid in case of camera input.')
args = parser.parse_args()

def get_color_map_list(num_classes):
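
The diff truncates `get_color_map_list`. In PaddleSeg-derived code this helper conventionally builds the PASCAL VOC label palette by bit-twiddling the class index; the sketch below assumes that convention:

```python
def get_color_map_list(num_classes):
    """Return a flat [r0, g0, b0, r1, g1, b1, ...] palette with one
    RGB triple per class, following the PASCAL VOC colormap scheme."""
    color_map = num_classes * [0, 0, 0]
    for i in range(num_classes):
        j, lab = 0, i
        while lab:
            # Spread the bits of the label index across the three channels.
            color_map[i * 3] |= ((lab >> 0) & 1) << (7 - j)
            color_map[i * 3 + 1] |= ((lab >> 1) & 1) << (7 - j)
            color_map[i * 3 + 2] |= ((lab >> 2) & 1) << (7 - j)
            j += 1
            lab >>= 1
    return color_map
```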
17 changes: 10 additions & 7 deletions models/image_classification_mobilenet/README.md
@@ -6,23 +6,27 @@ MobileNetV2: Inverted Residuals and Linear Bottlenecks

Results of accuracy evaluation with [tools/eval](../../tools/eval).

| Models | Top-1 Accuracy | Top-5 Accuracy |
| ------------------ | -------------- | -------------- |
| MobileNet V1 | 67.64 | 87.97 |
| MobileNet V1 quant | 55.53 | 78.74 |
| MobileNet V2 | 69.44 | 89.23 |
| MobileNet V2 quant | 68.37 | 88.56 |

\*: 'quant' stands for 'quantized'.

## Demo

Run the following command to try the demo:

```shell
# MobileNet V1
python demo.py --input /path/to/image
# MobileNet V2
python demo.py --input /path/to/image --model v2

# get help regarding various parameters
python demo.py --help
```
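
Classification demos like this one typically reduce the network's output to top-5 labels with a softmax; here is a hedged sketch of that postprocessing step (the actual demo code may differ):

```python
import numpy as np

def top5(logits, labels):
    # Numerically stable softmax over the class scores.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    # Indices of the five most probable classes, best first.
    order = np.argsort(probs)[::-1][:5]
    return [(labels[i], float(probs[i])) for i in order]
```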

## License
@@ -35,4 +39,3 @@
All files in this directory are licensed under [Apache 2.0 License](./LICENSE).
- MobileNet V2: https://arxiv.org/abs/1801.04381
- MobileNet V1 weight and scripts for training: https://github.com/wjc852456/pytorch-mobilenet-v1
- MobileNet V2 weight: https://github.com/onnx/models/tree/main/vision/classification/mobilenet
