A High-Efficient Development Toolkit for Image Segmentation Based on PaddlePaddle.

News

[2023-10-29] 🔥 PaddleSeg v2.9 is released! Check more details in Release Notes.

Support multi-label segmentation across a series of semantic segmentation models.
Release Mobile SAM, a faster version of the Segment Anything Model.
Support Quant Aware Distillation Training Compression for PP-LiteSeg, PP-MobileSeg, OCRNet, and SegFormer-B0 to improve model inference speed.

[2023-04-11] PaddleSeg v2.8 released the Segment Anything Model; PP-MobileSeg, an original lightweight semantic segmentation model for mobile devices; QualityInspector v0.5, a full-process solution for industrial quality inspection; and PanopticSeg v0.5, a universal panoptic segmentation solution.
[2022-11-30] PaddleSeg v2.7 released a real-time human matting model PP-MattingV2, a 3D medical image segmentation solution MedicalSegV2, and a real-time semantic segmentation model RTFormer.
[2022-07-20] PaddleSeg v2.6 released a real-time human segmentation SOTA solution PP-HumanSegV2, a stable-version semi-automatic segmentation annotation tool EISeg v1.0, a pseudo label pre-training method PSSL, and the source code of PP-MattingV1.
[2022-04-20] PaddleSeg v2.5 released a real-time semantic segmentation model PP-LiteSeg, a trimap-free image matting model PP-MattingV1, and an easy-to-use solution for 3D medical image segmentation MedicalSegV1.
[2022-01-20] PaddleSeg v2.4 released EISeg v0.4 and PP-HumanSegV1, which includes the open-source dataset PP-HumanSeg14K.

Introduction

PaddleSeg is an end-to-end, high-efficiency development toolkit for image segmentation based on PaddlePaddle. It helps both developers and researchers through the whole process of designing segmentation models, training them, optimizing accuracy and inference speed, and deploying them. A large collection of well-trained models and real-world applications from both industry and academia helps users gain hands-on experience in image segmentation.

Features

High-Performance Model: Following state-of-the-art segmentation methods and using high-performance backbone networks, we provide 45+ models and 150+ high-quality pre-trained models, which are better than other open-source implementations.
High Efficiency: PaddleSeg provides multi-process asynchronous I/O, multi-card parallel training and evaluation, and other acceleration strategies. Combined with PaddlePaddle's memory optimization, these greatly reduce the training overhead of segmentation models, allowing developers to train them more efficiently and at a lower cost.
Modular Design: We build PaddleSeg with the modular design philosophy. Therefore, based on actual application scenarios, developers can assemble diversified training configurations with data augmentation strategies, segmentation models, backbone networks, loss functions, and other different components to meet different performance and accuracy requirements.
Complete Flow: PaddleSeg supports image labeling, model designing, model training, model compression, and model deployment. With the help of PaddleSeg, developers can easily finish all tasks in the entire workflow.
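
As a rough illustration of the modular design idea described above, the sketch below assembles components from a configuration dictionary through a small registry. Everything in it (`REGISTRY`, `register`, `build`, `ToySegNet`, `ToyCrossEntropy`) is hypothetical and written only for this example; it is not PaddleSeg's actual API or config schema.

```python
from typing import Callable, Dict

# Hypothetical registry for illustration only; not PaddleSeg's implementation.
REGISTRY: Dict[str, Dict[str, Callable]] = {"model": {}, "loss": {}}


def register(kind: str, name: str):
    """Decorator that stores a factory function under REGISTRY[kind][name]."""
    def decorator(factory: Callable) -> Callable:
        REGISTRY[kind][name] = factory
        return factory
    return decorator


@register("model", "ToySegNet")
def build_toy_model(num_classes: int, backbone: str = "ToyBackbone") -> dict:
    # A real factory would return a network object; a dict keeps this runnable.
    return {"type": "ToySegNet", "num_classes": num_classes, "backbone": backbone}


@register("loss", "ToyCrossEntropy")
def build_toy_loss(ignore_index: int = 255) -> dict:
    return {"type": "ToyCrossEntropy", "ignore_index": ignore_index}


def build(kind: str, cfg: dict) -> dict:
    """Look up the factory named by cfg['type'] and call it with the other keys."""
    cfg = dict(cfg)  # copy so the caller's config is not modified
    name = cfg.pop("type")
    return REGISTRY[kind][name](**cfg)


# Swapping a component means changing one entry in the config, not the code.
config = {
    "model": {"type": "ToySegNet", "num_classes": 19, "backbone": "ToyBackbone"},
    "loss": {"type": "ToyCrossEntropy", "ignore_index": 255},
}
components = {kind: build(kind, cfg) for kind, cfg in config.items()}
print(components)
```

The point of the pattern is that models, backbones, and losses are selected by name from a configuration, so different combinations can be tried without touching training code.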

Community

If you have any questions, suggestions or feature requests, please do not hesitate to create an issue in GitHub Issues.
Please scan the following QR code to join the PaddleSeg WeChat group and communicate with us:

Overview

Models

Semantic Segmentation: PP-LiteSeg, PP-MobileSeg, DeepLabV3P, OCRNet, MobileSeg, ANN, Att U-Net, BiSeNetV1, BiSeNetV2, CCNet, DANet, DDRNet, DecoupledSeg, DeepLabV3, DMNet, DNLNet, EMANet, ENCNet, ENet, ESPNetV1, ESPNetV2, FastFCN, Fast-SCNN, GCNet, GINet, GloRe, GSCNN, HarDNet, HRNet-FCN, HRNet-Contrast, ISANet, PFPNNet, PointRend, PotraitNet, PP-HumanSeg-Lite, PSPNet, PSSL, SegFormer, SegMenter, SegNet, SETR, SFNet, STDCSeg, U²Net, UNet, UNet++, UNet3+, UperNet, RTFormer, UHRNet, TopFormer, MscaleOCRNet-PSA, CAE, MaskFormer, ViT-Adapter, HRFormer, LPSNet, SegNeXt, K-Net
Interactive Segmentation: EISeg, RITM, EdgeFlow
Image Matting: PP-MattingV2, PP-MattingV1, DIM, MODNet, PP-HumanMatting, RVM
Panoptic Segmentation: Mask2Former, Panoptic-DeepLab

Components

Backbones: HRNet, ResNet, STDCNet, MobileNetV2, MobileNetV3, ShuffleNetV2, GhostNet, LiteHRNet, XCeption, VIT, MixVIT, Swin Transformer, TopTransformer, HRTransformer, MSCAN
Losses: Binary CE Loss, Bootstrapped CE Loss, Cross Entropy Loss, Relax Boundary Loss, Detail Aggregate Loss, Dice Loss, Edge Attention Loss, Focal Loss, MultiClassFocal Loss, GSCNN Dual Task Loss, KL Loss, L1 Loss, Lovasz Loss, MSE Loss, OHEM CE Loss, Pixel Contrast CE Loss, Point CE Loss, RMI Loss, Connectivity Loss
Metrics: mIoU, Accuracy, Kappa, Dice, AUC_ROC
Datasets: ADE20K, Cityscapes, COCO Stuff, Pascal VOC, EG1800, Pascal Context, SUPERVISELY, OPTIC DISC SEG, CHASE_DB1, HRF, DRIVE, STARE, PP-HumanSeg14K, PSSL
Data Augmentation: Flipping, Resize, ResizeByLong, ResizeByShort, LimitLong, ResizeRangeScaling, ResizeStepScaling, Normalize, Padding, PaddingByAspectRatio, RandomPaddingCrop, RandomCenterCrop, ScalePadding, RandomNoise, RandomBlur, RandomRotation, RandomScaleAspect, RandomDistort, RandomAffine

Special Cases

Segment Anything: SegmentAnything
Model Selection Tool: PaddleSMRT
Human Segmentation: PP-HumanSegV1, PP-HumanSegV2
MedicalSeg: VNet, UNETR, nnFormer, nnUNet-D, TransUNet, SwinUNet
Cityscapes SOTA Model: HMSA
CVPR Champion Model: MLA Transformer
Domain Adaptation: PixMatch
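
Of the metrics listed in the overview, mIoU is the one reported in the benchmark tables of this README. The sketch below shows one common way to compute it from flattened predictions and labels. It is a minimal pure-Python illustration, not PaddleSeg's metric code: the function name `mean_iou`, the `ignore_index=255` default, and the choice to skip classes that appear in neither prediction nor label are all assumptions of this example.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], label: Sequence[int],
             num_classes: int, ignore_index: int = 255) -> float:
    """Mean intersection-over-union over classes present in pred or label."""
    intersect = [0] * num_classes
    pred_area = [0] * num_classes
    label_area = [0] * num_classes

    for p, t in zip(pred, label):
        if t == ignore_index:
            continue  # pixels marked as "ignore" do not count
        pred_area[p] += 1
        label_area[t] += 1
        if p == t:
            intersect[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_area[c] + label_area[c] - intersect[c]
        if union > 0:  # assumption: absent classes are left out of the mean
            ious.append(intersect[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


# Four pixels, two classes: class 0 has IoU 1/2, class 1 has IoU 2/3.
score = mean_iou(pred=[0, 0, 1, 1], label=[0, 1, 1, 1], num_classes=2)
print(f"mIoU = {score:.4f}")
```

Other conventions exist, for example counting absent classes as IoU 0, so numbers from this sketch are not guaranteed to match a particular toolkit's output.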

Industrial Segmentation Models

High Accuracy Semantic Segmentation Models

These models achieve high accuracy at the cost of longer inference time, so they are designed for GPU and Jetson devices.

Model	Backbone	Cityscapes mIoU(%)	V100 TRT Inference Speed(FPS)	Config File
FCN	HRNet_W18	78.97	24.43	yml
FCN	HRNet_W48	80.70	10.16	yml
DeepLabV3	ResNet50_OS8	79.90	4.56	yml
DeepLabV3	ResNet101_OS8	80.85	3.2	yml
DeepLabV3	ResNet50_OS8	80.36	6.58	yml
DeepLabV3	ResNet101_OS8	81.10	3.94	yml
OCRNet 🌟	HRNet_w18	80.67	13.26	yml
OCRNet	HRNet_w48	82.15	6.17	yml
CCNet	ResNet101_OS8	80.95	3.24	yml

Note that:

We test the inference speed on Nvidia GPU V100. We use PaddleInference Python API with TensorRT enabled. The data type is FP32, and the shape of input tensor is 1x3x1024x2048.
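
For readers who want to reproduce throughput numbers of this kind, the following is a generic timing harness. It is a sketch and not the benchmark script behind the table: `measure_fps` and `fake_inference` are hypothetical names, and the stand-in workload would have to be replaced by a real predictor call on a 1x3x1024x2048 input.

```python
import time
from typing import Callable


def measure_fps(run_once: Callable[[], None], warmup: int = 10, iters: int = 50) -> float:
    """Return frames per second for run_once, excluding warm-up iterations."""
    for _ in range(warmup):
        run_once()  # warm-up runs let caches and lazy initialization settle

    start = time.perf_counter()
    for _ in range(iters):
        run_once()
    elapsed = time.perf_counter() - start
    return iters / elapsed


def fake_inference() -> None:
    # Stand-in workload for illustration; replace with a real model call.
    sum(i * i for i in range(10_000))


fps = measure_fps(fake_inference)
print(f"{fps:.1f} FPS")
```

When timing GPU inference, the device generally has to be synchronized before reading the clock; otherwise only the asynchronous kernel launch is measured.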

Lightweight Semantic Segmentation Models

These models balance segmentation accuracy and inference speed. They can be deployed on GPU, X86 CPU and ARM CPU.

Model	Backbone	Cityscapes mIoU(%)	V100 TRT Inference Speed(FPS)	Snapdragon 855 Inference Speed(FPS)	Config File
PP-LiteSeg 🌟	STDC1	77.04	69.82	17.22	yml
PP-LiteSeg 🌟	STDC2	79.04	54.53	11.75	yml
BiSeNetV1	-	75.19	14.67	1.53	yml
BiSeNetV2	-	73.19	61.83	13.67	yml
STDCSeg	STDC1	74.74	62.24	14.51	yml
STDCSeg	STDC2	77.60	51.15	10.95	yml
DDRNet_23	-	79.85	42.64	7.68	yml
HarDNet	-	79.03	30.3	5.44	yml
SFNet	ResNet18_OS8	78.72	10.72	-	yml

Note that:

We test the inference speed on Nvidia GPU V100. We use PaddleInference Python API with TensorRT enabled. The data type is FP32, and the shape of input tensor is 1x3x1024x2048.
We test the inference speed on Snapdragon 855. We use PaddleLite CPP API with 1 thread, and the shape of input tensor is 1x3x256x256.
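
The two notes above use different input shapes, so the V100 and Snapdragon 855 columns are not directly comparable. The short calculation below makes the size difference explicit; the helper name `num_elements` is introduced only for this example.

```python
from math import prod


def num_elements(shape: tuple) -> int:
    """Total number of values in a tensor of the given shape."""
    return prod(shape)


gpu_shape = (1, 3, 1024, 2048)    # input shape used for the V100 measurements
mobile_shape = (1, 3, 256, 256)   # input shape used for the Snapdragon 855 measurements

ratio = num_elements(gpu_shape) / num_elements(mobile_shape)
print(f"The GPU input is {ratio:.0f}x larger than the mobile input")
```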

Super Lightweight Semantic Segmentation Models

These super lightweight semantic segmentation models are designed for X86 CPU and ARM CPU.

Model	Backbone	ADE20K mIoU(%)	Snapdragon 855 Inference Latency(ms)	Params(M)	Links
TopFormer-Base	TopTransformer-Base	38.28	480.6	5.13	config
PP-MobileSeg-Base 🌟	StrideFormer-Base	41.57	265.5	5.62	config
TopFormer-Tiny	TopTransformer-Tiny	32.46	490.3	1.41	config
PP-MobileSeg-Tiny 🌟	StrideFormer-Tiny	36.39	215.3	1.61	config

Note that:

We test the inference speed on Snapdragon 855. We use PaddleLite CPP API with 1 thread, and the shape of input tensor is 1x3x512x512. We test the latency with the final argmax operator on.
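
The table above reports latency in milliseconds, while the other tables in this section report FPS. For single-image inference the two are reciprocal, as the sketch below shows using the latency values from the table; `latency_ms_to_fps` is a helper name made up for this example.

```python
def latency_ms_to_fps(latency_ms: float) -> float:
    """Convert per-image latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms


# Latency values copied from the ADE20K table above.
latencies_ms = {
    "TopFormer-Base": 480.6,
    "PP-MobileSeg-Base": 265.5,
    "TopFormer-Tiny": 490.3,
    "PP-MobileSeg-Tiny": 215.3,
}

for name, ms in latencies_ms.items():
    print(f"{name}: {ms} ms per image is about {latency_ms_to_fps(ms):.2f} FPS")
```

Keep in mind that these latencies were measured on 512x512 inputs, so the converted FPS values should not be compared against FPS measured at other input sizes.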

Model	Backbone	Cityscapes mIoU(%)	V100 TRT Inference Speed(FPS)	Snapdragon 855 Inference Speed(FPS)	Config File
MobileSeg	MobileNetV2	73.94	67.57	27.01	yml
MobileSeg 🌟	MobileNetV3	73.47	67.39	32.90	yml
MobileSeg	Lite_HRNet_18	70.75	10.5	13.05	yml
MobileSeg	ShuffleNetV2_x1_0	69.46	37.09	39.61	yml
MobileSeg	GhostNet_x1_0	71.88	35.58	38.74	yml

Note that:

We test the inference speed on Nvidia GPU V100. We use PaddleInference Python API with TensorRT enabled. The data type is FP32, and the shape of input tensor is 1x3x1024x2048.
We test the inference speed on Snapdragon 855. We use PaddleLite CPP API with 1 thread, and the shape of input tensor is 1x3x256x256.

Tutorials

Introductory Tutorials

Installation
Quick Start
A 20-Minute Blitz to Learn PaddleSeg
Model Zoo

Basic Tutorials

Data Preparation
Config Preparation
Model Training
Model Evaluation
Model Prediction
Model Export
- Export Inference Model
- Export ONNX Model
Model Deployment

Advanced Tutorials

Training Tricks
Model Compression
FAQ

Welcome to Contribute

API Documentation
Advanced Development
- Detailed Configuration File
- Create Your Own Model
Pull Request
- PR Tutorial
- PR Style

Special Features

Interactive Segmentation
Image Matting
PP-HumanSeg
3D Medical Segmentation
Cityscapes SOTA
Panoptic Segmentation
CVPR Champion Solution
Domain Adaptation

Industrial Tutorial Examples

Using PP-HumanSegV2 for Human Segmentation
Using PP-HumanSegV1 for Human Segmentation
Using PP-LiteSeg for Road Segmentation
Using PaddleSeg for Face Parsing and Makeup
Using PaddleSeg for Mini-dataset Spine Segmentation
Using PaddleSeg for Lane Segmentation
PaddleSeg in APIs
Learn PaddleSeg in 10 Mins
Application of Interactive Segmentation Technology in Smart Mapping
Nail art preview machine based on PaddleSeg
Overrun monitoring of steel bar length based on PaddleSeg

For more examples, see here.

License

PaddleSeg is released under the Apache 2.0 license.

Acknowledgement

Thanks to jm12138 for contributing U²-Net.
Thanks to zjhellofss (Fu Shenshen) for contributing Attention U-Net and Dice Loss.
Thanks to liuguoyu666 and geoyee for contributing U-Net++ and U-Net3+.
Thanks to yazheng0307 (LIU Zheng) for contributing the quick-start document.
Thanks to CuberrChen for contributing STDC (rethink BiSeNet), PointRend and DetailAggregateLoss.
Thanks to stuartchen1949 for contributing SegNet.
Thanks to justld (Lang Du) for contributing UPerNet, DDRNet, CCNet, ESPNetV2, DMNet, ENCNet, HRNet_W48_Contrast, FastFCN, BiSeNetV1, SECrossEntropyLoss and PixelContrastCrossEntropyLoss.
Thanks to Herman-Hu-saber (Hu Huiming) for contributing ESPNetV2.
Thanks to zhangjin12138 for contributing RandomCenterCrop.
Thanks to simuler for contributing ESPNetV1.
Thanks to ETTR123 (Zhang Kai) for contributing ENet and PFPNNet.

Citation

If you find our project useful in your research, please consider citing:

@misc{liu2021paddleseg,
      title={PaddleSeg: A High-Efficient Development Toolkit for Image Segmentation},
      author={Yi Liu and Lutao Chu and Guowei Chen and Zewu Wu and Zeyu Chen and Baohua Lai and Yuying Hao},
      year={2021},
      eprint={2101.06175},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

@misc{paddleseg2019,
    title={PaddleSeg, End-to-end image segmentation kit based on PaddlePaddle},
    author={PaddlePaddle Contributors},
    howpublished = {\url{https://github.com/PaddlePaddle/PaddleSeg}},
    year={2019}
}
