
Commit 3a11f3d

Release v1.3.1 of NNCF on Github
1 parent 8c0647e commit 3a11f3d

File tree

627 files changed: +128738 -0 lines changed


.gitattributes

Lines changed: 1 addition & 0 deletions

*.png filter=lfs diff=lfs merge=lfs -text

.gitignore

Lines changed: 111 additions & 0 deletions

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
results.html
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# dotenv
.env

# virtualenv
.venv
venv/
ENV/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/

# PyCharm
.idea

# snapshots
*.tar

# object detection eval results
examples/object_detection/eval/

.pylintrc

Lines changed: 37 additions & 0 deletions

[MASTER]
disable = arguments-differ,
          cell-var-from-loop,
          fixme,
          global-statement,
          invalid-name,
          logging-format-interpolation,
          missing-docstring,
          no-self-use,
          not-callable,
          too-few-public-methods,
          too-many-arguments,
          too-many-instance-attributes,
          too-many-locals,
          unbalanced-tuple-unpacking,
          ungrouped-imports,
          unpacking-non-sequence,
          unused-argument,
          wrong-import-order,
          attribute-defined-outside-init,
          import-outside-toplevel

max-line-length = 120
ignore-docstrings = yes
ignored-modules = numpy,torch,cv2,openvino
extension-pkg-whitelist = torch,cv2

[SIMILARITIES]
ignore-imports = yes

[BASIC]
bad-functions = print
good-names = logger,fn

[DESIGN]
max-statements=60
max-branches=13

README.md

Lines changed: 141 additions & 0 deletions

# Neural Network Compression Framework (NNCF)

This module contains a PyTorch\*-based framework and samples for neural network compression. The framework is organized as a Python\* package that can be built and used standalone. The framework architecture is unified to make it easy to add different compression methods. The samples demonstrate the usage of compression algorithms on public models and datasets for three use cases: Image Classification, Object Detection, and Semantic Segmentation.

## Key Features

- Support for various compression algorithms, applied during a model fine-tuning process to achieve the best compression parameters and accuracy:
  - [Quantization](./docs/compression_algorithms/Quantization.md)
  - [Binarization](./docs/compression_algorithms/Binarization.md)
  - [Sparsity](./docs/compression_algorithms/Sparsity.md)
  - [Filter pruning](./docs/compression_algorithms/Pruning.md)
- Automatic, configurable model graph transformation to obtain the compressed model: the source model is wrapped by a custom class and additional compression-specific layers are inserted into the graph
- Common interface for compression methods
- GPU-accelerated layers for faster fine-tuning of compressed models
- Distributed training support
- Configuration file examples for each supported compression algorithm
- Git patches for prominent third-party repositories ([mmdetection](https://github.com/open-mmlab/mmdetection), [huggingface-transformers](https://github.com/huggingface/transformers)) demonstrating the process of integrating NNCF into custom training pipelines
- Export of compressed models to ONNX\* checkpoints ready for use with the [OpenVINO™ toolkit](https://github.com/opencv/dldt)

## Usage
NNCF is organized as a regular Python package that can be imported into your target training pipeline script.
The basic workflow is to load a JSON configuration file containing the NNCF-specific parameters that determine the compression to be applied to your model, and then to pass the model, along with the configuration file, to the `nncf.create_compressed_model` function.
This function returns a wrapped model ready for compression fine-tuning, and a handle to an object that lets you control the compression during the training process:

```python
import torch

import nncf
from nncf import create_compressed_model, Config as NNCFConfig
# Note: the import location of register_default_init_args may differ between
# NNCF versions; in some releases it is exposed from the top-level nncf package.
from nncf.initialization import register_default_init_args

# Instantiate your uncompressed model
from torchvision.models.resnet import resnet50
model = resnet50()

# Load a configuration file to specify compression
nncf_config = NNCFConfig.from_json("resnet50_int8.json")

# Provide data loaders for compression algorithm initialization, if necessary.
# loss_criterion and train_loader come from your own training pipeline.
nncf_config = register_default_init_args(nncf_config, loss_criterion, train_loader)

# Apply the specified compression algorithms to the model
comp_ctrl, compressed_model = create_compressed_model(model, nncf_config)

# Now use compressed_model as a usual torch.nn.Module
# to fine-tune compression parameters along with the model weights

# ... the rest of the usual PyTorch-powered training pipeline

# Export to ONNX or .pth when done fine-tuning
comp_ctrl.export_model("compressed_model.onnx")
torch.save(compressed_model.state_dict(), "compressed_model.pth")
```
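
The contents of the `resnet50_int8.json` file referenced above depend on the compression algorithms you select. As a rough, hypothetical illustration only (the authoritative schema is described in the algorithm documentation linked in the Key Features section), a minimal INT8 quantization config, written out from Python for convenience, might look like this:

```python
import json

# Assumed minimal config structure for illustration; consult the NNCF
# documentation for the exact set of supported keys and their meaning.
resnet50_int8 = {
    "input_info": {"sample_size": [1, 3, 224, 224]},
    "compression": {"algorithm": "quantization"}
}

with open("resnet50_int8.json", "w") as f:
    json.dump(resnet50_int8, f, indent=4)
```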

For a more detailed description of NNCF usage in your training code, see [Usage.md](./docs/Usage.md). For in-depth examples of NNCF integration, browse the [sample scripts](#model-compression-samples) code or the [example patches](#third-party-repository-integration) for third-party repositories.
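
As a purely illustrative sketch (not taken from this repository's samples), the `comp_ctrl` handle returned above can be used to drive the compression schedule and to add the compression-specific loss term during fine-tuning. Here, `num_epochs`, `train_loader`, `optimizer`, and `criterion` are assumed to come from your own pipeline; [Usage.md](./docs/Usage.md) documents the actual controller API:

```python
for epoch in range(num_epochs):
    # Advance the compression schedule once per epoch
    comp_ctrl.scheduler.epoch_step()
    for images, targets in train_loader:
        # Advance the compression schedule once per training iteration
        comp_ctrl.scheduler.step()
        optimizer.zero_grad()
        outputs = compressed_model(images)
        # Total loss = task loss + compression-specific loss term
        loss = criterion(outputs, targets) + comp_ctrl.loss()
        loss.backward()
        optimizer.step()
```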

For more details about the framework architecture, refer to [NNCFArchitecture.md](./docs/NNCFArchitecture.md).


### Model Compression Samples

For a quicker start with NNCF-powered compression, you can also try the sample scripts, each of which provides a basic training pipeline for image classification, object detection, or semantic segmentation, respectively.

To run the samples, refer to the corresponding tutorials:
- [Image Classification sample](examples/classification/README.md)
- [Object Detection sample](examples/object_detection/README.md)
- [Semantic Segmentation sample](examples/semantic_segmentation/README.md)

### Third-party repository integration
NNCF can be integrated into the training/evaluation pipelines of third-party repositories with minimal changes.
See [third_party_integration](./third_party_integration) for examples of the code modifications (Git patches and base commit IDs are provided) necessary to integrate NNCF into select repositories.


### System requirements
- Ubuntu\* 16.04 or later (64-bit)
- Python\* 3.6 or later
- NVidia CUDA\* Toolkit 10.2 or later
- PyTorch\* 1.5 or later

### Installation
We suggest installing and using the package in a [Python virtual environment](https://docs.python.org/3/tutorial/venv.html).

#### As a package built from a checked-out repository:
1) Install the following system dependencies:

`sudo apt-get install python3-dev`

2) Install the package and its dependencies by running the following in the repository root directory:

- For CPU & GPU-powered execution:
`python setup.py install`
- For CPU-only installation:
`python setup.py install --cpu-only`

#### As a Docker image
Use one of the Dockerfiles in the [docker](./docker) directory to build an image with an environment already set up and ready for running the NNCF [sample scripts](#model-compression-samples).


## NNCF compression results

The results below were achieved using the sample scripts and NNCF configuration files provided with this repository. See the README.md files of the [sample scripts](#model-compression-samples) for links to the exact configuration files and the final PyTorch checkpoints.

|Model|Compression algorithm|Dataset|PyTorch FP32 baseline|PyTorch compressed accuracy|
| :---: | :---: | :---: | :---: | :---: |
|ResNet-50|None|ImageNet|-|76.13|
|ResNet-50|INT8|ImageNet|76.13|76.05|
|ResNet-50|Mixed, 44.8% INT8 / 55.2% INT4|ImageNet|76.13|76.3|
|ResNet-50|INT8 + Sparsity 61% (RB)|ImageNet|76.13|75.28|
|ResNet-50|Filter pruning, 30%, magnitude criterion|ImageNet|76.13|75.7|
|ResNet-50|Filter pruning, 30%, geometric median criterion|ImageNet|76.13|75.7|
|Inception V3|None|ImageNet|-|77.32|
|Inception V3|INT8|ImageNet|77.32|76.92|
|Inception V3|INT8 + Sparsity 61% (RB)|ImageNet|77.32|76.98|
|MobileNet V2|None|ImageNet|-|71.81|
|MobileNet V2|INT8|ImageNet|71.81|71.34|
|MobileNet V2|Mixed, 46.6% INT8 / 53.4% INT4|ImageNet|71.81|70.89|
|MobileNet V2|INT8 + Sparsity 52% (RB)|ImageNet|71.81|70.99|
|SqueezeNet V1.1|None|ImageNet|-|58.18|
|SqueezeNet V1.1|INT8|ImageNet|58.18|58.02|
|SqueezeNet V1.1|Mixed, 54.7% INT8 / 45.3% INT4|ImageNet|58.18|58.84|
|ResNet-18|None|ImageNet|-|69.76|
|ResNet-18|XNOR (weights), scale/threshold (activations)|ImageNet|69.76|61.61|
|ResNet-18|DoReFa (weights), scale/threshold (activations)|ImageNet|69.76|61.59|
|ResNet-18|Filter pruning, 30%, magnitude criterion|ImageNet|69.76|68.69|
|ResNet-18|Filter pruning, 30%, geometric median criterion|ImageNet|69.76|68.97|
|ResNet-34|None|ImageNet|-|73.31|
|ResNet-34|Filter pruning, 30%, magnitude criterion|ImageNet|73.31|72.54|
|ResNet-34|Filter pruning, 30%, geometric median criterion|ImageNet|73.31|72.60|
|SSD300-BN|None|VOC12+07|-|78.28|
|SSD300-BN|INT8|VOC12+07|78.28|78.07|
|SSD300-BN|INT8 + Sparsity 70% (Magnitude)|VOC12+07|78.28|78.01|
|SSD512-BN|None|VOC12+07|-|80.26|
|SSD512-BN|INT8|VOC12+07|80.26|80.02|
|SSD512-BN|INT8 + Sparsity 70% (Magnitude)|VOC12+07|80.26|79.98|
|UNet|None|CamVid|-|71.95|
|UNet|INT8|CamVid|71.95|71.66|
|UNet|INT8 + Sparsity 60% (Magnitude)|CamVid|71.95|71.72|
|ICNet|None|CamVid|-|67.89|
|ICNet|INT8|CamVid|67.89|67.87|
|ICNet|INT8 + Sparsity 60% (Magnitude)|CamVid|67.89|67.24|
|UNet|None|Mapillary|-|56.23|
|UNet|INT8|Mapillary|56.23|56.12|
|UNet|INT8 + Sparsity 60% (Magnitude)|Mapillary|56.23|56.0|

ReleaseNotes.md

Lines changed: 55 additions & 0 deletions

# Release Notes

## Introduction
*Neural Network Compression Framework (NNCF)* is a toolset for neural network model compression.
The framework is organized as a Python module that can be built and used standalone or within
the samples distributed with the code. The samples demonstrate the usage of compression methods on
public models and datasets for three different use cases: Image Classification, Object Detection,
and Semantic Segmentation.

## New in Release 1.3.1
- Now using PyTorch 1.5 and CUDA 10.2 by default
- Support for exporting quantized models to ONNX checkpoints with standard ONNX v10 QuantizeLinear/DequantizeLinear pairs (8-bit quantization only)
- Compression algorithm initialization moved to the compressed model creation stage

## New in Release 1.3
- Filter pruning algorithm added
- Mixed-precision quantization with manual and automatic (HAWQ-powered) precision setup
- Support for DistilBERT
- Selecting quantization parameters based on hardware configuration presets (CPU, GPU, VPU)
- Propagation-based quantizer position setup mode (quantizers are positioned as early in the network's control flow graph as possible while keeping the inputs of target operations quantized)
- Improved model graph tracing with the introduction of input nodes and intermediate tensor shape tracking
- Updated third-party integration patches for consistency with NNCF release v1.3
- CPU-only installation mode for execution on machines without CUDA-capable GPU hardware
- Docker images supplied for easier setup in container-based environments
- Usability improvements (NNCF config JSON file validation by schema, less boilerplate code, separate logging, and others)

## New in Release 1.2
- Support for quantization of transformer-based networks (tested on BERT and RoBERTa)
- Added instructions and Git patches for integrating NNCF into third-party repositories ([mmdetection](https://github.com/open-mmlab/mmdetection), [transformers](https://github.com/huggingface/transformers))
- Support for GNMT quantization
- Regular expression support for specifying ignored/target scopes in config files - prefix the regex-enabled scope with {re}

## New in Release 1.1

- Binary networks using XNOR and DoReFa methods
- Asymmetric quantization scheme and per-channel quantization of convolutions
- 3D model support
- Support for integration into the [mmdetection](https://github.com/open-mmlab/mmdetection) repository
- Custom search patterns for FakeQuantize operation insertion
- Quantization of the model input by default
- Support for quantization of non-ReLU models (ELU, sigmoid, swish, hswish, and others)

## New in Release 1.0

- Support for symmetric quantization and two sparsity algorithms with fine-tuning
- Automatic model graph transformation: the model is wrapped by a custom class and additional layers are inserted in the graph; the transformations are configurable.
- Three training samples that demonstrate the usage of compression methods from NNCF:
  - Image Classification: torchvision models for classification and custom models on ImageNet and CIFAR10/100 datasets.
  - Object Detection: SSD300, SSD512, MobileNet SSD on Pascal VOC2007, Pascal VOC2012, and COCO datasets.
  - Semantic Segmentation: UNet, ICNet on CamVid and Mapillary Vistas datasets.
- Unified interface for compression methods.
- GPU-accelerated *Quantization* layer for fast model fine-tuning.
- Distributed training support in all samples.
- Configuration file examples for sparsity, quantization, and sparsity with quantization for all three samples. Each type of compression requires only one additional stage of fine-tuning.
- Export of models to the ONNX format, which is supported by the [OpenVINO](https://github.com/opencv/dldt) toolkit.

docker/README.md

Lines changed: 30 additions & 0 deletions

## Step 1. Install Docker
Review the instructions for installing Docker [here](https://docs.docker.com/engine/install/ubuntu/) and configure HTTP or HTTPS proxy behavior as described [here](https://docs.docker.com/config/daemon/systemd/).

## Step 2. Install nvidia-docker

*Skip this step if you don't have a GPU.*

Review the instructions for installing nvidia-docker [here](https://github.com/NVIDIA/nvidia-docker).

## Step 3. Build the image
In the project folder, run the following in a terminal:
```
sudo docker image build --network=host --build-arg http_proxy=http://example.com:80 --build-arg https_proxy=http://example.com:81 --build-arg ftp_proxy=http://example.com:80 <PATH_TO_DIR_WITH_DOCKERFILE>
```

*Use the `http_proxy`, `https_proxy`, and `ftp_proxy` build arguments and the `--network` option to replicate the network settings of your localhost in the build context.*

## Step 4. Run the container
Run in a terminal:
```
sudo docker run --name <NAME_CONTAINER> --runtime=nvidia -it --network=host --mount type=bind,source=<PATH_TO_DATASETS_ON_HOST>,target=<PATH_TO_DATASETS_IN_CONTAINER> --mount type=bind,source=<PATH_TO_NNCF_HOME_ON_HOST>,target=/home/nncf/ <ID_IMAGE>
```

*Do not use `--runtime=nvidia` if you want to run in `--cpu-only` mode.*

*Use `--shm-size` to increase the size of the shared memory directory.*

Now you have a working container and can run the examples.
