MyConvNet

Deep learning using TensorFlow low-level APIs.

Build your own convolutional neural networks using TensorFlow.

Supports image classification and semantic segmentation tasks.

DCGAN is now available.

Post-training quantization is now supported.

Verified on Windows 10 and Ubuntu 18.04 using PyCharm with Anaconda.

Check out the instructions.

Getting started with Linux.

VGGNet demo: Edit classification/parameters_vgg.py and run classification/demo_vgg.py

How To Run

Download all the files.
Prepare your data using scripts in subsets/.
Build your own networks by modifying scripts in models/.
Edit parameters.py to change the dataset, model, directories, etc.
Run train.py to train the model.
Run test.py to test the trained model.
Use inference.py if you have no labels for the test data.
Run quantize.py to perform post-training quantization (pydot package required).

How to prepare data

Images and labels should be paired and stored in the same directory (default).

Open a terminal and cd to MyConvNet.
Download, extract, and process datasets:
- CUB-200-2011: http://www.vision.caltech.edu/visipedia/CUB-200-2011.html
  - python -m subsets.cub_200_2011 --data /path/to/raw/data --dest /path/to/processed/data
- ImageNet: http://image-net.org/challenges/LSVRC/2012/downloads (log-in required)
  - python -m subsets.ilsvrc_2012_cls --data /path/to/raw/data --dest /path/to/processed/data
And so on.

Some scripts may not support command-line execution.

Notes

If you have no NVIDIA GPU, set the 'num_gpus' parameter to 0 to use the CPU for training/inference.
Our RandomResizedCrop performs padding prior to cropping so that (each side of an image) ≥ √(max_scale·H·W).
- Set padding=False for random_resized_crop() to use RandomResizedCrop without padding.
- We are running an experiment inspired by "Fixing the train-test resolution discrepancy".
  - The experiment extends the RandomResizedCrop scale range from [0.08, 1.0] to [0.04, 1.96] (includes padding).
In the segmentation task, pixels with a value of 0 are ignored, so assign 1 to the first class.
Use Linux for faster training.
Multi-GPU training is available based on the parameter server strategy.
NCCL-based distributed training code is currently not available (nccl/).
Batch statistics of multiple devices are updated successively.
Check out REFERENCES.md for papers and code references.
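
The padding rule in the RandomResizedCrop note above can be sketched in NumPy as follows. This is only an illustration of the stated inequality, not the code in this repository: the function name pad_for_random_resized_crop is hypothetical, and zero-valued, symmetric padding is an assumption.

```python
import math

import numpy as np


def pad_for_random_resized_crop(image, max_scale=1.96):
    """Pad an H x W x C image so that each side is >= sqrt(max_scale * H * W).

    Hypothetical helper: zero padding split evenly between both borders is an
    assumption, not necessarily what random_resized_crop() does.
    """
    h, w = image.shape[:2]
    min_side = math.ceil(math.sqrt(max_scale * h * w))
    pad_h = max(min_side - h, 0)  # extra rows needed
    pad_w = max(min_side - w, 0)  # extra columns needed
    top, left = pad_h // 2, pad_w // 2
    return np.pad(
        image,
        ((top, pad_h - top), (left, pad_w - left), (0, 0)),
        mode="constant",
    )


if __name__ == "__main__":
    img = np.ones((100, 200, 3), dtype=np.float32)
    print(pad_for_random_resized_crop(img).shape)
```

With such padding, a crop whose area is max_scale times the original image area can always be taken from the padded image, since a square of that area fits inside it.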

Packages (Installed with conda)

Python: 3.7
tensorflow-gpu: >= 1.14.0 (cudatoolkit: 10.0, cudnn: 7.6.5)
numpy: 1.17.4
scikit-image: 0.15.0
scikit-learn: 0.22
matplotlib: 3.1.1
opencv-python: 4.1.2.30 (installed with pip)
pydot: 1.4.1 (graphviz: 2.40.1)

TODO

Speedup: Training is slower than tf_cnn_benchmark.
Object detection task.
Multi-model optimization including knowledge distillation.

Checkpoints

ImageNet - 0.5 is subtracted from the images, which are then multiplied by 2, giving a range of [-1.0, 1.0]

Model	Top-1 Acc	Top-5 Acc	Train (Test) Image/Input Size	Details	Param	Ckpt
ResNet-v1.5-50	76.35%	92.94%	224/224 (256/224)	Inception preprocessing (baseline)	*.py	*.zip
ResNet-v1.5-50	76.50%	93.06%	224/224 (256/224)	+ 30 epochs (120 in total)	*.py	*.zip
ResNet-v1.5-50	77.02%	93.24%	224/224 (256/224)	+ Cosine LR, decoupled WD 4e-5, dropout 0.3	*.py	*.zip
ResNet-v1.5-50	77.51%	93.80%	224/224 (256†/224)	+ Extended crop scale [0.08, 1.0] -> [0.04, 1.96]	*.py	*.zip
EfficientNet-B0	76.82%	93.21%	224/224 (256/224)	Baseline (terminated at epoch 330 due to instability)	*.py	*.zip
EfficientNet-B0	77.01%	93.42%	224/224 (256†/224)	+ 30 epochs (380 in total), extended crop scale	*.py	*.zip
EfficientNet-Lite0	75.36% (75.22%‡)	92.63% (92.43%‡)	224/224 (256†/224)	380 epochs, extended crop scale	*.py	*.zip
EfficientNet-Lite0	75.62% (75.16%‡)	92.62% (92.33%‡)	224/224 (256†/224)	kernel_size=4 for stride=2 convolutions	*.py	*.zip
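
When feeding your own images to these checkpoints, the input scaling stated above the table has to be reproduced. A minimal sketch, assuming the images have already been converted to floats in [0.0, 1.0] (that starting range is an assumption; the function names are hypothetical):

```python
import numpy as np


def normalize(image):
    """Map values from [0, 1] to [-1, 1]: subtract 0.5, then multiply by 2."""
    return (np.asarray(image, dtype=np.float32) - 0.5) * 2.0


def denormalize(image):
    """Inverse mapping from [-1, 1] back to [0, 1], e.g. for visualization."""
    return np.asarray(image, dtype=np.float32) / 2.0 + 0.5


if __name__ == "__main__":
    print(normalize([0.0, 0.5, 1.0]))
```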

The reported accuracies are single-crop validation scores.
Note that the class numbers are ordered by the synset IDs (train.txt, val.txt). Refer to ilsvrc_2012_cls.py and this page.
- Therefore, the class ordering is different from the one in the devkit.
Image size refers to the size after preprocessing; input size refers to the size of the network's input.
- If image and input sizes do not match, cropping or padding is performed.
Training scores are calculated with augmentation and validation is performed with exponential moving average (EMA).
- As a result, validation scores can surpass training scores in the training curves.
- EMA is known to play a crucial role in training EfficientNet.
† The crop method is slightly different: a center crop of a √(HW) by √(HW) region, followed by zero padding and resizing.
‡ Accuracy after post-training quantization.
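
One way to read the † crop method is sketched below in NumPy. It is an interpretation of the footnote, not the repository's implementation: the function name is hypothetical, and nearest-neighbor resizing is used only to keep the example dependency-free (the actual interpolation method is not stated here).

```python
import numpy as np


def center_crop_pad_resize(image, out_size=256):
    """Take a centered sqrt(H*W) x sqrt(H*W) region, zero-pad, and resize.

    Hypothetical helper. The longer side of the image is cropped and the
    shorter side is zero-padded, so the square region keeps the image area.
    """
    h, w = image.shape[:2]
    side = int(round(np.sqrt(h * w)))
    canvas = np.zeros((side, side) + image.shape[2:], dtype=image.dtype)

    # Size of the part of the image that fits inside the square region.
    crop_h, crop_w = min(h, side), min(w, side)
    src_top, src_left = (h - crop_h) // 2, (w - crop_w) // 2
    dst_top, dst_left = (side - crop_h) // 2, (side - crop_w) // 2
    canvas[dst_top:dst_top + crop_h, dst_left:dst_left + crop_w] = image[
        src_top:src_top + crop_h, src_left:src_left + crop_w
    ]

    # Nearest-neighbor resize to out_size x out_size (a simplification).
    idx = (np.arange(out_size) * side / out_size).astype(int)
    return canvas[idx][:, idx]


if __name__ == "__main__":
    wide = np.ones((100, 400, 3), dtype=np.float32)
    print(center_crop_pad_resize(wide).shape)
```

For a 100 x 400 image the square region is 200 x 200: the width is cropped from 400 to 200 and the height is zero-padded from 100 to 200 before resizing.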
