dooyounggo/MyConvNet

MyConvNet

Deep learning using TensorFlow low-level APIs.

Build your own convolutional neural networks using TensorFlow.

Supports image classification and semantic segmentation tasks.

DCGAN is now available.

Post-training quantization is now supported.

Verified on Windows 10 and Ubuntu 18.04 using PyCharm with Anaconda.

Check out the instructions.

Getting started with Linux.

VGGNet demo: edit classification/parameters_vgg.py, then run classification/demo_vgg.py.

How To Run

  • Download all the files.
  • Prepare your data using scripts in subsets/.
  • Build your own networks by modifying scripts in models/.
  • Edit parameters.py to change the dataset, model, directories, etc.
  • Run train.py to train the model.
  • Run test.py to test the trained model.
  • Use inference.py if you have no labels for the test data.
  • Run quantize.py to perform post-training quantization (pydot package required).

How to prepare data

Images and labels should be paired and stored in the same directory (default).

Some scripts may not support command-line execution.

Notes

  • If you have no NVIDIA GPU, set the 'num_gpus' parameter to 0 to run training/inference on the CPU.
  • Our RandomResizedCrop pads the image before cropping so that each side of the image is at least √(max_scale·H·W).
    • Set padding=False for random_resized_crop() to use RandomResizedCrop without padding.
    • We are running an experiment inspired by "Fixing the train-test resolution discrepancy".
      • It extends the RandomResizedCrop scale range from [0.08, 1.0] to [0.04, 1.96] (including padding).
  • In the segmentation task, pixels with a value of 0 are ignored, so assign 1 to the first class.
  • Use Linux for faster training.
  • Multi-GPU training is available based on the parameter server strategy.
  • NCCL-based distributed training code is currently not available (nccl/).
  • Batch statistics of multiple devices are updated successively.
  • Check out REFERENCES.md for papers and code references.
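The padding rule in the RandomResizedCrop note can be illustrated with a small calculation. This is only a sketch of the stated condition (each side ≥ √(max_scale·H·W)), not the repository's implementation. The function names `padded_size` and `symmetric_padding`, the rounding up, and the even split of padding between the two borders are assumptions made for illustration.

```python
import math


def padded_size(height, width, max_scale=1.0):
    """Return (new_height, new_width) with each side >= sqrt(max_scale * H * W).

    Hypothetical helper: rounds the minimum side up to the next integer.
    """
    min_side = math.ceil(math.sqrt(max_scale * height * width))
    return max(height, min_side), max(width, min_side)


def symmetric_padding(height, width, max_scale=1.0):
    """Return ((top, bottom), (left, right)) padding that reaches padded_size.

    Assumption: padding is split as evenly as possible between both borders.
    """
    new_h, new_w = padded_size(height, width, max_scale)
    pad_h, pad_w = new_h - height, new_w - width
    top, left = pad_h // 2, pad_w // 2
    return (top, pad_h - top), (left, pad_w - left)


if __name__ == "__main__":
    # A 300x400 image with the default and the extended maximum scale.
    print(padded_size(300, 400, max_scale=1.0))
    print(padded_size(300, 400, max_scale=1.96))
    print(symmetric_padding(300, 400, max_scale=1.0))
```

With max_scale = 1.0 only the shorter side needs padding; with max_scale = 1.96 both sides of a 300x400 image are padded, which is what allows crops larger than the original image area.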

Packages (Installed with conda)

  • Python: 3.7
  • tensorflow-gpu: >= 1.14.0 (cudatoolkit: 10.0, cudnn: 7.6.5)
  • numpy: 1.17.4
  • scikit-image: 0.15.0
  • scikit-learn: 0.22
  • matplotlib: 3.1.1
  • opencv-python: 4.1.2.30 (installed with pip)
  • pydot: 1.4.1 (graphviz: 2.40.1)

TODO

  • Speedup: Training is slower than tf_cnn_benchmark.
  • Object detection task.
  • Multi-model optimization including knowledge distillation.

Checkpoints

ImageNet - Pixel values have 0.5 subtracted and are then multiplied by 2, so inputs range over [-1.0, 1.0].
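As a minimal sketch of this input scaling, assuming pixel values have already been converted to floats in [0.0, 1.0] (for example by dividing 8-bit values by 255, which is an assumption here, not something the checkpoints' scripts are confirmed to do in this way), the mapping and its inverse look like this. The function names are hypothetical.

```python
def normalize(x):
    """Map values in [0.0, 1.0] to [-1.0, 1.0]: subtract 0.5, then multiply by 2."""
    return (x - 0.5) * 2.0


def denormalize(y):
    """Inverse mapping from [-1.0, 1.0] back to [0.0, 1.0]."""
    return y / 2.0 + 0.5


if __name__ == "__main__":
    for value in (0.0, 0.25, 0.5, 1.0):
        print(value, "->", normalize(value))
```

Both functions use only arithmetic operators, so they also work element-wise on NumPy arrays or TensorFlow tensors.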

| Model | Top-1 Acc | Top-5 Acc | Train (Test) Image/Input Size | Details | Param | Ckpt |
|---|---|---|---|---|---|---|
| ResNet-v1.5-50 | 76.35% | 92.94% | 224/224 (256/224) | Inception preprocessing (baseline) | *.py | *.zip |
| ResNet-v1.5-50 | 76.50% | 93.06% | 224/224 (256/224) | + 30 epochs (120 in total) | *.py | *.zip |
| ResNet-v1.5-50 | 77.02% | 93.24% | 224/224 (256/224) | + Cosine LR, decoupled WD 4e-5, dropout 0.3 | *.py | *.zip |
| ResNet-v1.5-50 | 77.51% | 93.80% | 224/224 (256/224) | + Extended crop scale [0.08, 1.0] -> [0.04, 1.96] | *.py | *.zip |
| EfficientNet-B0 | 76.82% | 93.21% | 224/224 (256/224) | Baseline (terminated at epoch 330 due to instability) | *.py | *.zip |
| EfficientNet-B0 | 77.01% | 93.42% | 224/224 (256†/224) | + 30 epochs (380 in total), extended crop scale | *.py | *.zip |
| EfficientNet-Lite0 | 75.36% (75.22%‡) | 92.63% (92.43%‡) | 224/224 (256†/224) | 380 epochs, extended crop scale | *.py | *.zip |
| EfficientNet-Lite0 | 75.62% (75.16%‡) | 92.62% (92.33%‡) | 224/224 (256†/224) | kernel_size=4 for stride=2 convolutions | *.py | *.zip |

  • The reported accuracies are single-crop validation scores.
  • Note that the class numbers are ordered by the synset IDs (train.txt, val.txt). Refer to ilsvrc_2012_cls.py and this page.
    • Therefore, the class ordering is different from the one in the devkit.
  • Image size is the size of an image after preprocessing; input size is the size of the network's input.
    • If image and input sizes do not match, cropping or padding is performed.
  • Training scores are calculated with augmentation and validation is performed with exponential moving average (EMA).
    • As a result, validation scores can surpass training scores in the training curves.
    • EMA is known to play a crucial role in training EfficientNet.
  • † The crop method is slightly different: a √(HW) by √(HW) region is center-cropped, zero-padded where it extends beyond the image, and then resized.
  • ‡ Accuracy after post-training quantization.
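The † crop can be sketched as a geometry calculation: for an H by W image, take a centered square of side √(HW), which is cut from the longer side and zero-padded along the shorter side. The code below is an illustration under assumptions, not the repository's code. The function name `center_crop_geometry`, the rounding of the side length, and the even split of padding are hypothetical choices. The final resize (to 256 in the table above) is left to an image library.

```python
import math


def center_crop_geometry(height, width):
    """Describe a centered sqrt(H*W) x sqrt(H*W) region of an H x W image.

    Returns the square's side, the crop box (top, left, crop_h, crop_w) that
    lies inside the image, and the zero padding ((top, bottom), (left, right))
    needed to fill the rest of the square.
    """
    side = round(math.sqrt(height * width))  # assumption: round to nearest integer
    crop_h, crop_w = min(height, side), min(width, side)
    top, left = (height - crop_h) // 2, (width - crop_w) // 2
    pad_h, pad_w = side - crop_h, side - crop_w
    pad_top, pad_left = pad_h // 2, pad_w // 2
    return {
        "side": side,
        "crop_box": (top, left, crop_h, crop_w),
        "padding": ((pad_top, pad_h - pad_top), (pad_left, pad_w - pad_left)),
    }


if __name__ == "__main__":
    # A 300x400 image: the width is cropped, the height is zero-padded.
    print(center_crop_geometry(300, 400))
```

Because the square has the same area as the original image, a wide image loses some columns and gains zero rows, rather than losing content on both axes as a plain center crop of the shorter side would.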