We run 10 independent searches to obtain 10 architectures with Criterion 1 or Criterion 2 on CIFAR-10. We evaluate the discovered architectures on CIFAR-10 and report the mean and standard deviation of the test accuracy across those 10 models, as well as the performance of the best model. We then choose the 3 best-performing cell architectures on CIFAR-10 for each criterion and train them on ImageNet.
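For reference, here is a minimal sketch of how the reported statistics and the top-3 selection can be computed. The accuracy values and variable names are illustrative placeholders, not our actual results:

```python
import numpy as np

# Hypothetical CIFAR-10 test accuracies (%) of the 10 architectures from the
# 10 independent searches; real values come from evaluating each model.
accuracies = np.array([97.4, 97.5, 97.3, 97.6, 97.5, 97.4, 97.2, 97.6, 97.5, 97.3])

print(f"mean: {accuracies.mean():.2f}%, std: {accuracies.std(ddof=1):.2f}%")
print(f"best: {accuracies.max():.2f}%")

# Indices of the 3 best-performing architectures, which would then be trained on ImageNet.
top3 = np.argsort(accuracies)[-3:][::-1]
print("top-3 architecture indices:", top3.tolist())
```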
CIFAR-10 will be downloaded automatically. To obtain the ImageNet dataset, please follow the instructions here.
The search takes about 0.25 day (6 hours) on a single NVIDIA GTX 1080Ti. To search with SGAS Cri2, run:
python train_search.py --use_history
To search with SGAS Cri1, run:
python train_search.py
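For intuition, the search fixes one edge of the cell at a time: every few epochs it scores the remaining undecided edges and greedily commits the highest-scoring one to its best operation. The sketch below only illustrates that loop shape under our assumptions; the edge-scoring function (Criterion 1 vs. Criterion 2, with or without history) and all names are placeholders, not the repository's actual API:

```python
import torch

def greedy_search_sketch(alphas, score_edge, decision_freq=5, epochs=50):
    """Illustrative loop only: `alphas` maps an edge id to its architecture
    weights, and `score_edge` stands in for a decision criterion."""
    undecided = set(alphas.keys())
    decisions = {}
    for epoch in range(epochs):
        # ... one epoch of weight / architecture-parameter optimization would go here ...
        if (epoch + 1) % decision_freq == 0 and undecided:
            # Score every undecided edge and greedily fix the most confident one.
            best_edge = max(undecided, key=lambda e: score_edge(alphas[e]))
            best_op = int(torch.softmax(alphas[best_edge], dim=-1).argmax())
            decisions[best_edge] = best_op
            undecided.remove(best_edge)
    return decisions
```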
To train the best architectures (Cri1_CIFAR_Best, Cri2_CIFAR_Best) from scratch, run:
python train.py --auxiliary --cutout --arch Cri1_CIFAR_Best
or
python train.py --auxiliary --cutout --arch Cri2_CIFAR_Best
To train the best architectures (Cri1_ImageNet_Best, Cri2_ImageNet_Best) from scratch, run:
python train_imagenet.py --auxiliary --arch Cri1_ImageNet_Best --batch_size 1024 --learning_rate 0.5
or
python train_imagenet.py --auxiliary --arch Cri2_ImageNet_Best --batch_size 1024 --learning_rate 0.5
- We run these experiments on 8 NVIDIA Tesla V100 GPUs for three days, setting --batch_size 1024 and --learning_rate 0.5.
Set --arch to any architecture you want. (More architectures can be found in genotypes.py.)
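Architectures are specified as genotypes. As an assumption based on DARTS-style code, an entry in genotypes.py typically looks like the following; the operation names and node indices here are purely illustrative, not the actual Cri1/Cri2 genotypes:

```python
from collections import namedtuple

Genotype = namedtuple('Genotype', 'normal normal_concat reduce reduce_concat')

# Illustrative example only; the real Cri1_CIFAR_Best / Cri2_CIFAR_Best entries
# live in genotypes.py and differ from this.
EXAMPLE_ARCH = Genotype(
    normal=[('sep_conv_3x3', 0), ('sep_conv_3x3', 1),
            ('skip_connect', 0), ('dil_conv_3x3', 1),
            ('sep_conv_3x3', 1), ('sep_conv_5x5', 2),
            ('dil_conv_5x5', 2), ('max_pool_3x3', 0)],
    normal_concat=range(2, 6),
    reduce=[('max_pool_3x3', 0), ('sep_conv_3x3', 1),
            ('skip_connect', 2), ('max_pool_3x3', 0),
            ('dil_conv_5x5', 2), ('avg_pool_3x3', 1),
            ('sep_conv_5x5', 3), ('skip_connect', 2)],
    reduce_concat=range(2, 6),
)
```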
Our pretrained models can be found on Google Cloud.
Use the parameter --model_path to specify which pretrained model to load. For example, to test the best architecture Cri1_CIFAR_Best or Cri2_CIFAR_Best, run:
python test.py --auxiliary --arch Cri1_CIFAR_Best --model_path Cri1_CIFAR_Best.pt
- Expected result: 2.39% test error with 3.8M model params.
or
python test.py --auxiliary --arch Cri2_CIFAR_Best --model_path Cri2_CIFAR_Best.pt
- Expected result: 2.44% test error with 4.1M model params.
To test the best architecture Cri1_ImageNet_Best or Cri2_ImageNet_Best, run:
python test_imagenet.py --auxiliary --arch Cri1_ImageNet_Best --model_path Cri1_ImageNet_Best.pt
- Expected result: 24.2% top1 test error with 5.3M model params.
or
python test_imagenet.py --auxiliary --arch Cri2_ImageNet_Best --model_path Cri2_ImageNet_Best.pt
- Expected result: 24.1% top1 test error with 5.4M model params.
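If you want to load one of these pretrained checkpoints outside of test.py / test_imagenet.py, a minimal sketch is shown below. It assumes a DARTS-style NetworkCIFAR(C, num_classes, layers, auxiliary, genotype) model class and a checkpoint that stores a plain state_dict; both are assumptions and may differ from the actual code and file format:

```python
import torch
import genotypes
from model import NetworkCIFAR  # assumed DARTS-style model definition

# Assumptions: 36 initial channels, 20 layers, auxiliary head enabled,
# matching the default CIFAR-10 training setup described above.
genotype = genotypes.Cri1_CIFAR_Best
model = NetworkCIFAR(36, 10, 20, True, genotype)
state = torch.load('Cri1_CIFAR_Best.pt', map_location='cpu')
model.load_state_dict(state)
model.eval()
```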
In our ablation study on hyper-parameters, we find that setting --decision_freq to 7 yields more stable results. Run:
python train_search.py --use_history --decision_freq 7
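To reproduce the ablation, a small driver script can sweep the decision frequency by invoking train_search.py as shown above; the set of values below is illustrative, with 7 being the setting we found most stable:

```python
import subprocess

# Illustrative sweep over decision frequencies; adjust the values to your budget.
for freq in [4, 5, 7, 10]:
    subprocess.run(
        ["python", "train_search.py", "--use_history", "--decision_freq", str(freq)],
        check=True,
    )
```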