quark0 / darts
- Thursday, June 28, 2018, 00:14:11
Python
Differentiable architecture search for convolutional and recurrent networks
Code accompanying the paper
DARTS: Differentiable Architecture Search
Hanxiao Liu, Karen Simonyan, Yiming Yang.
arXiv:1806.09055.
The algorithm is based on continuous relaxation and gradient descent in the architecture space. It is able to efficiently design high-performance convolutional architectures for image classification (on CIFAR-10 and ImageNet) and recurrent architectures for language modeling (on Penn Treebank and WikiText-2). Only a single GPU is required.
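The idea of continuous relaxation plus gradient descent in the architecture space can be illustrated with a toy sketch. This is not the repository's code: the scalar candidate operations, the squared-error loss, and the finite-difference gradient are all assumptions chosen to keep the example dependency-free. Each edge computes a softmax-weighted mixture of candidate operations, the mixing logits (`alphas`) are updated by gradient descent, and the most likely operation is read off at the end.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical candidate operations on a scalar input. A real search space
# would hold network layers such as convolutions, pooling, or skip connections.
CANDIDATE_OPS = [
    ("identity", lambda x: x),
    ("double", lambda x: 2.0 * x),
    ("zero", lambda x: 0.0),
]

def mixed_op(x, alphas):
    """Relaxed edge: softmax-weighted sum of every candidate operation."""
    weights = softmax(alphas)
    return sum(w * op(x) for w, (_, op) in zip(weights, CANDIDATE_OPS))

def loss(alphas, x=1.0, target=2.0):
    """Toy objective (an assumption): squared error against a fixed target."""
    return (mixed_op(x, alphas) - target) ** 2

def grad_step(alphas, lr=0.5, eps=1e-6):
    """One gradient-descent step on alphas using central finite differences."""
    grads = []
    for i in range(len(alphas)):
        up = list(alphas)
        down = list(alphas)
        up[i] += eps
        down[i] -= eps
        grads.append((loss(up) - loss(down)) / (2 * eps))
    return [a - lr * g for a, g in zip(alphas, grads)]

def most_likely_op(alphas):
    """Discretize the relaxed edge by keeping the highest-weighted operation."""
    best = max(range(len(alphas)), key=lambda i: alphas[i])
    return CANDIDATE_OPS[best][0]
```

Starting from `alphas = [0.0, 0.0, 0.0]` and calling `grad_step` repeatedly shifts weight toward the operation that best fits the target; `most_likely_op(alphas)` then names the discrete choice.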
Python >= 3.5.5, PyTorch == 0.3.1, torchvision >= 0.2.1
PyTorch 0.4 will be supported soon.
Instructions for acquiring PTB and WT2 can be found here. CIFAR-10 is downloaded automatically by torchvision, but ImageNet must be downloaded manually (preferably to an SSD) following the instructions here.
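For reference, the automatic CIFAR-10 download mentioned above goes through `torchvision.datasets.CIFAR10` with `download=True`. The helper below is a minimal sketch, not taken from the repository: the function name `load_cifar10` and the default `../data` directory are assumptions, and the training scripts may apply their own transforms.

```python
def load_cifar10(root="../data"):
    """Return (train_set, test_set), downloading CIFAR-10 into root if missing.

    The default root is an assumed location, not one confirmed by the repo.
    """
    # Imported lazily so the module can be loaded without torchvision installed.
    import torchvision.datasets as dset
    import torchvision.transforms as transforms

    to_tensor = transforms.ToTensor()
    train_set = dset.CIFAR10(root=root, train=True, download=True, transform=to_tensor)
    test_set = dset.CIFAR10(root=root, train=False, download=True, transform=to_tensor)
    return train_set, test_set
```

Calling `load_cifar10()` once fetches the archive; later calls reuse the files already present under `root`.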
To carry out architecture search, run
cd cnn && python train_search.py --unrolled # for conv cells on CIFAR-10
cd rnn && python train_search.py --unrolled # for recurrent cells on PTB
Snapshots of the most likely convolutional & recurrent cells over time:
To reproduce our results using the best cells, run
cd cnn && python train.py --auxiliary --cutout # CIFAR-10
cd rnn && python train.py # PTB
cd rnn && python train.py --data ../data/wikitext-2 \
  --dropouth 0.15 --emsize 700 --nhidlast 700 --nhid 700 --wdecay 5e-7 # WT2
cd cnn && python train_imagenet.py --auxiliary # ImageNet
Customized architectures are supported through the --arch flag once specified in genotypes.py.
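As a rough illustration of what "specified in genotypes.py" could look like, the sketch below defines a named cell description and resolves it from a string, the way a command-line flag value would be. Everything here is an assumption for illustration: the `Genotype` field layout, the `MY_ARCH` entry, the operation names, the `REGISTRY` dict, and the `lookup_arch` helper are hypothetical, so check genotypes.py for the actual format. Whether the repository's model code accepts a cell this small is not verified here.

```python
from collections import namedtuple

# Assumed shape of a cell description: (operation name, input node index)
# pairs for the normal and reduction cells, plus the nodes to concatenate.
Genotype = namedtuple("Genotype", "normal normal_concat reduce reduce_concat")

# Hypothetical custom architecture; the operation names are illustrative only.
MY_ARCH = Genotype(
    normal=[("sep_conv_3x3", 0), ("skip_connect", 1)],
    normal_concat=[2],
    reduce=[("max_pool_3x3", 0), ("skip_connect", 1)],
    reduce_concat=[2],
)

# Hypothetical name-to-genotype table standing in for the module namespace.
REGISTRY = {"MY_ARCH": MY_ARCH}

def lookup_arch(name, registry=REGISTRY):
    """Resolve an architecture name (e.g. a flag value) to its Genotype."""
    arch = registry.get(name)
    if not isinstance(arch, Genotype):
        raise ValueError("unknown architecture: %r" % name)
    return arch
```

With a definition like this in place, the name would then be passed on the command line, for example `--arch MY_ARCH` (assuming the flag takes the variable name as its value).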