clovaai / assembled-cnn
- Thursday, January 23, 2020, 00:19:15
Python
Official implementation of "Compounding the Performance Improvements of Assembled Techniques in a Convolutional Neural Network"
paper | pretrained model
Official TensorFlow implementation
Jungkyu Lee, Taeryun Won, Kiho Hong
Clova Vision, NAVER Corp.
Abstract
Recent studies in image classification have demonstrated a variety of techniques for improving the performance of Convolutional Neural Networks (CNNs). However, attempts to combine existing techniques to create a practical model are still uncommon. In this study, we carry out extensive experiments to validate that carefully assembling these techniques and applying them to a basic CNN model in combination can improve the accuracy and robustness of the model while minimizing the loss of throughput. For example, our proposed ResNet-50 shows an improvement in top-1 accuracy from 76.3% to 82.78%, and mCE improvement from 76.0% to 48.9%, on the ImageNet ILSVRC2012 validation set. With these improvements, inference throughput only decreases from 536 to 312. The resulting model significantly outperforms state-of-the-art models with similar accuracy in terms of mCE and inference throughput. To verify the performance improvement in transfer learning, fine-grained classification and image retrieval tasks were tested on several open datasets and showed that the improvement to backbone network performance boosted transfer learning performance significantly. Our approach achieved 1st place in the iFood Competition Fine-Grained Visual Recognition at CVPR 2019.
Based on our repository, we achieved 1st place in iFood Competition Fine-Grained Visual Recognition at CVPR 2019.
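The mCE figures quoted in the abstract refer to mean corruption error. As a rough illustration, the sketch below assumes the usual ImageNet-C definition: for each corruption type, the model's top-1 errors are summed over the severity levels and divided by the same sum for a baseline network (AlexNet in the benchmark), and the resulting ratios are averaged over corruption types. The function names and all error values here are made up for illustration; they are not taken from this repository or its evaluation script.

```python
from statistics import mean


def corruption_error(model_errors, baseline_errors):
    """CE for one corruption type: the model's summed error over severity
    levels, normalized by the baseline's summed error over the same levels."""
    if len(model_errors) != len(baseline_errors):
        raise ValueError("severity lists must have equal length")
    return sum(model_errors) / sum(baseline_errors)


def mean_corruption_error(model, baseline):
    """mCE: average CE over all corruption types, returned as a fraction."""
    return mean(corruption_error(model[c], baseline[c]) for c in model)


# Made-up top-1 error rates for two corruption types at five severities.
model = {
    "gaussian_noise": [0.30, 0.40, 0.50, 0.60, 0.70],
    "motion_blur": [0.20, 0.30, 0.40, 0.50, 0.60],
}
baseline = {
    "gaussian_noise": [0.60, 0.80, 0.90, 0.95, 0.75],
    "motion_blur": [0.55, 0.70, 0.85, 0.95, 0.95],
}

mce = mean_corruption_error(model, baseline)
print(f"mCE: {100 * mce:.1f}%")
```

Because each ratio is taken against the baseline, a lower mCE means the model degrades less under corruption than the baseline does, which is why the drop from 76.0% to 48.9% counts as an improvement.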
pip install Pillow sklearn requests Wand tqdm
We assume you already have the following data:
First, download pretrained models from here.
DATA_DIR=/path/to/imagenet2012/tfrecord
MODEL_DIR=/path/pretrained/checkpoint
CUDA_VISIBLE_DEVICES=1 python main_classification.py \
--eval_only=True \
--dataset_name=imagenet \
--data_dir=${DATA_DIR} \
--model_dir=${MODEL_DIR} \
--preprocessing_type=imagenet_224_256a \
--resnet_version=2 \
--resnet_size=152 \
--bl_alpha=1 \
--bl_beta=2 \
--use_sk_block=True \
--anti_alias_type=sconv \
--anti_alias_filter_size=3
The expected final output is:
...
| accuracy: 0.841860 |
...
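If you prefer to drive the evaluation from Python, the following is a minimal sketch of how the same command could be assembled from a flag dictionary and how the accuracy line could be read back from the log. The flag names and values are copied from the command above; `build_eval_command` and `parse_accuracy` are hypothetical helpers, not part of this repository, and the two paths are the same placeholders used above.

```python
import re
import shlex

# Flag values copied from the evaluation command above; the two paths are
# placeholders and must be replaced with real locations.
EVAL_FLAGS = {
    "eval_only": True,
    "dataset_name": "imagenet",
    "data_dir": "/path/to/imagenet2012/tfrecord",
    "model_dir": "/path/pretrained/checkpoint",
    "preprocessing_type": "imagenet_224_256a",
    "resnet_version": 2,
    "resnet_size": 152,
    "bl_alpha": 1,
    "bl_beta": 2,
    "use_sk_block": True,
    "anti_alias_type": "sconv",
    "anti_alias_filter_size": 3,
}


def build_eval_command(flags, script="main_classification.py"):
    """Hypothetical helper: turn a flag dict into an argv list."""
    return ["python", script] + [f"--{name}={value}" for name, value in flags.items()]


def parse_accuracy(log_text):
    """Hypothetical helper: extract the value from a '| accuracy: x |' line.

    Returns None when no such line is present."""
    match = re.search(r"\|\s*accuracy:\s*([0-9.]+)\s*\|", log_text)
    return float(match.group(1)) if match else None


command = build_eval_command(EVAL_FLAGS)
print(shlex.join(command))  # shell-quoted form, for inspection only
print(parse_accuracy("...\n| accuracy: 0.841860 |\n..."))
```

The argv list can be passed to `subprocess.run` with `CUDA_VISIBLE_DEVICES` set in the environment to select the GPU, as in the shell command above.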
For training parameter information, refer to here.
Train vanilla ResNet50 on ImageNet from scratch.
$ ./scripts/train_vanila_from_scratch.sh
Train all-assemble ResNet50 on ImageNet from scratch.
$ ./scripts/train_assemble_from_scratch.sh
In the previous section, you trained the model from scratch. You can also download a pretrained model from here and fine-tune it.
Fine-tune vanilla ResNet50 on Food101.
$ ./scripts/finetuning_vanila_on_food101.sh
Fine-tune all-assemble ResNet50 on Food101.
$ ./scripts/finetuning_assemble_on_food101.sh
You can calculate mCE on the trained model as follows:
$ ./eval_assemble_mCE_on_imagenet.sh
This implementation is based on these repositories:
@misc{lee2020compounding,
title={Compounding the Performance Improvements of Assembled Techniques in a Convolutional Neural Network},
author={Jungkyu Lee and Taeryun Won and Kiho Hong},
year={2020},
eprint={2001.06268},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
Copyright 2020-present NAVER Corp.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.