This repo contains the code for our paper OneFormer: One Transformer to Rule Universal Image Segmentation.
Features
OneFormer is the first multi-task universal image segmentation framework based on transformers.
OneFormer needs to be trained only once, with a single universal architecture, a single model, and a single dataset, to outperform existing frameworks across semantic, instance, and panoptic segmentation tasks.
OneFormer uses a task-conditioned joint training strategy, uniformly sampling different ground truth domains (semantic, instance, or panoptic) by deriving all labels from panoptic annotations to train its multi-task model.
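The joint training strategy above can be sketched as follows. This is an illustrative toy version, not the repo's actual data pipeline: the segment dictionary keys (`category_id`, `isthing`) mirror the COCO panoptic annotation format, but the function names are hypothetical.

```python
import random

TASKS = ["semantic", "instance", "panoptic"]

def derive_labels(panoptic_segments, task):
    """Derive task-specific ground truth from panoptic segments.

    `panoptic_segments` is assumed to be a list of dicts with keys
    "category_id" and "isthing", as in COCO-style panoptic annotations.
    """
    if task == "panoptic":
        # Panoptic uses every segment as-is.
        return panoptic_segments
    if task == "instance":
        # Instance segmentation keeps only the "thing" masks.
        return [s for s in panoptic_segments if s["isthing"]]
    # Semantic: collapse segments to one label per category.
    return sorted({s["category_id"] for s in panoptic_segments})

def sample_training_target(panoptic_segments):
    # Each training sample draws its task uniformly at random,
    # so a single model sees all three tasks during training.
    task = random.choice(TASKS)
    return task, derive_labels(panoptic_segments, task)
```

The key point is that only panoptic annotations are ever stored; the other two label domains are derived on the fly.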
OneFormer uses a task token to condition the model on the task in focus, making our architecture task-guided for training, and task-dynamic for inference, all with a single model.
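As a rough sketch of the task-conditioning mechanism: the paper forms a text prompt of the form "the task is {task}", whose encoding serves as the task token. The helper name below and the validation logic are illustrative; the actual text encoder is omitted.

```python
# Template used to condition the model on the task in focus.
TASK_TEMPLATE = "the task is {}"

VALID_TASKS = {"semantic", "instance", "panoptic"}

def make_task_prompt(task: str) -> str:
    """Build the task-conditioning prompt for a given segmentation task.

    In the real model this string would be tokenized and embedded into
    the task token; here we only construct the prompt itself.
    """
    if task not in VALID_TASKS:
        raise ValueError(f"unknown task: {task!r}")
    return TASK_TEMPLATE.format(task)
```

At inference time, changing this single prompt is what makes the same trained model task-dynamic.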
OneFormer sets new SOTA on Cityscapes val with single-scale inference on Panoptic Segmentation with 68.5 PQ score and Instance Segmentation with 46.7 AP score!
OneFormer sets new SOTA on ADE20K val on Panoptic Segmentation with 50.2 PQ score and on Instance Segmentation with 37.6 AP!
OneFormer sets new SOTA on COCO val on Panoptic Segmentation with 58.0 PQ score!
Installation Instructions
We use Python 3.8 and PyTorch 1.10.1 (CUDA 11.3 build).
We use Detectron2-v0.6.
For complete installation instructions, please see INSTALL.md.
Dataset Preparation
We experiment on three major benchmark datasets: ADE20K, Cityscapes, and COCO 2017.
We train all our models using 8 A6000 (48 GB each) GPUs.
We use 8 A100 GPUs (80 GB each) for training Swin-L† OneFormer and DiNAT-L† OneFormer on COCO, and for all models with the ConvNeXt-XL† backbone. We also train the 896x896 models on ADE20K on 8 A100 GPUs.
If you find OneFormer useful in your research, please consider starring ⭐ us on GitHub and citing 📚 us!
@article{jain2022oneformer,
  title={OneFormer: One Transformer to Rule Universal Image Segmentation},
  author={Jitesh Jain and Jiachen Li and MangTik Chiu and Ali Hassani and Nikita Orlov and Humphrey Shi},
  journal={arXiv},
  year={2022}
}