intel-isl / DPT
Dense Prediction Transformers
This repository contains code and models for our paper:
Vision Transformers for Dense Prediction
René Ranftl, Alexey Bochkovskiy, Vladlen Koltun
Download the model weights and place them in the folder weights:
Monodepth:
Segmentation:
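To confirm the weights are in place, a minimal sanity check (assuming the checkpoints are .pt files placed directly in the weights folder):

```python
from pathlib import Path

# List whatever checkpoints are present; exact filenames depend on which
# model variants (dpt_hybrid, dpt_large) were downloaded.
checkpoints = sorted(Path("weights").glob("*.pt"))
if not checkpoints:
    print("No .pt files found in weights/ -- download the model weights first.")
for ckpt in checkpoints:
    print(f"{ckpt.name} ({ckpt.stat().st_size / 1e6:.0f} MB)")
```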
Set up dependencies:
conda install pytorch torchvision opencv
pip install timm
The code was tested with Python 3.7, PyTorch 1.8.0, OpenCV 4.5.1, and timm 0.4.5.
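To verify your environment against that configuration, a quick version check:

```python
# Print installed versions to compare against the tested configuration above.
import cv2
import timm
import torch

print("PyTorch:", torch.__version__)
print("OpenCV:", cv2.__version__)
print("timm:", timm.__version__)
print("CUDA available:", torch.cuda.is_available())
```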
Place one or more input images in the folder input.
Run a monocular depth estimation model:
python run_monodepth.py
Or run a semantic segmentation model:
python run_segmentation.py
The results are written to the folders output_monodepth and output_segmentation, respectively.
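For orientation, here is a minimal sketch of the kind of loop run_monodepth.py performs: read images from input, run the network, and write a scaled prediction to output_monodepth. The load_depth_model helper is a hypothetical stand-in for the repository's actual model construction, and the preprocessing shown (no resizing or backbone-specific normalization) is a simplification:

```python
import glob
import os

import cv2
import numpy as np
import torch


def load_depth_model():
    """Hypothetical stand-in: build the DPT depth model and load the
    downloaded weights here (see run_monodepth.py for the real code)."""
    raise NotImplementedError


def run(input_dir="input", output_dir="output_monodepth"):
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = load_depth_model().to(device).eval()
    os.makedirs(output_dir, exist_ok=True)

    for path in sorted(glob.glob(os.path.join(input_dir, "*"))):
        img = cv2.imread(path)
        if img is None:  # skip non-image files
            continue

        # BGR uint8 -> RGB float in [0, 1], then HWC -> 1xCxHxW tensor.
        rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
        sample = torch.from_numpy(rgb.transpose(2, 0, 1)).unsqueeze(0).to(device)

        with torch.no_grad():
            prediction = model(sample).squeeze().cpu().numpy()

        # Scale the relative (inverse) depth prediction to 16-bit for viewing.
        d_min, d_max = prediction.min(), prediction.max()
        out = 65535.0 * (prediction - d_min) / max(d_max - d_min, 1e-8)
        name = os.path.splitext(os.path.basename(path))[0] + ".png"
        cv2.imwrite(os.path.join(output_dir, name), out.astype("uint16"))
```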
Use the flag -t to switch between different models. Possible options are dpt_hybrid (default) and dpt_large.
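For example, to run depth estimation with the large model:
python run_monodepth.py -t dpt_large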
Please cite our papers if you use this code or any of the models.
@article{Ranftl2021,
author = {Ren\'{e} Ranftl and Alexey Bochkovskiy and Vladlen Koltun},
title = {Vision Transformers for Dense Prediction},
journal = {ArXiv preprint},
year = {2021},
}
@article{Ranftl2020,
author = {Ren\'{e} Ranftl and Katrin Lasinger and David Hafner and Konrad Schindler and Vladlen Koltun},
title = {Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)},
year = {2020},
}
Our work builds on and uses code from timm and PyTorch-Encoding. We'd like to thank the authors for making these libraries available.
MIT License