damo-vilab / AnyDoor
Official implementation of the paper: AnyDoor: Zero-shot Object-level Image Customization
Xi Chen · Lianghua Huang · Yu Liu · Yujun Shen · Deli Zhao · Hengshuang Zhao
The University of Hong Kong | Alibaba Group | Ant Group
Install with conda:
conda env create -f environment.yaml
conda activate anydoor
or with pip:
pip install -r requirements.txt
Additionally, for training, you need to install panopticapi, pycocotools, and lvis-api.
pip install git+https://github.com/cocodataset/panopticapi.git
pip install pycocotools -i https://pypi.douban.com/simple
pip install lvis
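As a quick sanity check (illustrative, not part of the repo), you can confirm the training-only dependencies import cleanly inside the activated environment:

```python
# Quick sanity check: confirm the training-only dependencies installed above
# import cleanly. Run inside the activated anydoor environment.
from panopticapi.utils import rgb2id   # panopticapi
from pycocotools.coco import COCO      # pycocotools
from lvis import LVIS                  # lvis-api

print("panopticapi, pycocotools, and lvis all import cleanly")
```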
Download the AnyDoor checkpoint.
Download the DINOv2 checkpoint and set its path in /configs/anydoor.yaml (line 83).
Download Stable Diffusion V2.1 if you want to train from scratch.
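Before running anything, it is worth confirming the paths you wrote into the config actually exist. A small hedged helper (the key layout of anydoor.yaml is not assumed; the snippet just walks the config tree and tests every string that looks like a weight file):

```python
# Verify that every checkpoint path referenced in a config file exists on
# disk, without assuming the YAML key layout: walk the whole tree and test
# any string that looks like a weight file.
import os
from omegaconf import OmegaConf

def iter_strings(node):
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for value in node:
            yield from iter_strings(value)

def check_weight_paths(cfg_path):
    cfg = OmegaConf.to_container(OmegaConf.load(cfg_path), resolve=True)
    for s in iter_strings(cfg):
        if s.endswith((".ckpt", ".pth", ".safetensors")):
            print(f"[{'ok' if os.path.exists(s) else 'MISSING'}] {s}")

check_weight_paths("./configs/anydoor.yaml")
```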
We provide inference code in run_inference.py
(from line 222 onward) for both single-image inference and dataset inference (the VITON-HD test set). Modify the data paths and run the command below. The generated results are saved in examples/TestDreamBooth/GEN
for the single image, and in VITONGEN
for the VITON-HD test set.
python run_inference.py
The inference results on the VITON-HD test set are laid out as [garment, ground truth, generation].
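If you want just the generated panel, the triptych can be split with a few lines of OpenCV. This is an illustrative helper, not part of the repo; it assumes the three panels are equal width, and the file name is a placeholder for whatever run_inference.py wrote into VITONGEN:

```python
# Split a saved VITON-HD result image into its three equal-width panels:
# [garment, ground truth, generation].
import cv2

def split_triptych(path):
    img = cv2.imread(path)
    w = img.shape[1] // 3
    return img[:, :w], img[:, w:2 * w], img[:, 2 * w:]

garment, ground_truth, generation = split_triptych("VITONGEN/sample.jpg")  # placeholder name
cv2.imwrite("generation_only.jpg", generation)
```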
Note that AnyDoor contains no design or tuning specific to try-on; adding skeleton information or a warped garment, and fine-tuning on try-on data, should improve the results. :)
Our evaluation data for DreamBooth and COCOEE can be downloaded from Google Drive.
Currently, we support a local Gradio demo. To launch it, first set the path to the pretrained model in /configs/demo.yaml
, and the path to DINOv2 in /configs/anydoor.yaml
(line 83).
Afterwards, run the script:
python run_gradio_demo.py
The Gradio demo looks like the UI shown below:
📢 This version requires users to annotate the mask of the target object; an overly coarse mask will hurt generation quality. We plan to add a mask-refinement module or an interactive segmentation module to the demo.
📢 We provide a segmentation module to refine the user-annotated reference mask. You can disable it by setting use_interactive_seg: False
in /configs/demo.yaml
.
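If you prefer to flip this flag programmatically rather than editing the file by hand, here is a minimal sketch (assuming use_interactive_seg is a top-level key in demo.yaml; adjust if it is nested):

```python
# Disable the interactive-segmentation mask refinement by rewriting
# /configs/demo.yaml. Assumes use_interactive_seg is a top-level key.
from omegaconf import OmegaConf

cfg = OmegaConf.load("./configs/demo.yaml")
cfg.use_interactive_seg = False
OmegaConf.save(cfg, "./configs/demo.yaml")
```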
(Screenshot: the Gradio demo UI.)
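For orientation, the skeleton of such a demo looks roughly like the sketch below. This is illustrative only; the real run_gradio_demo.py wires the AnyDoor pipeline and mask-annotation tools into the callback:

```python
# Minimal skeleton of an image-to-image Gradio demo. Illustrative only: the
# real demo runs the AnyDoor pipeline inside `generate`.
import gradio as gr

def generate(reference_image, target_image):
    # placeholder: the real demo composites the reference object into the scene
    return target_image

demo = gr.Interface(
    fn=generate,
    inputs=[gr.Image(label="reference"), gr.Image(label="target scene")],
    outputs=gr.Image(label="generation"),
)
demo.launch()
```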
To train AnyDoor, prepare the data and initial weights as follows:
- Download the datasets listed in /configs/datasets.yaml and modify the corresponding paths.
- See ./datasets for the expected file formats if you prepare your own data.
- If you use the UVO dataset, preprocess its annotations with ./datasets/Preprocess/uvo_process.py.
- Run run_dataset_debug.py to verify your data is correct (a minimal example of such a check is sketched after this list).
- To train from scratch, convert the downloaded Stable Diffusion weights by running sh ./scripts/convert_weight.sh.
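A minimal stand-in for the kind of check run_dataset_debug.py performs, under the assumption that your data is stored as image/mask pairs (all paths below are placeholders for your own dataset layout):

```python
# Load one image/mask pair from your prepared data and confirm they align.
# Paths are placeholders for your own dataset layout.
import cv2

image = cv2.imread("./datasets/my_dataset/images/000001.jpg")   # placeholder
mask = cv2.imread("./datasets/my_dataset/masks/000001.png", 0)  # placeholder

assert image is not None and mask is not None, "failed to read the files"
assert image.shape[:2] == mask.shape[:2], "image/mask size mismatch"
print("sample looks consistent:", image.shape, mask.shape)
```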
Modify the training hyper-parameters in run_train_anydoor.py
(lines 26-34) according to your training resources. We verified that two A100 GPUs with batch accumulation = 1 produce satisfactory results after 300,000 iterations.
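For orientation, that hyper-parameter block typically has the shape sketched below. All names and defaults here are assumptions based on the ControlNet-style training loop this repo builds on; edit the real values in run_train_anydoor.py:

```python
# Illustrative shape of the training hyper-parameters (lines 26-34 of
# run_train_anydoor.py). Values are assumptions; only the 2-GPU setup,
# accumulation = 1, and 300k steps come from the README.
import pytorch_lightning as pl

batch_size = 16              # per-GPU batch size (assumed)
learning_rate = 1e-5         # assumed default
accumulate_grad_batches = 1  # "batch accumulation = 1"

trainer = pl.Trainer(
    gpus=2,                  # two A100s, as verified above
    precision=16,
    accumulate_grad_batches=accumulate_grad_batches,
    max_steps=300_000,       # iterations reported as sufficient
)
```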
Start training by executing:
sh ./scripts/train.sh
Thanks to @bdsqlsz.
This project is developed on the codebase of ControlNet. We appreciate this great work!
If you find this codebase useful for your research, please cite it with the following entry.
@article{chen2023anydoor,
title={Anydoor: Zero-shot object-level image customization},
author={Chen, Xi and Huang, Lianghua and Liu, Yu and Shen, Yujun and Zhao, Deli and Zhao, Hengshuang},
journal={arXiv preprint arXiv:2307.09481},
year={2023}
}