Zero-shot Image Editing with Reference Imitation
Authors: Xi Chen, Yutong Feng, Mengting Chen, Yiyang Wang, Shilong Zhang, Yu Liu, Yujun Shen, Hengshuang Zhao
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally show the effectiveness of our method under various test cases as well as its superiority over existing alternatives. We do not have theoretical results. |
| Researcher Affiliation | Collaboration | (1) The University of Hong Kong; (2) Alibaba Group; (3) Ant Group |
| Pseudocode | No | The paper describes processes in text and figures (like Figure 3 showing the training process) but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | According to the regulations of the company, we would release the code and benchmark after internal check. |
| Open Datasets | Yes | We collect 100k high-resolution videos from open-sourced websites like Pexels [29]. To further expand the diversity of training samples, we use the SAM [20] dataset that contains 10 million images and 1 billion object masks. |
| Dataset Splits | Yes | During training, the sampling portions of the video and SAM data are 70% versus 30% as default. |
| Hardware Specification | Yes | Experiments are conducted with a total batch size of 64 on 8 A100 GPUs. |
| Software Dependencies | No | The paper mentions software components such as Stable Diffusion 1.5, CLIP, DINOv2, and the Adam optimizer, but does not provide specific version numbers for them. |
| Experiment Setup | Yes | In this work, all experiments are conducted with the resolution of 512 × 512... During training, we use the Adam [19] optimizer and set the learning rate as 1e-5... Experiments are conducted with a total batch size of 64... For the masking strategy of the source image, we randomly choose the grid number N × N from 3 to 10. We set 75% chances to drop the grid with SIFT-matched features and set 50% chances for other regions. |
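
The masking strategy quoted in the Experiment Setup row can be illustrated with a small sketch. This is not the authors' code (the table notes that none has been released); it is one possible reading of the description, assuming the SIFT-matched keypoints are already available as pixel coordinates. The function name `sample_source_mask`, its parameters, and the choice to flag a grid cell as "matched" when it contains at least one keypoint are all assumptions made for illustration.

```python
import numpy as np


def sample_source_mask(height, width, matched_xy, rng,
                       n_min=3, n_max=10, p_matched=0.75, p_other=0.5):
    """Sample a grid-based drop mask for a source image.

    matched_xy: iterable of (x, y) pixel coordinates of SIFT-matched
    keypoints (assumed to be computed elsewhere).
    Returns (mask, n): a boolean (height, width) array that is True
    where pixels are dropped, and the sampled grid size n.
    """
    # Pick the grid size N uniformly from [n_min, n_max], inclusive.
    n = int(rng.integers(n_min, n_max + 1))

    # Integer pixel boundaries of the N x N grid cells.
    ys = np.linspace(0, height, n + 1).astype(int)
    xs = np.linspace(0, width, n + 1).astype(int)

    # Flag cells that contain at least one matched keypoint (assumption).
    has_match = np.zeros((n, n), dtype=bool)
    for x, y in matched_xy:
        row = min(int(np.searchsorted(ys, y, side="right")) - 1, n - 1)
        col = min(int(np.searchsorted(xs, x, side="right")) - 1, n - 1)
        has_match[max(row, 0), max(col, 0)] = True

    # Drop matched cells with p_matched, all other cells with p_other.
    drop_prob = np.where(has_match, p_matched, p_other)
    drop = rng.random((n, n)) < drop_prob

    # Expand the per-cell decision to a per-pixel mask.
    mask = np.zeros((height, width), dtype=bool)
    for r in range(n):
        for c in range(n):
            if drop[r, c]:
                mask[ys[r]:ys[r + 1], xs[c]:xs[c + 1]] = True
    return mask, n
```

Calling `sample_source_mask(512, 512, keypoints, np.random.default_rng())` returns a fresh mask at the 512 × 512 resolution quoted above; the higher drop probability for matched cells means regions with reference correspondences are hidden more often than the rest of the image.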