MIDMs: Matching Interleaved Diffusion Models for Exemplar-Based Image Translation
Authors: Junyoung Seo, Gyuseong Lee, Seokju Cho, Jiyoung Lee, Seungryong Kim
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that our MIDMs generate more plausible images than state-of-the-art methods. Experiments demonstrate that our MIDMs achieve competitive performance on CelebA-HQ (Liu et al. 2015) and DeepFashion (Liu et al. 2016). In particular, user study and qualitative comparison results demonstrate that our method can provide a better realistic appearance while capturing the exemplar's details. An extensive ablation study shows the effectiveness of each component in MIDMs. |
| Researcher Affiliation | Collaboration | 1 Korea University, Seoul, Korea 2 NAVER AI Lab, Korea |
| Pseudocode | Yes | We conduct all our experiments on an RTX 3090 GPU, and we provide more implementation details and pseudocode in the Appendix. |
| Open Source Code | No | The paper states 'we provide more implementation details and pseudo code in the Appendix' but does not provide a direct link to a code repository or explicitly state that source code for the methodology is openly available. |
| Open Datasets | Yes | Following the previous literature (Zhang et al. 2020; Zhan et al. 2021b,a), we conduct experiments on the CelebA-HQ (Liu et al. 2015) and DeepFashion (Liu et al. 2016) datasets. The CelebA-HQ (Liu et al. 2015) dataset provides 30,000 high-resolution human face images at 1024 × 1024 resolution... The DeepFashion (Liu et al. 2016) dataset consists of 52,712 full-length person images in fashion clothes... We also use LSUN-Churches (Yu et al. 2015) for the segmentation map-to-photo experiments. |
| Dataset Splits | No | The paper mentions using specific datasets and following previous literature for experiments but does not explicitly provide percentages or counts for training, validation, or test splits. It only states total dataset sizes. |
| Hardware Specification | Yes | We conduct all our experiments on an RTX 3090 GPU |
| Software Dependencies | No | The paper mentions optimizers (AdamW), pretrained models (VGG-19), and tools (Canny edge detector, OpenPose, Swin-S) but does not provide specific version numbers for software dependencies such as Python, PyTorch, or TensorFlow libraries. |
| Experiment Setup | Yes | We use the AdamW optimizer (Loshchilov and Hutter 2017) with a learning rate of 3e-6 for the correspondence network and 1.5e-7 for the backbone network of the diffusion model. We use multi-step learning rate decay with γ = 0.3. |
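The reported schedule (multi-step decay with γ = 0.3) can be sketched in plain Python. This is a minimal illustration, not the authors' code: the milestone steps are hypothetical, since the paper does not specify them; only the learning rates (3e-6 and 1.5e-7) and γ = 0.3 come from the quoted setup.

```python
from bisect import bisect_right

def multistep_lr(step, base_lr, milestones, gamma=0.3):
    """Multi-step decay: the learning rate is multiplied by gamma
    each time training passes one of the milestone steps."""
    return base_lr * gamma ** bisect_right(sorted(milestones), step)

# Learning rates as reported in the paper; milestones are hypothetical.
corr_lr = 3e-6     # correspondence network
diff_lr = 1.5e-7   # diffusion-model backbone
milestones = [10_000, 20_000]

print(multistep_lr(0, corr_lr, milestones))       # base rate, no decay yet
print(multistep_lr(15_000, corr_lr, milestones))  # decayed once by gamma
```

In a PyTorch training loop the same behavior would typically come from `torch.optim.lr_scheduler.MultiStepLR` with `gamma=0.3`, applied on top of two `AdamW` parameter groups (one per sub-network).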