Unsupervised Image-to-Image Translation with Density Changing Regularization
Authors: Shaoan Xie, Qirong Ho, Kun Zhang
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Despite its simplicity, our method achieves the best performance on benchmark datasets and needs only 56–86% of the training time of the existing state-of-the-art method. We apply our method to different image translation tasks, and the superior performance across tasks demonstrates its effectiveness. We run all tasks once and report results of more runs in the supplementary material. Datasets: We follow the evaluation protocols in CUT [39] by running experiments on three benchmark datasets: label→city, cat→dog and horse→zebra. We further run an experiment on the selfie→anime dataset to fully verify the effectiveness of our method. Metrics: For the label→city task, we follow [39] and evaluate the generated photos with a pretrained segmentation model, DRN [49]. We compute mean average precision (mAP), pixel-wise accuracy (pAcc) and average class accuracy (cAcc); an illustrative metric sketch follows the table. |
| Researcher Affiliation | Academia | Shaoan Xie1, Qirong Ho2, and Kun Zhang1,2; 1Carnegie Mellon University, 2Mohamed bin Zayed University of Artificial Intelligence; shaoan@cmu.edu, qirong.ho@mbzuai.ac.ae, kunz1@cmu.edu |
| Pseudocode | No | The paper describes its methods using text and mathematical formulations but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | The training and evaluation code are available at https://github.com/Mid-Push/Decent. |
| Open Datasets | Yes | Datasets: We follow the evaluation protocols in CUT [39] by running experiments on three benchmark datasets: label→city, cat→dog and horse→zebra. We further run an experiment on the selfie→anime dataset to fully verify the effectiveness of our method. |
| Dataset Splits | No | The paper mentions training epochs (e.g., 'We run label→city and horse→zebra for 400 epochs, cat→dog and selfie→anime for 200 epochs.') and evaluation protocols, but it does not explicitly provide train/validation/test dataset splits, such as percentages or sample counts for each split. |
| Hardware Specification | Yes | We run all methods on NVIDIA Tesla V100-SXM2 GPU and report their training speed in table 2. |
| Software Dependencies | No | The paper mentions software components such as MAF [38], BNAF [10], NSF [11], a 9-block ResNet generator, a 3-layer PatchGAN discriminator, and LSGAN [33], but it does not give version numbers for them, which reproducible software dependencies would require. |
| Experiment Setup | Yes | The learning rate is 0.0002 with β1 = 0.5, β2 = 0.999. We set L = 5, the number of feature layers we use, following CUT [39]: we use five layers (the 0th, 4th, 8th, 12th and 16th) of the generator to extract patch representations, corresponding to receptive fields of sizes 1×1, 9×9, 15×15, 35×35 and 99×99. We also use 256 patches in each layer instead of all patches to save computation time and memory. We set λidt = 10 and λdensity = 0.01 across all tasks. For all datasets, we resize images to 256×256. We keep the learning rate constant for the first half of training and linearly decay it to 0 in the second half. We run label→city and horse→zebra for 400 epochs, cat→dog and selfie→anime for 200 epochs. (A hedged configuration sketch follows the table.) |
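
To make the Experiment Setup row concrete, here is a minimal PyTorch sketch of the optimizer, learning-rate schedule, and per-layer patch sampling it describes. It is a sketch under stated assumptions, not the released implementation: `generator`, `n_epochs`, and `sample_patch_features` are illustrative placeholders, not identifiers from the Decent repository.

```python
import torch
from torch import nn, optim

# Assumed placeholder values; not names from the released Decent code.
n_epochs = 400                  # label->city, horse->zebra; 200 for the other tasks
generator = nn.Conv2d(3, 3, 3)  # stand-in for the 9-block ResNet generator

# Adam with lr = 0.0002, beta1 = 0.5, beta2 = 0.999, as stated in the paper.
opt = optim.Adam(generator.parameters(), lr=2e-4, betas=(0.5, 0.999))

def lr_lambda(epoch: int) -> float:
    # Constant learning rate for the first half of training,
    # then linear decay to 0 over the second half.
    half = n_epochs // 2
    return 1.0 if epoch < half else 1.0 - (epoch - half) / (n_epochs - half)

scheduler = optim.lr_scheduler.LambdaLR(opt, lr_lambda=lr_lambda)

def sample_patch_features(feats: torch.Tensor, num_patches: int = 256) -> torch.Tensor:
    # feats: (B, C, H, W) feature map from one of the five chosen generator
    # layers (0th, 4th, 8th, 12th, 16th). Keep 256 random spatial locations
    # instead of all patches, as described in the setup.
    b, c, h, w = feats.shape
    flat = feats.flatten(2).permute(0, 2, 1)  # (B, H*W, C)
    idx = torch.randperm(h * w)[:num_patches]
    return flat[:, idx, :]                    # (B, num_patches, C)
```

Calling `scheduler.step()` once per epoch reproduces the CycleGAN-style "constant then linear decay" schedule the setup describes.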
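
For the label→city metrics in the Research Type row, the sketch below shows one common way to compute these scores from a segmentation confusion matrix produced by the pretrained DRN segmenter. The mAP line uses mean class-wise IoU, which is how this evaluation protocol is often implemented; the paper does not spell out its exact definition, so treat that line as an assumption. All names are ours.

```python
import numpy as np

def segmentation_scores(conf: np.ndarray):
    # conf[i, j] = number of pixels with ground-truth class i predicted as class j.
    conf = conf.astype(np.float64)
    tp = np.diag(conf)
    p_acc = tp.sum() / conf.sum()                          # pixel-wise accuracy (pAcc)
    c_acc = (tp / np.maximum(conf.sum(axis=1), 1)).mean()  # average class accuracy (cAcc)
    iou = tp / np.maximum(conf.sum(axis=1) + conf.sum(axis=0) - tp, 1)
    m_ap = iou.mean()  # mean class-wise IoU; often reported as mAP (assumption)
    return m_ap, p_acc, c_acc
```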