Distilling GANs with Style-Mixed Triplets for X2I Translation with Limited Data

Authors: Yaxing Wang, Joost van de Weijer, Lu Yu, Shangling Jui

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experimental results in a number of image generation tasks (i.e., image-to-image, semantic segmentation-to-image, text-to-image and audio-to-image) demonstrate qualitatively and quantitatively that our method successfully transfers knowledge to the synthetic image generation modules, resulting in more realistic images than previous methods as confirmed by a significant drop in the FID.
Researcher Affiliation | Collaboration | Yaxing Wang1,2, Joost van de Weijer2, Lu Yu3, Shangling Jui4. 1 College of Computer Science, Nankai University, China; 2 Computer Vision Center, Universitat Autònoma de Barcelona, Spain; 3 School of Computer Science and Engineering, Tianjin University of Technology, China; 4 Huawei Kirin Solution, China
Pseudocode | No | The paper describes its methods using prose, diagrams (Figure 2), and mathematical equations, but does not include any explicitly labeled "Pseudocode" or "Algorithm" blocks.
Open Source Code | Yes | Code is available in https://github.com/yaxingwang/KDIT.
Open Datasets | Yes | We conduct multi-class I2I translation on three datasets: Animal faces (Liu et al., 2019), Birds (Van Horn et al., 2015) and Foods (Kawano & Yanai, 2014). We evaluate the proposed method on the CUB bird dataset (Welinder et al., 2010); Oxford-102 (Nilsback & Zisserman, 2008); the CelebAMask-HQ (Lee et al., 2020a) dataset.
Dataset Splits | Yes | Text-to-image... Here we use 10 images per class for training, and verify our method on the test dataset. Audio-to-image... 82 categories and 10 images per category are selected for training, and 20 categories and 1,155 images for test. Semantic segmentation-to-image... We randomly select 500 pairs of data from the train set for training, and 2,000 pairs for testing. In cat2dog-200, the training set is composed of 200 images (100 images per class) and the test set has 200 images (100 images per class). In AFHQ-500, the training set is composed of 500 images (100 images per class) and the test set has 1,500 images (500 images per class). (A hypothetical split-construction sketch appears after the table.)
Hardware Specification | Yes | We perform the knowledge distillation on one GPU (Quadro RTX 6000) with 24 GB VRAM.
Software Dependencies | No | The proposed method is implemented in PyTorch (Paszke et al., 2017) and uses Adam (Kingma & Ba, 2014). While this indicates the software used, no specific version number is given for PyTorch (or any other dependency), only citations to the original papers.
Experiment Setup | Yes | We optimize the model using Adam (Kingma & Ba, 2014) with a batch size of 16. The learning rates of the generator and the discriminator are set to 0.0001 and 0.0004 with exponential decay rates of (β1, β2) = (0.0, 0.9). The model is trained for 300 epochs for knowledge distillation. In Eq. 6 both α_l and β are identical: for features whose dimension is less than 128 we set them to 0.1, otherwise to 0.01. In Eq. 8, λ_adv and λ_kdl are 1, and λ_srl is 0.1. (A minimal optimizer-setup sketch also follows the table.)
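
The cat2dog-200 and AFHQ-500 splits quoted under Dataset Splits are plain per-class subsets. The sketch below is a minimal, hypothetical way to build such a split; the directory layout, file extension, and the make_split helper are assumptions, not the authors' code.

    # Hypothetical sketch: build a cat2dog-200-style split
    # (100 training / 100 test images per class). Paths and the
    # make_split helper are assumptions, not the paper's code.
    import random
    from pathlib import Path

    def make_split(class_dir: Path, n_train: int = 100, n_test: int = 100, seed: int = 0):
        """Return (train, test) lists of image paths for one class."""
        files = sorted(class_dir.glob("*.jpg"))
        random.Random(seed).shuffle(files)
        return files[:n_train], files[n_train:n_train + n_test]

    train_files, test_files = [], []
    for cls in ("cat", "dog"):                      # assumed class folders
        tr, te = make_split(Path("cat2dog") / cls)  # hypothetical dataset root
        train_files += tr
        test_files += te
    # len(train_files) == 200 and len(test_files) == 200, matching cat2dog-200.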
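
As a reading aid for the Experiment Setup row, here is a minimal PyTorch sketch of the reported optimizer configuration (Adam, generator lr 1e-4, discriminator lr 4e-4, betas (0.0, 0.9), batch size 16, 300 epochs) and the Eq. 8 loss weights. The placeholder networks and the loss-term names l_adv, l_kdl, l_srl are assumptions; the actual loss definitions are in the paper.

    # Minimal sketch of the reported optimizer settings; the tiny placeholder
    # networks stand in for the paper's generator and discriminator.
    import torch
    from torch import nn

    generator = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1))      # placeholder
    discriminator = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1))  # placeholder

    # Adam with the quoted learning rates; the "exponential decay rates
    # (β1, β2) = (0.0, 0.9)" map onto Adam's betas argument.
    opt_g = torch.optim.Adam(generator.parameters(), lr=1e-4, betas=(0.0, 0.9))
    opt_d = torch.optim.Adam(discriminator.parameters(), lr=4e-4, betas=(0.0, 0.9))

    batch_size = 16
    num_epochs = 300  # knowledge-distillation training length quoted above

    # Eq. 8 loss weights as quoted: λ_adv = λ_kdl = 1, λ_srl = 0.1.
    lambda_adv, lambda_kdl, lambda_srl = 1.0, 1.0, 0.1

    def total_loss(l_adv: torch.Tensor, l_kdl: torch.Tensor, l_srl: torch.Tensor) -> torch.Tensor:
        """Weighted sum of the three Eq. 8 terms; the terms themselves are
        computed elsewhere and are not reproduced here."""
        return lambda_adv * l_adv + lambda_kdl * l_kdl + lambda_srl * l_srl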