AutoGAN-Distiller: Searching to Compress Generative Adversarial Networks

Authors: Yonggan Fu, Wuyang Chen, Haotao Wang, Haoran Li, Yingyan Lin, Zhangyang Wang

ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate AGD in two representative GAN tasks: image translation and super resolution.
Researcher Affiliation | Academia | Rice University, Houston, Texas, USA; Texas A&M University, College Station, Texas, USA.
Pseudocode | Yes | Algorithm 1: The Proposed AutoGAN-Distiller Framework
Open Source Code | Yes | Our codes and pretrained models are available at: https://github.com/TAMU-VITA/AGD.
Open Datasets | Yes | We apply AGD on compressing CycleGAN (Zhu et al., 2017) and consider two datasets, horse2zebra (Zhu et al., 2017) and summer2winter (Zhu et al., 2017). We apply AGD on compressing ESRGAN (Wang et al., 2018a) on a combined dataset of DIV2K and Flickr2K (Timofte et al., 2017).
Dataset Splits | Yes | We split the training dataset into two halves: one for updating the supernet weights and the other for updating the architecture parameters.
Hardware Specification | Yes | For the efficiency aspect, we measure the model size and the inference FLOPs (floating-point operations). As both might not always be aligned with the hardware performance, we further measure the real-device inference latency using an NVIDIA GeForce RTX 2080 Ti (NVIDIA Inc.).
Software Dependencies | No | The paper discusses optimizers (SGD, Adam) and loss functions but does not specify software platforms (e.g., PyTorch, TensorFlow) or their version numbers, nor other ancillary software dependencies.
Experiment Setup | Yes | For AGD on CycleGAN, λ in Eq. 1 is 1×10⁻¹⁷, ω₁ and ω₂ in Eq. 2 are set to 1/4 and 3/4, and β₁, β₂, and β₃ in Eq. 3 are set to 1×10⁻², 1, and 5×10⁻⁸, respectively. We pretrain and search for 50 epochs, with batch size 2. We use an SGD optimizer with a momentum of 0.9 and an initial learning rate of 1×10⁻¹ for the weights, which linearly decays to 0 after 10 epochs, and an Adam optimizer with a constant 3×10⁻⁴ learning rate for the architecture parameters. We train the searched architecture from scratch for 400 epochs, with a batch size of 16 and an initial learning rate of 1×10⁻¹, which linearly decays to 0 after 100 epochs.
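
The search-phase recipe captured in the Dataset Splits and Experiment Setup rows can be sketched in code. The snippet below is a minimal illustration under assumptions, not the authors' implementation: PyTorch is assumed (the paper names no framework, per the Software Dependencies row); ToySupernet, its candidate convolutions, the dummy data, and the L1 loss are hypothetical stand-ins; only the two-way training split, batch size, epoch count, and optimizer settings come from the quoted text, and the exact shape of the linear learning-rate decay is also an assumption.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset, random_split


class ToySupernet(nn.Module):
    """Hypothetical supernet: candidate ops (supernet weights) mixed by architecture logits."""

    def __init__(self):
        super().__init__()
        # Candidate operators whose parameters play the role of the supernet weights.
        self.ops = nn.ModuleList(nn.Conv2d(3, 3, k, padding=k // 2) for k in (1, 3, 5))
        # Architecture parameters: mixing logits over the candidate operators.
        self.alphas = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        probs = torch.softmax(self.alphas, dim=0)
        return sum(p * op(x) for p, op in zip(probs, self.ops))


# Dummy tensors standing in for the real image-translation training pairs.
full_train = TensorDataset(torch.randn(64, 3, 32, 32), torch.randn(64, 3, 32, 32))

# Split the training set into two halves: one for the supernet weights,
# the other for the architecture parameters (as stated in the Dataset Splits row).
half = len(full_train) // 2
weight_set, arch_set = random_split(full_train, [half, len(full_train) - half])
weight_loader = DataLoader(weight_set, batch_size=2, shuffle=True)  # batch size 2
arch_loader = DataLoader(arch_set, batch_size=2, shuffle=True)

model = ToySupernet()
search_epochs = 50  # "pretrain and search for 50 epochs"

# SGD, momentum 0.9, initial lr 1e-1 for the supernet weights; the lr is held for
# 10 epochs and then decays linearly to 0 (the schedule shape is an assumption).
w_opt = torch.optim.SGD(model.ops.parameters(), lr=1e-1, momentum=0.9)
w_sched = torch.optim.lr_scheduler.LambdaLR(
    w_opt,
    lambda e: 1.0 if e < 10 else max(0.0, (search_epochs - e) / (search_epochs - 10)),
)
# Adam with a constant 3e-4 learning rate for the architecture parameters.
a_opt = torch.optim.Adam([model.alphas], lr=3e-4)

criterion = nn.L1Loss()  # placeholder; AGD's loss combines distillation and efficiency terms

for epoch in range(search_epochs):
    for (xw, yw), (xa, ya) in zip(weight_loader, arch_loader):
        # Update supernet weights on the first half of the data.
        w_opt.zero_grad()
        criterion(model(xw), yw).backward()
        w_opt.step()
        # Update architecture parameters on the second half.
        a_opt.zero_grad()
        criterion(model(xa), ya).backward()
        a_opt.step()
    w_sched.step()
```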