Sparsity Aware Normalization for GANs

Authors: Idan Kligvasser, Tomer Michaeli

AAAI 2021, pp. 8181-8190

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We illustrate the effectiveness of our method through extensive experiments with a variety of network architectures. As we show, sparsity is particularly dominant in critics used for image-to-image translation settings. In these cases our approach improves upon existing methods, in less training epochs and with smaller capacity networks, while requiring practically no computational overhead."
Researcher Affiliation | Academia | Idan Kligvasser, Tomer Michaeli; Technion - Israel Institute of Technology, Haifa, Israel; {kligvasser@campus, tomer.m@ee}.technion.ac.il
Pseudocode | No | No pseudocode or algorithm blocks were found in the paper.
Open Source Code | No | No statement explicitly providing open-source code for the described methodology was found.
Open Datasets | Yes | "We start by performing image generation experiments on the CIFAR-10 (Krizhevsky and Hinton 2009) and STL-10 (Coates, Ng, and Lee 2011) datasets." "We use 512×256 images from the Cityscapes dataset (Cordts et al. 2016)." "We train our network using the 800 DIV2K training images (Agustsson and Timofte 2017), enriched by random cropping and horizontal flipping." (A hedged sketch of this augmentation appears after the table.)
Dataset Splits | No | The paper mentions training and testing datasets but does not explicitly detail validation splits. For example, it states: "We train our network using the 800 DIV2K training images (Agustsson and Timofte 2017)..."
Hardware Specification | No | No specific hardware details (e.g., GPU models, CPU types, memory specifications) used for running the experiments are mentioned.
Software Dependencies | No | The paper mentions specific optimizers and networks (e.g., the "Adam optimizer (Kingma and Ba 2015)" and the "VGG network (Simonyan and Zisserman 2014)") but does not provide software environment or library version numbers (e.g., Python, PyTorch/TensorFlow versions).
Experiment Setup | Yes | "We train all networks for 200 epochs with batches of 64 using the Adam optimizer (Kingma and Ba 2015). We use a learning rate of 2×10⁻⁴ and momentum parameters β1 = 0.5 and β2 = 0.9." "All hyper-parameters are kept as in (Park et al. 2019), except for Adam's first momentum parameter, which we set to β1 = 0.5. We use a batch size of 32 for 400K equal discriminator and generator updates. The learning rate is initialized to 2×10⁻⁴ and is decreased by a factor of 2 at 12.5%, 25%, 50% and 75% of the total number of iterations." (A hedged sketch of this optimizer and schedule follows below.)
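
The Experiment Setup row above fully specifies the second optimization schedule, so the following is a minimal sketch of it, assuming PyTorch (the framework is not confirmed by the paper): Adam with learning rate 2×10⁻⁴, β1 = 0.5, β2 = 0.9, and a halving of the learning rate at 12.5%, 25%, 50% and 75% of the 400K iterations. The placeholder model and variable names are assumptions for illustration only.

    # Hedged sketch of the quoted optimizer and learning-rate schedule, assuming PyTorch.
    # The model below is a placeholder; only the hyper-parameters come from the paper.
    import torch

    model = torch.nn.Linear(10, 10)  # stand-in for the actual generator/critic network

    total_iters = 400_000
    optimizer = torch.optim.Adam(
        model.parameters(),
        lr=2e-4,            # learning rate of 2x10^-4
        betas=(0.5, 0.9),   # beta1 = 0.5, beta2 = 0.9
    )

    # Decrease the learning rate by a factor of 2 at 12.5%, 25%, 50% and 75% of training.
    milestones = [int(total_iters * f) for f in (0.125, 0.25, 0.5, 0.75)]
    scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=milestones, gamma=0.5)

Calling scheduler.step() once per discriminator/generator update keeps the milestones aligned with iteration counts rather than with epochs.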
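
The Open Datasets row quotes the DIV2K augmentation (random cropping and horizontal flipping). A minimal sketch of such a pipeline, assuming torchvision, is given below; the 128×128 patch size and the way the images are loaded are illustrative assumptions, not details taken from the paper.

    # Hedged sketch: random cropping + horizontal flipping for the 800 DIV2K training images.
    # Assumes torchvision; the patch size is an illustrative choice only.
    from torchvision import transforms

    train_transform = transforms.Compose([
        transforms.RandomCrop(128),          # assumed patch size (not stated in this excerpt)
        transforms.RandomHorizontalFlip(),   # horizontal flipping, as quoted above
        transforms.ToTensor(),
    ])

    # train_transform would then be attached to whatever Dataset object
    # enumerates the DIV2K high-resolution training image files.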