Enhancing Consistency-Based Image Generation via Adversarialy-Trained Classification and Energy-Based Discrimination

Authors: Shelly Golan, Roy Ganz, Michael Elad

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we present an evaluation of the proposed boosting method. We begin by discussing the results obtained by our method compared to BIGROC [6]. Subsequently, we conduct an ablation study to further analyze various components of the model, as described in Section 3. [...] Table 1 summarizes our results on the ImageNet 64x64 dataset [3].
Researcher Affiliation | Academia | Shelly Golan, Technion, shelly.golan@cs.technion.ac.il; Roy Ganz, Technion, ganz@campus.technion.ac.il; Michael Elad, Technion, elad@cs.technion.ac.il
Pseudocode | Yes | Algorithm 1: Targeted Projected Gradient Descent (PGD) [...] Algorithm 2: Boosting Images via PGD Guided by a Joint Classifier-Discriminator [...] Algorithm 3: Boosting Images via SGLD Guided by a Joint Classifier-Discriminator
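The paper's Algorithm 1 follows the standard targeted PGD recipe: repeated normalized gradient steps that decrease a target loss, followed by projection back onto an ε-ball around the starting point. A minimal NumPy sketch of that generic procedure (the function name, the L2 threat model, and the normalized-gradient step are illustrative assumptions, not the paper's exact formulation):

```python
import numpy as np

def targeted_pgd(x, grad_fn, epsilon, step_size, n_steps):
    """Generic targeted PGD sketch (assumed L2 threat model).

    Starting from x, take normalized gradient-descent steps on a target
    loss (targeted attacks minimize the loss of the desired class), then
    project the perturbation back into an L2 ball of radius epsilon.
    """
    x = np.asarray(x, dtype=float)
    x_adv = x.copy()
    for _ in range(n_steps):
        g = grad_fn(x_adv)                       # gradient of target loss at x_adv
        g = g / (np.linalg.norm(g) + 1e-12)      # normalize the step direction
        x_adv = x_adv - step_size * g            # descend toward the target
        delta = x_adv - x
        d_norm = np.linalg.norm(delta)
        if d_norm > epsilon:                     # L2 projection onto the eps-ball
            x_adv = x + delta * (epsilon / d_norm)
    return x_adv
```

For instance, with a toy quadratic target loss, the output moves toward the target while the perturbation norm stays bounded by ε.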
Open Source Code | No | We intend to share all our code through GitHub after the review process.
Open Datasets | Yes | achieve improved FID scores on the ImageNet 64x64 dataset for both Consistency-Training and Consistency-Distillation techniques.
Dataset Splits | No | The paper does not explicitly state the training, validation, and test dataset splits needed to reproduce the experiment, such as specific percentages or sample counts for each split. It mentions training on ImageNet 64x64 and evaluating on a test set, but lacks details on how the data was partitioned.
Hardware Specification | Yes | The model is trained on four NVIDIA GeForce RTX 3090 GPUs.
Software Dependencies | No | The paper mentions specific models such as 'ResNet-50 (RN50)' and 'Wide-ResNet-50-2 (Wide-RN)', but it does not provide version numbers for software dependencies such as deep learning frameworks (e.g., PyTorch, TensorFlow), programming languages (e.g., Python), or other libraries used in the experiments.
Experiment Setup | Yes | During training, we use the pre-trained robust ResNet-50 (RN50) model [21] as a backbone, with a randomly initialized classification head. We configure the Projected Gradient Descent (PGD) attack for fake images with an ε of 10.0 and a step size of 0.1, increasing the number of PGD steps every 1,000 training iterations to enhance adversarial robustness over time. For real images, the PGD attack uses an ε of 1.0, a step size of 0.25, and 4 steps. [...] In the sampling process, we apply the loss function outlined in Equation 12 with a step size of 0.1. The number of steps is adjusted according to the generative model we enhance; further technical details can be found in Appendix D. Table 5 in Appendix D lists the specific PGD steps and step sizes for various generative methods.
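The fake-image attack schedule quoted above (step count increased every 1,000 training iterations, with ε = 10.0 and step size 0.1) can be sketched as a small helper. The `base_steps` and `max_steps` values are illustrative assumptions, since the report does not quote the starting or maximum step counts; only the fixed ε, step-size, and step values come from the quoted setup:

```python
def fake_pgd_steps(iteration, base_steps=1, period=1000, max_steps=20):
    """Number of PGD steps for fake images at a given training iteration.

    The count grows by one every `period` iterations, mirroring the
    schedule described in the setup; base_steps and max_steps are
    assumed values, not reported in the paper.
    """
    return min(base_steps + iteration // period, max_steps)

# Fixed attack hyperparameters quoted in the experiment setup.
FAKE_ATTACK = {"epsilon": 10.0, "step_size": 0.1}
REAL_ATTACK = {"epsilon": 1.0, "step_size": 0.25, "steps": 4}
```

Under these assumptions, the fake-image attack starts at one step and reaches two steps at iteration 1,000, while the real-image attack stays fixed at four steps throughout training.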