Enhancing Consistency-Based Image Generation via Adversarially-Trained Classification and Energy-Based Discrimination
Authors: Shelly Golan, Roy Ganz, Michael Elad
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present an evaluation of the proposed boosting method. We begin by discussing the results obtained by our method compared to BIGRoC [6]. Subsequently, we conduct an ablation study to further analyze various components of the model, as described in Section 3. [...] Table 1 summarizes our results on the ImageNet 64x64 dataset [3]. |
| Researcher Affiliation | Academia | Shelly Golan, Technion, shelly.golan@cs.technion.ac.il; Roy Ganz, Technion, ganz@campus.technion.ac.il; Michael Elad, Technion, elad@cs.technion.ac.il |
| Pseudocode | Yes | Algorithm 1: Targeted Projected Gradient Descent (PGD) [...] Algorithm 2: Boosting Images via PGD Guided by a Joint Classifier-Discriminator [...] Algorithm 3: Boosting Images via SGLD Guided by a Joint Classifier-Discriminator (a hedged sketch of the targeted PGD routine appears after the table) |
| Open Source Code | No | We intend to share all our code through GitHub after the review process. |
| Open Datasets | Yes | achieve improved FID scores on the ImageNet 64x64 dataset for both Consistency-Training and Consistency-Distillation techniques. |
| Dataset Splits | No | The paper does not explicitly state training, validation, and test dataset splits needed to reproduce the experiment, such as specific percentages or sample counts for each split. It mentions training on ImageNet 64x64 and evaluating on a test set, but lacks details on how the data was partitioned. |
| Hardware Specification | Yes | The model is trained on four NVIDIA GeForce RTX 3090 GPUs. |
| Software Dependencies | No | The paper mentions using specific models like 'ResNet-50 (RN50)' and 'Wide-ResNet-50-2 (Wide-RN)', but it does not provide specific version numbers for software dependencies such as deep learning frameworks (e.g., PyTorch, TensorFlow), programming languages (e.g., Python), or other libraries used for the experiments. |
| Experiment Setup | Yes | During training, we use the pre-trained robust ResNet-50 (RN50) model [21] as a backbone, with a randomly initialized classification head. We configure the parameters of the Projected Gradient Descent (PGD) attack for fake images with an ε of 10.0 and a step size of 0.1, increasing the number of PGD steps every 1,000 training iterations to enhance adversarial robustness over time. For real images, the PGD attack is set with an ε of 1.0, a step size of 0.25 and 4 steps. [...] In the sampling process, we apply the loss function outlined in Equation 12 and the step size is set as 0.1. The number of steps is adjusted according to the generative model we enhance; further technical details can be found in Appendix D. Table 5 in Appendix D provides specific PGD steps and step sizes for various generative methods. (See the sketch after the table for how these settings map onto a PGD call.) |
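
For concreteness, the following is a minimal sketch of the targeted PGD routine named in Algorithm 1, written against PyTorch. The L2 threat model is an assumption inferred from the quoted ε values (10.0 and 1.0), as is the use of cross-entropy as the attack loss; `targeted_pgd` and all argument names are illustrative, not the authors' code.

```python
import torch
import torch.nn.functional as F

def targeted_pgd(model, x, target, eps, step_size, n_steps):
    """Targeted PGD sketch (cf. the paper's Algorithm 1).

    Moves `x` toward the class `target` under `model`, keeping the total
    perturbation inside an L2 ball of radius `eps` around the original input.
    """
    x_orig = x.detach()
    x_adv = x_orig.clone()
    for _ in range(n_steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), target)
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            # Targeted step: descend the target-class loss (hence the minus).
            g_norm = grad.flatten(1).norm(dim=1).view(-1, 1, 1, 1) + 1e-12
            x_adv = x_adv - step_size * grad / g_norm
            # Project the accumulated perturbation back onto the eps-ball.
            delta = x_adv - x_orig
            d_norm = delta.flatten(1).norm(dim=1).view(-1, 1, 1, 1) + 1e-12
            x_adv = x_orig + delta * (eps / d_norm).clamp(max=1.0)
        x_adv = x_adv.detach()
    return x_adv
```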
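And a hypothetical call pattern matching the quoted training configuration. Only the ε, step-size, and step-count values come from the excerpt; `joint_model`, the image and label tensors, and the `pgd_steps(iteration)` schedule helper are placeholders for pieces the excerpt does not specify.

```python
# Fake (generated) images: eps = 10.0, step size = 0.1; the number of PGD
# steps grows every 1,000 training iterations per the quoted setup.
fake_adv = targeted_pgd(joint_model, fake_images, fake_labels,
                        eps=10.0, step_size=0.1, n_steps=pgd_steps(iteration))

# Real images: eps = 1.0, step size = 0.25, 4 steps.
real_adv = targeted_pgd(joint_model, real_images, real_labels,
                        eps=1.0, step_size=0.25, n_steps=4)

# At sampling time the paper applies its Equation 12 loss (not reproduced in
# this excerpt) with step size 0.1; per-method step counts are listed in the
# paper's Table 5 (Appendix D).
```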