Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Enhancing Consistency-Based Image Generation via Adversarialy-Trained Classification and Energy-Based Discrimination

Authors: Shelly Golan, Roy Ganz, Michael Elad

NeurIPS 2024 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental In this section, we present an evaluation of the proposed boosting method. We begin by discussing the results obtained by our method compared to BIGROC [6]. Subsequently, we conduct an ablation study to further analyze various components of the model, as described in Section 3. [...] Table 1 summarizes our results on the ImageNet 64×64 dataset [3].
Researcher Affiliation Academia Shelly Golan, Technion, EMAIL; Roy Ganz, Technion, EMAIL; Michael Elad, Technion, EMAIL
Pseudocode Yes Algorithm 1: Targeted Projected Gradient Descent (PGD) [...] Algorithm 2: Boosting Images via PGD Guided by a Joint Classifier-Discriminator [...] Algorithm 3: Boosting Images via SGLD Guided by a Joint Classifier-Discriminator
Open Source Code No We intend to share all our code through GitHub after the review process.
Open Datasets Yes achieve improved FID scores on the ImageNet 64×64 dataset for both Consistency-Training and Consistency-Distillation techniques.
Dataset Splits No The paper does not explicitly state training, validation, and test dataset splits needed to reproduce the experiment, such as specific percentages or sample counts for each split. It mentions training on ImageNet 64x64 and evaluating on a test set, but lacks details on how the data was partitioned.
Hardware Specification Yes The model is trained on four NVIDIA GeForce RTX 3090 GPUs.
Software Dependencies No The paper mentions using specific models like 'ResNet-50 (RN50)' and 'Wide-ResNet-50-2 (Wide-RN)', but it does not provide specific version numbers for software dependencies such as deep learning frameworks (e.g., PyTorch, TensorFlow), programming languages (e.g., Python), or other libraries used for the experiments.
Experiment Setup Yes During training, we use the pre-trained robust ResNet-50 (RN50) model [21] as a backbone, with a randomly initialized classification head. We configure the parameters of the Projected Gradient Descent (PGD) attack for fake images with an ε of 10.0 and a step size of 0.1, increasing the number of PGD steps every 1,000 training iterations to enhance adversarial robustness over time. For real images, the PGD attack is set with an ε of 1.0, a step size of 0.25, and 4 steps. [...] In the sampling process, we apply the loss function outlined in Equation 12 and the step size is set as 0.1. The number of steps is adjusted according to the generative model we enhance; further technical details can be found in Appendix D. Table 5 in Appendix D provides specific PGD steps and step sizes for various generative methods.
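The setup above revolves around targeted PGD with a perturbation budget ε, a fixed step size, and a step count. The sketch below illustrates what such a loop typically looks like in PyTorch; it is not the authors' released code (which was not public at review time). The L2-normalized gradient step, the projection norm, and the helper name `targeted_pgd` are all assumptions for illustration.

```python
import torch
import torch.nn as nn

def targeted_pgd(model, images, targets, eps, step_size, steps):
    """Illustrative targeted PGD (assumed L2 geometry, not the paper's exact code).

    Perturbs `images` to reduce cross-entropy toward `targets`, keeping the
    perturbation inside an L2 ball of radius `eps` around the originals.
    """
    originals = images.detach()
    adv = originals.clone()
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        adv = adv.detach().requires_grad_(True)
        loss = loss_fn(model(adv), targets)
        grad, = torch.autograd.grad(loss, adv)
        # Step *against* the gradient: targeted PGD minimizes the target-class loss.
        grad_norm = grad.flatten(1).norm(dim=1).view(-1, 1, 1, 1) + 1e-12
        adv = adv - step_size * grad / grad_norm
        # Project the accumulated perturbation back into the eps-ball.
        delta = adv - originals
        delta_norm = delta.flatten(1).norm(dim=1).view(-1, 1, 1, 1) + 1e-12
        delta = delta * (eps / delta_norm).clamp(max=1.0)
        adv = originals + delta
    return adv.detach()

# Toy usage with a stand-in classifier, mirroring the paper's real-image
# setting (eps=1.0, step size 0.25, 4 steps) on 64x64 inputs:
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 10))
x = torch.randn(2, 3, 64, 64)
y = torch.tensor([3, 7])
boosted = targeted_pgd(model, x, y, eps=1.0, step_size=0.25, steps=4)
```

In the paper's pipeline the gradient would come from the joint classifier-discriminator loss (their Equation 12) rather than plain cross-entropy, and the step schedule for fake images grows every 1,000 iterations; both details are elided here for brevity.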