Generative Visual Prompt: Unifying Distributional Control of Pre-Trained Generative Models

Authors: Chen Henry Wu, Saman Motamed, Shaunak Srivastava, Fernando De la Torre

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our experiments demonstrate how PromptGen can efficiently sample from several unconditional generative models (e.g., StyleGAN2, StyleNeRF, diffusion autoencoder, NVAE) in a controlled and/or de-biased manner using various off-the-shelf models."
Researcher Affiliation | Academia | Chen Henry Wu, Saman Motamed, Shaunak Srivastava, Fernando De la Torre; Robotics Institute, Carnegie Mellon University, Pittsburgh, PA; {chenwu2,ftorre}@cs.cmu.edu, {saman.moatamed,shaunak1999}@gmail.com
Pseudocode | Yes | Algorithm 1 (Generative Visual Prompt, PromptGen) and Algorithm 2 (Approximating the Latent-Space EBM with an INN); see the sketch following this table.
Open Source Code | Yes | "The code is available at https://github.com/ChenWu98/Generative-Visual-Prompt."
Open Datasets | Yes | "Figure 1 demonstrates our main findings with StyleGAN2 trained on FFHQ [35]... StyleNeRF [21], diffusion autoencoder [57], and NVAE [75]... AFHQ-Cats [6]... Landscape-HQ [70]... BigGAN [4] on ImageNet [62]... classifier trained on FairFace 224² [32]... CelebA [47]"
Dataset Splits | No | The paper mentions using 'standard splits for public datasets' in Appendix B.8, but it does not give specific percentages, sample counts, or a detailed splitting methodology for the training, validation, and test sets needed to reproduce the data partitioning.
Hardware Specification | Yes | "We trained the INN with Adam [38] with a learning rate of 1e-4 and a batch size of 64 on NVIDIA RTX A4000 GPUs. For training the INN, it takes approximately 10 hours for 200k iterations with a single RTX A4000 GPU."
Software Dependencies | No | The paper mentions using Adam [38] as the optimizer but does not provide version numbers for the programming language, libraries, or other software dependencies.
Experiment Setup | Yes | "We trained the INN with Adam [38] with a learning rate of 1e-4 and a batch size of 64 on NVIDIA RTX A4000 GPUs. All hyperparameters, including learning rate, batch size, and network architecture, are listed in the code for reproducibility."
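
For context on how the quoted setup fits together, below is a minimal, hypothetical PyTorch sketch of what Algorithm 2 (approximating a latent-space EBM with an INN) could look like under the quoted hyperparameters (Adam, learning rate 1e-4, batch size 64, 200k iterations). The coupling-block architecture, the `energy_fn` placeholder, and the reverse-KL objective are assumptions for illustration, not the authors' exact implementation; the official repository holds the real architecture and hyperparameters.

```python
# Hypothetical sketch: train an INN so its pushforward distribution q
# approximates a latent-space EBM p(z) proportional to exp(-E(z)).
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """One invertible affine coupling block (RealNVP-style), an assumed design."""
    def __init__(self, dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim // 2, hidden), nn.ReLU(),
            nn.Linear(hidden, dim),  # predicts per-dimension scale and shift
        )

    def forward(self, x):
        x1, x2 = x.chunk(2, dim=1)
        log_s, t = self.net(x1).chunk(2, dim=1)
        log_s = torch.tanh(log_s)          # keep scales well-conditioned
        y2 = x2 * log_s.exp() + t
        log_det = log_s.sum(dim=1)         # log|det Jacobian| of this block
        return torch.cat([x1, y2], dim=1), log_det

def energy_fn(z):
    # Placeholder energy; in the paper the energy comes from off-the-shelf
    # models scoring G(z), depending on the chosen control task.
    return 0.5 * (z ** 2).sum(dim=1)

dim = 512                                  # e.g., a StyleGAN2 w-space size
blocks = nn.ModuleList([AffineCoupling(dim) for _ in range(8)])
opt = torch.optim.Adam(blocks.parameters(), lr=1e-4)  # quoted optimizer and lr

for step in range(200_000):                # quoted iteration count
    eps = torch.randn(64, dim)             # quoted batch size; N(0, I) base
    log_q = -0.5 * (eps ** 2).sum(dim=1)   # base log-density, up to a constant
    z = eps
    for block in blocks:
        z, log_det = block(z)
        log_q = log_q - log_det            # change of variables
        z = z.flip(1)                      # swap halves so all dims get updated
    # Reverse KL between the INN pushforward q and p ∝ exp(-E), up to log Z:
    loss = (log_q + energy_fn(z)).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The quoted timing (roughly 10 hours for 200k iterations on one RTX A4000) refers to a loop of this general shape, though the actual energy involves forward passes through the generator and the off-the-shelf scoring models rather than the toy quadratic used here.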