Feedback Efficient Online Fine-Tuning of Diffusion Models

Authors: Masatoshi Uehara, Yulai Zhao, Kevin Black, Ehsan Hajiramezanali, Gabriele Scalia, Nathaniel Lee Diamant, Alex M Tseng, Sergey Levine, Tommaso Biancalani

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We present a theoretical analysis providing a regret guarantee, as well as empirical validation across three domains: images, biological sequences, and molecules."
Researcher Affiliation | Collaboration | Genentech; Princeton University; University of California, Berkeley.
Pseudocode | Yes | Algorithm 1: SEIKO (optimiStic finE-tuning of dIffusion with KL cOnstraint); a hedged sketch of the corresponding objective appears after this table.
Open Source Code | No | The paper references public codebases for baselines (e.g., "The implementation is based on the public DDPO (Black et al., 2023) codebase."), but it does not provide an explicit statement of, or link to, open-source code for its own proposed method (SEIKO).
Open Datasets | Yes | GFP: "The original dataset size is 56086. ... We selected the top 33637 samples following Trabucco et al. (2022) and trained diffusion models and oracles using this selected data." ZINC: "The ZINC dataset for molecules is a large and freely accessible collection of chemical compounds used in drug discovery and computational chemistry research (Irwin and Shoichet, 2005)."
Dataset Splits | No | The paper mentions training models on datasets (e.g., "trained diffusion models and oracles using this selected data" for GFP) but does not specify any training/validation/test splits or percentages.
Hardware Specification | Yes | "In all our implementations, we utilize A100 GPUs."
Software Dependencies | No | The paper mentions using "ADAM as an optimizer" but does not specify version numbers for any software dependencies, libraries, or frameworks used in the experiments.
Experiment Setup | Yes | "In all our implementations, we utilize A100 GPUs. For the fine-tuning of diffusion models, we employ the specific set of hyperparameters outlined in Table 7." (Section D.3.3, Hyperparameters). Table 7 provides specific values for batch size, KL parameter, UCB parameter, number of bootstrap heads, step size, epochs, and learning rates; Table 8 provides the full hyperparameters for the image tasks.
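The pseudocode row and the Table 7 hyperparameter names (UCB parameter, number of bootstrap heads, KL parameter) together point at SEIKO's core update: an optimistic reward built from a bootstrapped ensemble, regularized by a KL term against the pretrained model. Below is a minimal sketch of that objective under those assumptions, not the authors' implementation; `BootstrappedReward`, `optimistic_kl_objective`, `ucb_coef`, and `kl_coef` are illustrative names, and the sample-level KL estimate stands in for the step-wise KL a diffusion fine-tuner would actually compute.

```python
import torch
import torch.nn as nn


class BootstrappedReward(nn.Module):
    """Illustrative ensemble of reward heads; the spread across heads
    acts as the epistemic-uncertainty bonus in the UCB reward."""

    def __init__(self, in_dim: int, num_heads: int = 4, hidden: int = 64):
        super().__init__()
        self.heads = nn.ModuleList(
            nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))
            for _ in range(num_heads)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Returns a (num_heads, batch) matrix of reward predictions.
        return torch.stack([head(x).squeeze(-1) for head in self.heads])


def optimistic_kl_objective(
    x: torch.Tensor,          # samples drawn from the current fine-tuned model
    logp_new: torch.Tensor,   # log-prob of x under the fine-tuned model
    logp_pre: torch.Tensor,   # log-prob of x under the pretrained model
    reward_model: BootstrappedReward,
    ucb_coef: float,          # "UCB parameter" of Table 7 (hypothetical name)
    kl_coef: float,           # "KL parameter" of Table 7 (hypothetical name)
) -> torch.Tensor:
    preds = reward_model(x)                                 # (num_heads, batch)
    ucb = preds.mean(dim=0) + ucb_coef * preds.std(dim=0)   # optimistic reward
    kl = logp_new - logp_pre  # single-sample Monte Carlo estimate of the KL
    return -(ucb - kl_coef * kl).mean()  # minimize the negated objective


if __name__ == "__main__":
    # Toy shapes only; real usage would plug in diffusion-model log-probs.
    x = torch.randn(8, 16)
    rm = BootstrappedReward(in_dim=16)
    loss = optimistic_kl_objective(
        x, torch.randn(8), torch.randn(8), rm, ucb_coef=0.1, kl_coef=0.01
    )
    loss.backward()
    print(float(loss))
```

A real run would replace the toy tensors with samples and per-sample log-probabilities from the fine-tuned and pretrained diffusion models, and would presumably refit the ensemble on newly collected feedback between outer iterations, per the paper's online fine-tuning setting.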