Few-Shot Diffusion Models Escape the Curse of Dimensionality

Authors: Ruofeng Yang, Bo Jiang, Cheng Chen, Ruinan Jin, Baoxiang Wang, Shuai Li

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The results of the real-world experiments also show that the models obtained by only fine-tuning the encoder and decoder specific to the target distribution can produce novel images with the target feature, which supports our theoretical results.
Researcher Affiliation | Academia | 1 John Hopcroft Center for Computer Science, Shanghai Jiao Tong University; 2 East China Normal University; 3 The Chinese University of Hong Kong, Shenzhen; 4 Vector Institute. {wanshuiyin, bjiang, shuaili8}@sjtu.edu.cn, chchen@sei.ecnu.edu.cn, {jinruinan, bxiangwang}@cuhk.edu.cn
Pseudocode | No | The paper describes algorithms and processes textually and mathematically but does not include a dedicated pseudocode block or algorithm listing.
Open Source Code | No | Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [No] Justification: As a theoretical work, we simply train and fine-tune a diffusion model on the datasets in Appendix E. All details are shown in Appendix E.
Open Datasets | Yes | For the source data, we construct a large dataset (6400 images) with different hairstyles (without the bald feature). For the target data, we choose the bald feature and select 10 images with this feature to constitute the target dataset, which is much smaller than the source dataset (Figure 1 (a)).
Dataset Splits | No | The paper defines source and target datasets for pre-training and fine-tuning, but does not specify explicit train/validation/test splits with percentages or counts for performance evaluation.
Hardware Specification | Yes | The above experiments are conducted on a GeForce RTX 4090.
Software Dependencies | No | The paper mentions using a "U-Net network" and the "AdamW optimizer" but does not specify version numbers for any software libraries, frameworks, or dependencies used for implementation.
Experiment Setup | Yes | We train the neural network using the AdamW optimizer with learning rate 0.0001. For the pre-training phase, we train the model for 200 epochs with batch size 20. It takes 5 hours to obtain a pre-trained diffusion model. For the fine-tuning phase, we fine-tune the pre-trained model for 400 epochs with batch size 2.
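
The setup quoted in the last row maps onto a short two-phase training script. Below is a minimal, hypothetical PyTorch sketch of that configuration (AdamW with learning rate 1e-4; 200 pre-training epochs at batch size 20; 400 fine-tuning epochs at batch size 2, updating only the encoder and decoder parameters, per the paper's fine-tuning description). The model class, its diffusion_loss method, and the dataset objects are placeholders for illustration, not the authors' released code.

import torch
from torch.utils.data import DataLoader

def run_phase(model, dataset, epochs, batch_size, lr=1e-4, params=None):
    # Train with AdamW at the quoted learning rate; `params` restricts which
    # parameters are updated (all of them by default).
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.AdamW(model.parameters() if params is None else params, lr=lr)
    for _ in range(epochs):
        for batch in loader:
            loss = model.diffusion_loss(batch)  # placeholder denoising/score-matching loss
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

# Phase 1: pre-train on the 6400-image source dataset (200 epochs, batch size 20).
# run_phase(model, source_dataset, epochs=200, batch_size=20)

# Phase 2: fine-tune only the encoder/decoder on the 10-image target dataset
# (400 epochs, batch size 2).
# finetune_params = list(model.encoder.parameters()) + list(model.decoder.parameters())
# run_phase(model, target_dataset, epochs=400, batch_size=2, params=finetune_params)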