Few-Shot Diffusion Models Escape the Curse of Dimensionality
Authors: Ruofeng Yang, Bo Jiang, Cheng Chen, Ruinan Jin, Baoxiang Wang, Shuai Li
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The results of the real-world experiments also show that the models obtained by fine-tuning only the encoder and decoder specific to the target distribution can produce novel images with the target feature, which supports our theoretical results. |
| Researcher Affiliation | Academia | 1 John Hopcroft Center for Computer Science, Shanghai Jiao Tong University 2 East China Normal University 3 The Chinese University of Hong Kong, Shenzhen 4 Vector Institute {wanshuiyin, bjiang, shuaili8}@sjtu.edu.cn, chchen@sei.ecnu.edu.cn, {jinruinan,bxiangwang}@cuhk.edu.cn |
| Pseudocode | No | The paper describes algorithms and processes textually and mathematically but does not include a dedicated pseudocode block or algorithm listing. |
| Open Source Code | No | Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [No] Justification: As a theoretical work, we simply train and fine-tune a diffusion model on the datasets in Appendix E. All details are shown in Appendix E. |
| Open Datasets | Yes | For the source data, we construct a large dataset (6400 images) with different hairstyles (without the bald feature). For the target data, we choose the bald feature and select 10 images with this feature to constitute the target dataset, which is much smaller than the source dataset (Figure 1 (a)). |
| Dataset Splits | No | The paper defines source and target datasets for pre-training and fine-tuning, but does not specify explicit train/validation/test splits with percentages or counts for performance evaluation. |
| Hardware Specification | Yes | The above experiments are conducted on a GeForce RTX 4090. |
| Software Dependencies | No | The paper mentions using a "U-net network" and "AdamW optimizer" but does not specify version numbers for any software libraries, frameworks, or dependencies used for implementation. |
| Experiment Setup | Yes | We train the neural network using the AdamW optimizer with learning rate 0.0001. For the pre-training phase, we train the model for 200 epochs with batch size 20. It takes 5 hours to obtain a pre-trained diffusion model. For the fine-tuning phase, we fine-tune the pre-trained model for 400 epochs with batch size 2. |
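
Taken together, the Research Type, Open Datasets, and Experiment Setup rows pin down a concrete schedule: AdamW at learning rate 1e-4, 200 pre-training epochs at batch size 20 on the 6400-image source set, then 400 fine-tuning epochs at batch size 2 on the 10-image target set, with only the encoder and decoder updated. The following is a minimal PyTorch sketch of that schedule under those assumptions; `TinyUNet`, the random tensors standing in for the image sets, and the toy denoising loss are illustrative placeholders, not the authors' released code.

```python
# Minimal PyTorch sketch of the schedule quoted above. The hyperparameters
# (AdamW, lr 1e-4, 200 pre-training epochs at batch size 20, 400 fine-tuning
# epochs at batch size 2, encoder/decoder-only fine-tuning) come from the
# table; everything else is an illustrative stand-in.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


class TinyUNet(nn.Module):
    """Stand-in for the paper's U-net: an encoder, a shared body, a decoder."""

    def __init__(self, dim: int = 64):
        super().__init__()
        self.encoder = nn.Conv2d(3, dim, 3, padding=1)
        self.body = nn.Conv2d(dim, dim, 3, padding=1)
        self.decoder = nn.Conv2d(dim, 3, 3, padding=1)

    def forward(self, x):
        h = torch.relu(self.encoder(x))
        h = torch.relu(self.body(h))
        return self.decoder(h)


def train(model: nn.Module, loader: DataLoader, epochs: int) -> None:
    # Only parameters left trainable (requires_grad=True) are optimized.
    opt = torch.optim.AdamW(
        [p for p in model.parameters() if p.requires_grad], lr=1e-4
    )
    for _ in range(epochs):
        for (x,) in loader:
            noise = torch.randn_like(x)
            # Toy denoising objective standing in for the diffusion loss.
            loss = nn.functional.mse_loss(model(x + noise), noise)
            opt.zero_grad()
            loss.backward()
            opt.step()


model = TinyUNet()
# Random tensors stand in for the 6400-image source set and 10-image target set.
source = TensorDataset(torch.randn(6400, 3, 16, 16))
target = TensorDataset(torch.randn(10, 3, 16, 16))

# Pre-training phase: 200 epochs, batch size 20, full model trained.
train(model, DataLoader(source, batch_size=20, shuffle=True), epochs=200)

# Fine-tuning phase: freeze the shared body so only the encoder and decoder
# adapt to the target distribution, then train 400 epochs at batch size 2.
for p in model.body.parameters():
    p.requires_grad = False
train(model, DataLoader(target, batch_size=2, shuffle=True), epochs=400)
```

The freezing step mirrors the paper's claim that fine-tuning only the distribution-specific encoder and decoder suffices to produce images with the target feature; how the encoder/decoder split is drawn inside the actual U-net is not specified in the excerpts above.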