Diffusion Tuning: Transferring Diffusion Models via Chain of Forgetting
Authors: Jincheng Zhong, Xingzhuo Guo, Jiaxiang Dong, Mingsheng Long
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct comprehensive experiments to evaluate Diff-Tuning, including the transfer of pre-trained Diffusion Transformer models to eight downstream generation tasks and the adaptation of Stable Diffusion to five control conditions with ControlNet. |
| Researcher Affiliation | Academia | Jincheng Zhong, Xingzhuo Guo, Jiaxiang Dong, Mingsheng Long School of Software, BNRist, Tsinghua University, China {zjc22,gxz23,djx20}@mails.tsinghua.edu.cn, mingsheng@tsinghua.edu.cn |
| Pseudocode | Yes | Algorithm 1: Pseudo-code of Diff-Tuning (a hedged sketch of one training step appears below this table). |
| Open Source Code | Yes | Code is available at this repository: https://github.com/thuml/Diffusion-Tuning. |
| Open Datasets | Yes | Class-conditioned generation is a fundamental application of diffusion models. To fully evaluate transfer efficiency, we adhere to the benchmarks with a resolution of 256 × 256 as used in DiffFit [54], including datasets such as Food101 [4], SUN397 [53], DF20-Mini [40], Caltech101 [13], CUB-200-2011 [50], ArtBench-10 [30], Oxford Flowers [37], and Stanford Cars [26]. |
| Dataset Splits | Yes | SUN397 [53] ... We evaluate the methods on a random partition of the whole dataset with 76,128 training images, 10,875 validation images and 21,750 test images. |
| Hardware Specification | Yes | For each result, we fine-tune 24K iterations with a batch size of 32 for standard fine-tuning and Diff-Tuning, and a batch size of 64 for DiffFit, on one NVIDIA A100 40G GPU. |
| Software Dependencies | No | All experiments are implemented in PyTorch and conducted on NVIDIA A100 40G GPUs. However, specific version numbers for PyTorch or other software dependencies are not provided. |
| Experiment Setup | Yes | Table 3: Hyperparameters of experiments (restated as a config mapping below the table). Class-conditional: Backbone DiT, Image Size 256, Batch Size 32, Learning Rate 1e-4, Optimizer Adam, Training Steps 24,000, Validation Interval 24,000, Sampling Steps 50, Augmented Dataset Size 200,000. Controlled: Backbone Stable Diffusion v1.5, Image Size 512, Batch Size 4, Learning Rate 1e-5, Optimizer Adam, Training Steps 15,000, Validation Interval 100, Sampling Steps 50, Augmented Dataset Size 30,000. |
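The table confirms that the paper ships pseudocode (Algorithm 1) without reproducing it here. The following is a minimal PyTorch-style sketch of what one Diff-Tuning objective evaluation might look like, not the authors' implementation: the linear forgetting weight `gamma(t) = 1 - t/T`, the function names `q_sample` and `diff_tuning_loss`, and the model signature `model(x_t, t, y)` are all assumptions. We also assume the retention term is a standard denoising loss on samples generated by the frozen pre-trained model (the "Augmented Dataset" in Table 3). The exact schedule and loss combination follow the paper's Algorithm 1 and the official repository (https://github.com/thuml/Diffusion-Tuning).

```python
import torch
import torch.nn.functional as F


def q_sample(x0, t, noise, alphas_cumprod):
    # Standard DDPM forward process: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps.
    abar = alphas_cumprod[t].view(-1, 1, 1, 1)
    return abar.sqrt() * x0 + (1.0 - abar).sqrt() * noise


def diff_tuning_loss(model, x_down, y_down, x_aug, y_aug, alphas_cumprod, T=1000):
    """Hypothetical single evaluation of a Diff-Tuning-style objective.

    x_down, y_down: a downstream batch (images, conditions).
    x_aug, y_aug:   a batch from the augmented dataset, i.e. samples
                    generated by the frozen pre-trained model, used here
                    to retain pre-trained knowledge (an assumption).
    """
    b = x_down.size(0)
    t = torch.randint(0, T, (b,), device=x_down.device)

    # Assumed "chain of forgetting" weight: favor downstream adaptation at
    # small t (near-data steps) and knowledge retention at large t
    # (near-noise steps). The paper defines the actual schedule.
    gamma = 1.0 - t.float() / T

    noise_d = torch.randn_like(x_down)
    noise_a = torch.randn_like(x_aug)
    pred_d = model(q_sample(x_down, t, noise_d, alphas_cumprod), t, y_down)
    pred_a = model(q_sample(x_aug, t, noise_a, alphas_cumprod), t, y_aug)

    loss_down = F.mse_loss(pred_d, noise_d, reduction="none").mean(dim=(1, 2, 3))
    loss_retain = F.mse_loss(pred_a, noise_a, reduction="none").mean(dim=(1, 2, 3))

    # Blend adaptation and retention per sampled timestep.
    return (gamma * loss_down + (1.0 - gamma) * loss_retain).mean()
```

The per-timestep blend is the point of the sketch: unlike uniform fine-tuning, each sampled `t` trades off the downstream loss against the retention loss according to where it sits on the denoising chain.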
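For convenience, the Table 3 hyperparameters quoted in the Experiment Setup row can be collected into a single configuration mapping. This is a plain restatement of the quoted values; the dictionary layout and key names are ours, not the authors' configuration format.

```python
# Hyperparameters transcribed from Table 3 of the paper; key names are
# illustrative, not taken from the authors' codebase.
CONFIGS = {
    "class_conditional": {
        "backbone": "DiT",
        "image_size": 256,
        "batch_size": 32,
        "learning_rate": 1e-4,
        "optimizer": "Adam",
        "training_steps": 24_000,
        "validation_interval": 24_000,
        "sampling_steps": 50,
        "augmented_dataset_size": 200_000,
    },
    "controlled": {
        "backbone": "Stable Diffusion v1.5",
        "image_size": 512,
        "batch_size": 4,
        "learning_rate": 1e-5,
        "optimizer": "Adam",
        "training_steps": 15_000,
        "validation_interval": 100,
        "sampling_steps": 50,
        "augmented_dataset_size": 30_000,
    },
}
```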