Addressing Negative Transfer in Diffusion Models
Authors: Hyojun Go, Kim, Yunsung Lee, Seunghyun Lee, Shinhyeok Oh, Hyeongdon Moon, Seungtaek Choi
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate the efficacy of proposed clustering and its integration with MTL methods through various experiments, demonstrating 1) improved generation quality and 2) faster training convergence of diffusion models. |
| Researcher Affiliation | Collaboration | Twelvelabs, Wrtn Technologies, Riiid, EPFL, Yanolja |
| Pseudocode | No | The paper describes algorithms but does not contain a clearly labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | Yes | Our project page is available at https://gohyojun15.github.io/ANT_diffusion/. |
| Open Datasets | Yes | We evaluated our proposed methods through extensive experiments on widely-recognized datasets: FFHQ [27], CelebA-HQ [26], and ImageNet [7]. |
| Dataset Splits | No | The paper describes training on datasets like FFHQ and CelebA-HQ but does not explicitly provide training/validation/test dataset splits (percentages or counts) or refer to specific predefined splits with citations for reproduction. |
| Hardware Specification | Yes | All experiments are conducted with a single A100 GPU and with FFHQ dataset [27]... A single A100 GPU is used for experiments in Section 5.1 and 5.3. |
| Software Dependencies | No | The paper mentions software components such as the AdamW optimizer, DDIM sampler, LibMTL, and Clean-FID, but does not provide specific version numbers for these or other key software dependencies. |
| Experiment Setup | Yes | All training was performed with the AdamW optimizer [43] with the learning rate as 1e-4 or 2e-5, and the better results were reported. For ADM, we trained 1M iterations with batch size 8 for the FFHQ dataset and trained 400K iterations with batch size 16 for the CelebA-HQ dataset. For LDM, we trained 400K iterations with batch size 30 for both FFHQ and CelebA-HQ datasets. |
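The hyperparameters quoted in the Experiment Setup row can be collected into a small lookup table. This is a hedged sketch for convenience only: the dictionary layout and the `lookup_config` helper are illustrative and not taken from the authors' code; the paper reports trying both learning rates and keeping the better result, so both are listed per configuration.

```python
# Reported training configurations per (model, dataset) pair,
# transcribed from the paper's Experiment Setup description.
# Keys and structure are illustrative, not the authors' code.
TRAINING_CONFIGS = {
    ("ADM", "FFHQ"):      {"optimizer": "AdamW", "lrs": (1e-4, 2e-5), "iterations": 1_000_000, "batch_size": 8},
    ("ADM", "CelebA-HQ"): {"optimizer": "AdamW", "lrs": (1e-4, 2e-5), "iterations": 400_000,   "batch_size": 16},
    ("LDM", "FFHQ"):      {"optimizer": "AdamW", "lrs": (1e-4, 2e-5), "iterations": 400_000,   "batch_size": 30},
    ("LDM", "CelebA-HQ"): {"optimizer": "AdamW", "lrs": (1e-4, 2e-5), "iterations": 400_000,   "batch_size": 30},
}

def lookup_config(model: str, dataset: str) -> dict:
    """Return the reported hyperparameters for a (model, dataset) pair."""
    return TRAINING_CONFIGS[(model, dataset)]
```

For example, `lookup_config("ADM", "FFHQ")["iterations"]` returns `1_000_000`, matching the 1M-iteration run described for ADM on FFHQ.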