Addressing Negative Transfer in Diffusion Models

Authors: Hyojun Go, Jinyoung Kim, Yunsung Lee, Seunghyun Lee, Shinhyeok Oh, Hyeongdon Moon, Seungtaek Choi

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate the efficacy of proposed clustering and its integration with MTL methods through various experiments, demonstrating 1) improved generation quality and 2) faster training convergence of diffusion models.
Researcher Affiliation | Collaboration | Twelvelabs, Wrtn Technologies, Riiid, EPFL, Yanolja
Pseudocode | No | The paper describes algorithms but does not contain a clearly labeled 'Pseudocode' or 'Algorithm' block.
Open Source Code | Yes | Our project page is available at https://gohyojun15.github.io/ANT_diffusion/.
Open Datasets | Yes | We evaluated our proposed methods through extensive experiments on widely-recognized datasets: FFHQ [27], CelebA-HQ [26], and ImageNet [7].
Dataset Splits | No | The paper describes training on datasets like FFHQ and CelebA-HQ but does not explicitly provide training/validation/test dataset splits (percentages or counts) or refer to specific predefined splits with citations for reproduction.
Hardware Specification | Yes | All experiments are conducted with a single A100 GPU and with FFHQ dataset [27]... A single A100 GPU is used for experiments in Section 5.1 and 5.3.
Software Dependencies | No | The paper mentions software components such as the AdamW optimizer, DDIM sampler, LibMTL, and Clean-FID, but does not provide specific version numbers for these or other key software dependencies.
Experiment Setup | Yes | All training was performed with AdamW optimizer [43] with the learning rate as 1e-4 or 2e-5, and better results were reported. For ADM, we trained 1M iteration with batch size 8 for the FFHQ dataset and trained 400K iterations with batch size 16 for the CelebA-HQ dataset. For LDM, we trained 400K iterations with batch size 30 for both FFHQ and CelebA-HQ datasets.
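For readers who want to mirror the Experiment Setup row, the reported hyperparameters can be collected into a small configuration sketch. This is a minimal illustration assuming a Python training script; the TrainConfig class and its field names are hypothetical and not taken from the authors' released code, and the learning rate is listed as the pair (1e-4, 2e-5) because the paper reports trying both and keeping the better result.

# Hypothetical sketch of the training settings quoted in the Experiment Setup row.
# TrainConfig and its field names are illustrative, not from the official ANT_diffusion code.
from dataclasses import dataclass
from typing import Tuple

@dataclass
class TrainConfig:
    model: str                           # "ADM" or "LDM"
    dataset: str                         # "FFHQ" or "CelebA-HQ"
    optimizer: str                       # paper reports AdamW [43]
    candidate_lrs: Tuple[float, float]   # 1e-4 or 2e-5, with the better result reported
    iterations: int
    batch_size: int

# Settings as stated in the paper's experiment setup.
CONFIGS = [
    TrainConfig("ADM", "FFHQ",      "AdamW", (1e-4, 2e-5), 1_000_000, 8),
    TrainConfig("ADM", "CelebA-HQ", "AdamW", (1e-4, 2e-5),   400_000, 16),
    TrainConfig("LDM", "FFHQ",      "AdamW", (1e-4, 2e-5),   400_000, 30),
    TrainConfig("LDM", "CelebA-HQ", "AdamW", (1e-4, 2e-5),   400_000, 30),
]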