Diffusion Models for Multi-Task Generative Modeling

Authors: Changyou Chen, Han Ding, Bunyamin Sisman, Yi Xu, Ouye Xie, Benjamin Z. Yao, Son Dinh Tran, Belinda Zeng

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experimental results on ImageNet indicate the effectiveness of our framework for various multi-modal generative modeling, which we believe is an important research direction worthy of more future explorations.
Researcher Affiliation | Collaboration | Changyou Chen (1, 2), Han Ding (2), Bunyamin Sisman (2), Yi Xu (2), Ouye Xie (2), Benjamin Yao (2), Son Tran (2), Belinda Zeng (2); affiliations: 1 University at Buffalo, 2 Amazon
Pseudocode | Yes | Algorithm 1, MT-Diffusion Inference (Appendix C); Algorithm 2, MT-Diffusion Training (Appendix E). A generic training-step sketch follows this table.
Open Source Code | No | The paper mentions building on existing codebases, namely the guided diffusion codebase (dif) and the latent diffusion codebase (ldm), but provides no link or statement about releasing its own implementation of the proposed MT-Diffusion.
Open Datasets | Yes | We mainly rely on the ImageNet-1K dataset (Deng et al., 2009) with resolutions of 64 × 64 and 128 × 128.
Dataset Splits | Yes | We mainly rely on the ImageNet-1K dataset (Deng et al., 2009) with resolutions of 64 × 64 and 128 × 128, where we adopt the pre-defined training and validation splits. (See the loading sketch below.)
Hardware Specification | Yes | All experiments are conducted on an A100 GPU server consisting of 8 GPUs, with a batch size of 64 unless explicitly specified otherwise.
Software Dependencies | No | The paper mentions using the guided diffusion codebase (dif) and the latent diffusion codebase (ldm), but it does not give version numbers for these components or for other dependencies such as Python, PyTorch, or CUDA.
Experiment Setup | Yes | We adopt the default hyper-parameters for training the models: attention resolutions (32, 16, 9); diffusion steps: 1000; learn sigma: False; noise schedule: linear; channels: 320; heads: 8; res blocks: 2; resblock updown: False; use scale shift norm: False; learning rate: 1.0e-4; batch size: 32. (Mirrored in the config sketch after this table.)
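The paper's pseudocode itself lives in Appendices C and E and is not reproduced here. For orientation, a standard DDPM-style training step under the quoted settings (linear noise schedule, epsilon-prediction loss) might look like the following minimal sketch. The function names, model interface, and schedule constants are illustrative assumptions; this is a generic single-task denoising step, not the paper's actual MT-Diffusion algorithm, which additionally aggregates multiple task modalities.

```python
import torch
import torch.nn.functional as F

def linear_alphas_cumprod(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Cumulative alpha products for a linear beta schedule ('Noise schedule: Linear')."""
    betas = torch.linspace(beta_start, beta_end, num_steps)
    return torch.cumprod(1.0 - betas, dim=0)

def diffusion_training_step(model, x0, alphas_cumprod):
    """One denoising training step: corrupt x0 at a random timestep, predict the noise."""
    b = x0.shape[0]
    t = torch.randint(0, alphas_cumprod.shape[0], (b,), device=x0.device)
    noise = torch.randn_like(x0)
    a_bar = alphas_cumprod.to(x0.device)[t].view(b, 1, 1, 1)
    x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise  # forward process q(x_t | x0)
    predicted_noise = model(x_t, t)  # epsilon-prediction U-Net (guided-diffusion style)
    return F.mse_loss(predicted_noise, noise)
```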
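Since the paper adopts the pre-defined ImageNet-1K training and validation splits, loading them with torchvision is straightforward. The root path and resolution below are placeholders; torchvision.datasets.ImageNet assumes the official archives are already downloaded.

```python
from torchvision import datasets, transforms

# Placeholder path; torchvision.datasets.ImageNet expects the official
# ImageNet-1K archives to already be present under this directory.
preprocess = transforms.Compose([
    transforms.Resize(64),       # the paper uses 64 x 64 and 128 x 128 resolutions
    transforms.CenterCrop(64),
    transforms.ToTensor(),
])
train_set = datasets.ImageNet(root="./imagenet", split="train", transform=preprocess)
val_set = datasets.ImageNet(root="./imagenet", split="val", transform=preprocess)
```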
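Finally, the quoted experiment setup maps naturally onto a flat configuration dictionary. The key names below are loosely styled after the guided-diffusion codebase's flags and are an assumption, not the paper's released config.

```python
# Illustrative configuration mirroring the quoted hyper-parameters.
# Key names are assumptions (guided-diffusion-style flags), not from the paper.
config = {
    "attention_resolutions": (32, 16, 9),
    "diffusion_steps": 1000,
    "learn_sigma": False,
    "noise_schedule": "linear",
    "num_channels": 320,
    "num_heads": 8,
    "num_res_blocks": 2,
    "resblock_updown": False,
    "use_scale_shift_norm": False,
    "learning_rate": 1.0e-4,
    "batch_size": 32,
}
```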