Exploring Diffusion Time-steps for Unsupervised Representation Learning

Authors: Zhongqi Yue, Jiankun Wang, Qianru Sun, Lei Ji, Eric I-Chao Chang, Hanwang Zhang

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "On CelebA, FFHQ, and Bedroom datasets, the learned feature significantly improves attribute classification and enables faithful counterfactual generation, e.g., interpolating only one specified attribute between two images, validating the disentanglement quality. Codes are in https://github.com/yue-zhongqi/diti."
Researcher Affiliation | Collaboration | 1 Nanyang Technological University, 2 Singapore Management University, 3 Microsoft Research Asia, 4 Skywork AI
Pseudocode | No | The paper describes its proposed approach and implementation details in prose and equations, but does not include any structured pseudocode or algorithm blocks (a hypothetical sketch of one training step is given after this table).
Open Source Code | Yes | "Codes are in https://github.com/yue-zhongqi/diti."
Open Datasets | Yes | "Datasets. We choose real-world datasets to validate if DiTi learns a disentangled representation of the generative attributes: 1) Celebrity Faces Attributes (CelebA) Liu et al. (2015) is a large-scale face attributes dataset. ... 2) Flickr-Faces-HQ (FFHQ) Karras et al. (2019) contains 70,000 high-quality face images obtained from Flickr. 3) We additionally used the Labeled Faces in the Wild (LFW) dataset Huang et al. (2007) that provides continuous attribute labels. 4) Bedroom is part of the Large-scale Scene UNderstanding (LSUN) dataset Yu et al. (2015) that contains around 3 million images."
Dataset Splits | No | The paper mentions a "CelebA train split" and a "CelebA test split" but does not explicitly describe a validation split or how the splits were constructed (see the loading sketch after this table for CelebA's official partition).
Hardware Specification | Yes | "Our experiments were performed on 4 NVIDIA A100 GPUs."
Software Dependencies | No | The paper names software components such as the U-Net architecture and a pre-trained DM, but does not provide version numbers for any software dependencies or libraries.
Experiment Setup | Yes | "We followed the network design of encoder f and decoder g in PDAE and adopted its hyper-parameter settings (e.g., λ_t, w_t in Eq. 4, details in Appendix). This ensures that any emerged property of disentangled representation is solely from our leverage of the inductive bias in Section 4.1. We also used the same training iterations as PDAE, i.e., 290k iterations on CelebA, 500k iterations on FFHQ, and 540k iterations on Bedroom."
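
Since the paper ships no algorithm block, the following is a minimal, hypothetical sketch of what one DiTi-style training step might look like, pieced together from the excerpts above (an encoder f and a PDAE-style decoder g on top of a frozen pre-trained DM, with the feature dimensions partitioned across diffusion time-steps). All names (`encoder`, `decoder`, `pretrained_dm`, `num_subsets`) are illustrative, and the per-time-step weights λ_t, w_t from the paper's Eq. 4 are omitted; consult https://github.com/yue-zhongqi/diti for the actual implementation.

```python
import torch

def diti_training_step(x0, encoder, decoder, pretrained_dm, alphas_cumprod,
                       num_subsets=16, feat_dim=512, T=1000):
    """One hypothetical DiTi-style step: the encoder feature z is split into
    `num_subsets` contiguous blocks, and at time-step t only the first k(t)
    blocks are kept, so later blocks capture coarser attributes."""
    b = x0.shape[0]
    t = torch.randint(0, T, (b,), device=x0.device)

    # Standard forward diffusion: x_t = sqrt(a_bar)*x_0 + sqrt(1 - a_bar)*eps
    eps = torch.randn_like(x0)
    a_bar = alphas_cumprod[t].view(b, 1, 1, 1)
    x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * eps

    # Mask the feature: k(t) grows with t, since the attributes already lost
    # by time-step t must be supplied by z for denoising to succeed.
    z = encoder(x0)                                     # (b, feat_dim)
    dims = feat_dim // num_subsets
    k = t * num_subsets // T + 1                        # 1 .. num_subsets
    keep = (torch.arange(feat_dim, device=x0.device)[None, :]
            < (k * dims)[:, None]).float()
    z_masked = z * keep

    # PDAE-style objective: a frozen pre-trained DM predicts the noise, and
    # the decoder g, conditioned on z, fills the remaining gap.
    with torch.no_grad():
        eps_dm = pretrained_dm(x_t, t)
    eps_pred = eps_dm + decoder(x_t, t, z_masked)
    return ((eps_pred - eps) ** 2).mean()
```

The cumulative masking (rather than a per-block one-hot mask) reflects the section's stated inductive bias: reconstructing from a heavily noised x_t requires every attribute lost up to that time-step, not just the ones lost in its own range.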
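On the Dataset Splits row: while the paper does not report a validation set, CelebA ships with an official train/valid/test partition, which torchvision exposes directly. A minimal loading sketch (root path and transforms are illustrative, not the paper's preprocessing):

```python
from torchvision import datasets, transforms

# Crop the aligned 178x218 CelebA faces and resize; values here are common
# defaults, not settings reported by the paper.
tf = transforms.Compose([transforms.CenterCrop(178),
                         transforms.Resize(64),
                         transforms.ToTensor()])
train = datasets.CelebA("data", split="train", target_type="attr",
                        transform=tf, download=True)
valid = datasets.CelebA("data", split="valid", target_type="attr",
                        transform=tf, download=True)
print(len(train), len(valid))  # official partition: 162,770 / 19,867 images
```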