DisDiff: Unsupervised Disentanglement of Diffusion Probabilistic Models

Authors: Tao Yang, Yuwang Wang, Yan Lu, Nanning Zheng

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental "Extensive experiments on synthetic and real-world datasets demonstrate the effectiveness of DisDiff." The paper's Section 5 (Experiments) covers the experimental setup (implementation details, datasets, baselines and metrics), main results, qualitative results, and an ablation study.
Researcher Affiliation Collaboration Tao Yang (1), Yuwang Wang (2), Yan Lu (3), Nanning Zheng (1); yt14212@stu.xjtu.edu.cn. (1) National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, National Engineering Research Center for Visual Information and Applications, and Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University; (2) Tsinghua University, Shanghai AI Laboratory; (3) Microsoft Research Asia.
Pseudocode Yes Algorithm 1 (Training procedure), Algorithm 2 (DDPM Sampling), Algorithm 3 (DDIM Sampling).
Open Source Code Yes https://github.com/thomasmry/DisDiff
Open Datasets Yes To evaluate disentanglement, we follow Ren et al. [36] and use popular public datasets: Shapes3D [23], a dataset of 3D shapes; MPI3D [12], a 3D dataset recorded in a controlled environment; and Cars3D [35], a dataset of CAD models generated by color renderings. All experiments are conducted at 64x64 image resolution, the same as in the literature. For real-world data, we conduct our experiments on CelebA [29].
Dataset Splits No The paper mentions 'batch size as 64 for all datasets' and 'We have ten runs for each method' but does not explicitly provide specific training, validation, or test dataset splits (e.g., percentages or counts).
Hardware Specification No The paper mentions that experiments were conducted on various datasets with a specified image resolution and batch size, but it does not provide any specific details about the hardware used (e.g., GPU models, CPU types, memory).
Software Dependencies No The paper does not specify the versions of any software dependencies used, such as programming languages, deep learning frameworks (e.g., PyTorch, TensorFlow), or other libraries.
Experiment Setup Yes During the training of DisDiff, we set the batch size to 64 for all datasets. We always set the learning rate to 1e-4. We use EMA on all model parameters with a decay factor of 0.9999. Following DisCo [36], we set the representation vector dimension to 32. Table 4: Encoder E architecture used in DisDiff. Table 5: Decoder (pre-trained DPM) architecture used in DisDiff.
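
The reported hyperparameters can be collected into a short configuration sketch. The snippet below is a minimal, hypothetical PyTorch setup reflecting only the stated values (batch size 64, learning rate 1e-4, EMA decay 0.9999, 32-dimensional representation vectors); the encoder stand-in is not the architecture from Table 4, and all names are illustrative rather than the authors' code.

    import copy
    import torch
    import torch.nn as nn

    # Reported settings: batch size 64, learning rate 1e-4, EMA decay 0.9999,
    # 32-dimensional representation vectors (following DisCo).
    REPR_DIM = 32
    BATCH_SIZE = 64
    LEARNING_RATE = 1e-4
    EMA_DECAY = 0.9999

    # Placeholder encoder for 64x64 RGB inputs; the actual encoder E
    # architecture is given in Table 4 of the paper.
    encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, REPR_DIM))
    optimizer = torch.optim.Adam(encoder.parameters(), lr=LEARNING_RATE)

    # EMA copy of the weights, kept alongside the trained parameters.
    ema_encoder = copy.deepcopy(encoder)

    @torch.no_grad()
    def ema_update(ema_model, model, decay=EMA_DECAY):
        # Exponential moving average over all model parameters.
        for p_ema, p in zip(ema_model.parameters(), model.parameters()):
            p_ema.mul_(decay).add_(p, alpha=1.0 - decay)

In a setup like this, ema_update(ema_encoder, encoder) would be called once per optimizer step, and the EMA weights would be the ones used at evaluation time.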