InfoDiffusion: Representation Learning Using Information Maximizing Diffusion Models
Authors: Yingheng Wang, Yair Schiff, Aaron Gokaslan, Weishen Pan, Fei Wang, Christopher De Sa, Volodymyr Kuleshov
ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate InfoDiffusion on a suite of benchmark datasets and find that it learns latent representations that are competitive with state-of-the-art generative and contrastive methods, while retaining the high sample quality of diffusion models. |
| Researcher Affiliation | Academia | ¹Department of Computer Science, Cornell University, Ithaca, NY, USA; ²Department of Computer Science, Cornell Tech, New York City, NY, USA; ³Department of Population Health Sciences, Weill Cornell Medicine, New York City, NY, USA. |
| Pseudocode | No | The paper includes figures illustrating network architectures but does not contain any formal pseudocode or algorithm blocks labeled as such. |
| Open Source Code | No | The paper does not provide an unambiguous statement about releasing the source code for their method or a direct link to a code repository. |
| Open Datasets | Yes | We measure performance on the following datasets: Fashion MNIST (Xiao et al., 2017), CIFAR10 (Krizhevsky et al., 2009), FFHQ (Karras et al., 2019), CelebA (Liu et al., 2015), and 3DShapes (Burgess & Kim, 2018). |
| Dataset Splits | Yes | We split the data into 80% training and 20% test, fit the classifier on the training data, and evaluate on the test set. We repeat this 5-fold and report mean metrics ± one standard deviation. |
| Hardware Specification | Yes | Table 7. Hyperparameters for InfoDiffusion and baseline training. ... GPU: TITAN Xp, RTX 2080 Ti, TITAN RTX, RTX 4090 |
| Software Dependencies | No | The paper lists 'pytorch (Paszke et al., 2019)' and 'scikit-learn (Pedregosa et al., 2011)' with citations, but does not provide specific version numbers (e.g., PyTorch 1.9) for reproducibility. |
| Experiment Setup | Yes | In Table 7, we detail the hyperparameters used in training our InfoDiffusion and baseline models, across datasets. We also note that for all of these experiments we use the Adam optimizer with learning rate 1e-4 and train for 50 epochs. |
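
The evaluation protocol quoted in the Dataset Splits row (5-fold splits, each an 80%/20% train/test partition, a classifier fit on the training latents, mean ± standard deviation reported) can be sketched roughly as below. This is a minimal illustration assuming precomputed latent codes and labels as NumPy arrays; the logistic-regression probe and the function names are assumptions for illustration, not details taken from the paper.

```python
# Minimal sketch of the quoted split-and-evaluate protocol. Only the 5-fold
# 80/20 partitioning and the mean +/- std reporting come from the quoted text;
# the logistic-regression classifier is an assumed stand-in.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import StratifiedKFold

def evaluate_latents(latents: np.ndarray, labels: np.ndarray, seed: int = 0):
    scores = []
    skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    for train_idx, test_idx in skf.split(latents, labels):
        clf = LogisticRegression(max_iter=1000)
        clf.fit(latents[train_idx], labels[train_idx])
        scores.append(accuracy_score(labels[test_idx], clf.predict(latents[test_idx])))
    return float(np.mean(scores)), float(np.std(scores))
```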
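Similarly, the optimizer settings quoted in the Experiment Setup row (Adam, learning rate 1e-4, 50 epochs) correspond to a PyTorch training loop of roughly the following shape. The model, loss function, and data loader are placeholders; only the optimizer type, learning rate, and epoch count come from the quoted text.

```python
# Hedged sketch of the training configuration: Adam with lr 1e-4 for 50 epochs.
# `model`, `loss_fn`, and `train_loader` are hypothetical placeholders.
import torch

def train(model, loss_fn, train_loader, device="cuda"):
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    model.to(device).train()
    for epoch in range(50):
        for batch in train_loader:
            x = batch[0].to(device)
            optimizer.zero_grad()
            loss = loss_fn(model, x)  # e.g., a diffusion denoising objective
            loss.backward()
            optimizer.step()
    return model
```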