Verifying the Union of Manifolds Hypothesis for Image Data
Authors: Bradley C.A. Brown, Anthony L. Caterini, Brendan Leigh Ross, Jesse C. Cresswell, Gabriel Loaiza-Ganem
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically verify this hypothesis on commonly-used image datasets, finding that indeed, observed data lies on a disconnected set and that intrinsic dimension is not constant. |
| Researcher Affiliation | Collaboration | Bradley C.A. Brown, University of Waterloo (bcabrown@uwaterloo.ca); Anthony L. Caterini, Layer 6 AI (anthony@layer6.ai); Brendan Leigh Ross, Layer 6 AI (brendan@layer6.ai); Jesse C. Cresswell, Layer 6 AI (jesse@layer6.ai); Gabriel Loaiza-Ganem, Layer 6 AI (gabriel@layer6.ai) |
| Pseudocode | Yes | Algorithm 1: Training of disconnected DGMs (an illustrative sketch of this style of training appears after the table) |
| Open Source Code | Yes | Our code is available at https://github.com/layer6ai-labs/UoMH. |
| Open Datasets | Yes | We use the FID score (Heusel et al., 2017) (lower is better), a commonly-used sample quality metric, to measure performance on the MNIST, FMNIST (Xiao et al., 2017), SVHN (Netzer et al., 2011), CIFAR-10, and CIFAR-100 (Krizhevsky et al., 2009) datasets. |
| Dataset Splits | Yes | For all models, we randomly select 10% of the training dataset to be used for validation and train on the remaining 90%. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used, such as GPU models, CPU models, or cloud computing instances. |
| Software Dependencies | No | The paper mentions software components like the 'ADAM optimizer' and 'ReLU activations', but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | A batch size of 128 is used for all datasets. Unless otherwise noted, at the beginning of training, we scale all the data to between 0 and 1. For all experiments, we use the ADAM optimizer (Kingma & Ba, 2015), typically with learning rate 0.001 and cosine annealing for a maximum of 100 epochs. We also use gradient norm clipping with a value of 10. (This configuration is illustrated in the sketch following the table.) |
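
To make the reported setup concrete, the following is a minimal PyTorch sketch of the configuration quoted above: batch size 128, data scaled to [0, 1], a random 90/10 train/validation split, ADAM with learning rate 0.001, cosine annealing over at most 100 epochs, and gradient norm clipping at 10. The dataset choice and the autoencoder-style placeholder model are assumptions for illustration; this is not the authors' code (which is available in the linked repository).

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms

# ToTensor scales pixel values to [0, 1], matching the reported preprocessing.
transform = transforms.ToTensor()

# MNIST as an example; the paper also uses FMNIST, SVHN, CIFAR-10, and CIFAR-100.
full_train = datasets.MNIST("data", train=True, download=True, transform=transform)

# Randomly hold out 10% of the training set for validation, train on the rest.
n_val = len(full_train) // 10
train_set, val_set = random_split(full_train, [len(full_train) - n_val, n_val])

train_loader = DataLoader(train_set, batch_size=128, shuffle=True)
val_loader = DataLoader(val_set, batch_size=128)

# Placeholder model; the paper trains deep generative models, not this autoencoder.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 256), nn.ReLU(),
                      nn.Linear(256, 28 * 28))

optimizer = optim.Adam(model.parameters(), lr=1e-3)
scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)

for epoch in range(100):
    model.train()
    for x, _ in train_loader:
        x = x.view(x.size(0), -1)
        loss = nn.functional.mse_loss(model(x), x)  # stand-in reconstruction loss
        optimizer.zero_grad()
        loss.backward()
        # Gradient norm clipping with a maximum norm of 10.
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=10.0)
        optimizer.step()
    scheduler.step()
```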
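
The extracted text only names Algorithm 1 ("Training of disconnected DGMs"). As a rough illustration of that general idea, the sketch below partitions the data into clusters and fits one simple generative model per cluster, then samples by first picking a cluster model in proportion to its size. The clustering method (k-means), the per-cluster model (a Gaussian mixture), and all function names are assumptions made for illustration; they are not the paper's Algorithm 1.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

def train_disconnected_dgm(data, n_clusters=3, seed=0):
    """Illustrative only: partition the data into clusters, then fit one simple
    generative model (here a Gaussian mixture) on each cluster separately."""
    labels = KMeans(n_clusters=n_clusters, random_state=seed, n_init=10).fit_predict(data)
    models, weights = [], []
    for k in range(n_clusters):
        cluster_data = data[labels == k]
        weights.append(len(cluster_data) / len(data))
        models.append(GaussianMixture(n_components=2, random_state=seed).fit(cluster_data))
    return models, np.array(weights)

def sample_disconnected(models, weights, n_samples, seed=0):
    """Sample by choosing a per-cluster model in proportion to its cluster size."""
    rng = np.random.default_rng(seed)
    counts = rng.multinomial(n_samples, weights)
    parts = [m.sample(c)[0] for m, c in zip(models, counts) if c > 0]
    return np.vstack(parts)

# Toy usage on synthetic 2-D data with two well-separated components.
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(-5, 1, (500, 2)), rng.normal(5, 1, (500, 2))])
models, weights = train_disconnected_dgm(data, n_clusters=2)
print(sample_disconnected(models, weights, 10).shape)  # (10, 2)
```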