Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Verifying the Union of Manifolds Hypothesis for Image Data

Authors: Bradley C.A. Brown, Anthony L. Caterini, Brendan Leigh Ross, Jesse C. Cresswell, Gabriel Loaiza-Ganem

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically verify this hypothesis on commonly-used image datasets, finding that indeed, observed data lies on a disconnected set and that intrinsic dimension is not constant.
Researcher Affiliation | Collaboration | Bradley C.A. Brown (University of Waterloo), Anthony L. Caterini (Layer 6 AI), Brendan Leigh Ross (Layer 6 AI), Jesse C. Cresswell (Layer 6 AI), Gabriel Loaiza-Ganem (Layer 6 AI)
Pseudocode | Yes | Algorithm 1: Training of disconnected DGMs
Open Source Code | Yes | Our code is available at https://github.com/layer6ai-labs/UoMH.
Open Datasets | Yes | We use the FID score (Heusel et al., 2017) (lower is better), a commonly-used sample quality metric, to measure performance on the MNIST, FMNIST (Xiao et al., 2017), SVHN (Netzer et al., 2011), CIFAR-10, and CIFAR-100 (Krizhevsky et al., 2009) datasets. [A hedged FID sketch follows this table.]
Dataset Splits | Yes | For all models, we randomly select 10% of the training dataset to be used for validation and train on the remaining 90%. [A split sketch follows this table.]
Hardware Specification | No | The paper does not explicitly describe the specific hardware used, such as GPU models, CPU models, or cloud computing instances.
Software Dependencies | No | The paper mentions software components like the Adam optimizer and ReLU activations, but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | A batch size of 128 is used for all datasets. Unless otherwise noted, at the beginning of training, we scale all the data to between 0 and 1. For all experiments, we use the Adam optimizer (Kingma & Ba, 2015), typically with learning rate 0.001 and cosine annealing for a maximum of 100 epochs. We also use gradient norm clipping with a value of 10. [A training-loop sketch follows this table.]
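The Open Datasets row quotes the paper's use of FID as the sample quality metric, but the paper does not name a specific FID implementation. As a minimal sketch, assuming the torchmetrics library and image batches already loaded as uint8 tensors (both assumptions, not taken from the paper), FID between real and generated samples could be computed like this:

```python
# Hedged FID sketch using torchmetrics (an assumed implementation; the paper
# does not say which FID code it uses). Images are uint8 tensors in NCHW format.
import torch
from torchmetrics.image.fid import FrechetInceptionDistance

fid = FrechetInceptionDistance(feature=2048)  # Inception-v3 pooled features

# Hypothetical stand-ins for real and model-generated image batches.
real_images = torch.randint(0, 256, (128, 3, 32, 32), dtype=torch.uint8)
fake_images = torch.randint(0, 256, (128, 3, 32, 32), dtype=torch.uint8)

fid.update(real_images, real=True)
fid.update(fake_images, real=False)
print(f"FID: {fid.compute().item():.2f}")  # lower is better
```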
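The Dataset Splits row describes a random 90/10 train/validation split. A minimal sketch of that split, assuming PyTorch with MNIST loaded via torchvision and a fixed seed (the dataset choice and seed are illustrative; the authors' actual split code lives in their repository):

```python
# Hedged sketch of the 90/10 random train/validation split described above.
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

train_full = datasets.MNIST(
    root="./data", train=True, download=True,
    transform=transforms.ToTensor(),  # also scales pixel values to [0, 1]
)

val_size = len(train_full) // 10         # 10% held out for validation
train_size = len(train_full) - val_size  # train on the remaining 90%
train_set, val_set = random_split(
    train_full, [train_size, val_size],
    generator=torch.Generator().manual_seed(0),  # assumed seed, for illustration
)
```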
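The Experiment Setup row lists batch size 128, Adam with learning rate 0.001, cosine annealing over at most 100 epochs, and gradient norm clipping at 10. A minimal PyTorch training-loop sketch under those reported settings, reusing `train_set` from the split sketch above; the model and reconstruction loss are placeholders, not the paper's disconnected DGMs:

```python
# Hedged training-loop sketch matching the reported hyperparameters.
# `model` and the loss are placeholders, not the paper's actual models.
import torch
from torch import nn
from torch.utils.data import DataLoader

model = nn.Sequential(nn.Flatten(), nn.Linear(784, 784))       # placeholder model
loader = DataLoader(train_set, batch_size=128, shuffle=True)   # batch size 128

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)      # Adam, lr 0.001
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)

for epoch in range(100):  # "a maximum of 100 epochs"
    for x, _ in loader:
        optimizer.zero_grad()
        loss = nn.functional.mse_loss(model(x), x.flatten(1))  # placeholder loss
        loss.backward()
        # Gradient norm clipping with a value of 10, as reported.
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=10.0)
        optimizer.step()
    scheduler.step()  # cosine annealing stepped once per epoch
```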