Diffusion Models Encode the Intrinsic Dimension of Data Manifolds

Authors: Jan Pawel Stanczuk, Georgios Batzolis, Teo Deveney, Carola-Bibiane Schönlieb

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To the best of our knowledge our method is the first estimator of intrinsic dimension based on diffusion models and it outperforms well established estimators in controlled experiments on both Euclidean and image data.
Researcher Affiliation | Academia | (1) Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge, United Kingdom; (2) Department of Mathematical Sciences, University of Bath, Bath, United Kingdom.
Pseudocode | Yes | Algorithm 1: Estimate the intrinsic dimension at x_0
Open Source Code | Yes | The code is available at https://github.com/GBATZOLIS/ID-diff.
Open Datasets | Yes | Additionally, we apply ID estimators to the MNIST dataset (LeCun and Cortes, 2010) (where the ID is unknown)...
Dataset Splits | No | At the end we used checkpoints which minimized the validation loss to evaluate the reconstruction error. While validation is mentioned, specific dataset splits (percentages or counts) are not provided.
Hardware Specification | Yes | We trained the auto-encoder for each latent dimension for 36h on NVIDIA A-100 GPU.
Software Dependencies | No | The paper mentions using the 'Adam algorithm', 'SCIKIT-LEARN implementation', and an 'R package INTRINSICDIMENSION' but does not specify their version numbers or other software dependencies with version numbers.
Experiment Setup | Yes | For the optimisation of the model, we used the Adam algorithm with a learning rate of 2e-5 and exponential moving average (EMA) on the weights of the model with a decay rate of 0.9999. Moreover, we chose the variance exploding SDE (Song et al., 2020) as the forward process with σ_min = 0.01 and σ_max = 4. Hyperparameters indicated in Table 3.
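
As an illustration of the experiment setup above, the following is a minimal Python sketch of the noise schedule and weight averaging it describes. It assumes the standard geometric schedule of the variance exploding SDE from Song et al. (2020), σ(t) = σ_min (σ_max / σ_min)^t for t in [0, 1], and it reads the reported learning rate as 2e-5. The function and constant names (ve_sigma, ema_update, SIGMA_MIN, and so on) are hypothetical and are not taken from the authors' repository.

```python
# Values quoted in the experiment setup; all names here are illustrative.
SIGMA_MIN = 0.01       # smallest noise scale of the forward process
SIGMA_MAX = 4.0        # largest noise scale of the forward process
EMA_DECAY = 0.9999     # decay rate of the exponential moving average
LEARNING_RATE = 2e-5   # Adam learning rate (assumed reading of "2e 5")


def ve_sigma(t, sigma_min=SIGMA_MIN, sigma_max=SIGMA_MAX):
    """Noise scale of the variance exploding SDE at time t in [0, 1].

    Assumes the geometric interpolation between sigma_min and sigma_max.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    return sigma_min * (sigma_max / sigma_min) ** t


def ema_update(ema_weights, weights, decay=EMA_DECAY):
    """Return the averaged weights after one EMA step.

    Each averaged weight keeps a fraction `decay` of its old value and
    takes the remaining fraction from the current model weight.
    """
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema_weights, weights)]


if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0):
        print(f"sigma({t}) = {ve_sigma(t):.4f}")
    print(ema_update([1.0, 2.0], [0.0, 0.0]))
```

With these values the noise scale grows from 0.01 at t = 0 to 4 at t = 1, and each EMA step moves the averaged weights only 0.01% of the way toward the current weights.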