LIDL: Local Intrinsic Dimension Estimation Using Approximate Likelihood

Authors: Piotr Tempczyk, Rafał Michaluk, Łukasz Garncarek, Przemysław Spurek, Jacek Tabor, Adam Goliński

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We carefully investigate the empirical properties of the proposed method, compare them with our theoretical predictions, show that LIDL yields competitive results on the standard benchmarks for this problem, and that it scales to thousands of dimensions."
Researcher Affiliation | Collaboration | "(1) Institute of Informatics, University of Warsaw; (2) Polish National Institute for Machine Learning (www.opium.sh); (3) deeptale.ai; (4) Applica; (5) GMUM, Jagiellonian University; (6) University of Oxford."
Pseudocode | Yes | "Algorithm 1: LIDL algorithm"
Open Source Code | Yes | "The code to reproduce our results is available at github.com/opium-sh/lidl."
Open Datasets | Yes | "We ran LIDL on MNIST, FMNIST, and Celeb-A (D = 1K, 1K, 12K, respectively) datasets using Glow as a density estimator."
Dataset Splits | No | "However, we observed in our experiments that choosing the hyperparameters leading to models minimizing negative log-likelihood on the validation set is a good strategy for minimizing the error of the LID estimate. We apply this approach in all our experiments; as density estimators we employ MAF (Papamakarios et al., 2017), RQ-NSF (Durkan et al., 2019) and Glow (Kingma & Dhariwal, 2018)."
Hardware Specification | No | "Some experiments were performed using the Entropy cluster at the Institute of Informatics, University of Warsaw, funded by NVIDIA, Intel, the Polish National Science Center grant UMO-2017/26/E/ST6/00622 and ERC Starting Grant TOTAL. Some experiments were performed using the GUŚLARZ 9000 workstation at the Polish National Institute for Machine Learning."
Software Dependencies | No | "We used MAF (Papamakarios et al., 2017), RQ-NSF (Durkan et al., 2019) and Glow (Kingma & Dhariwal, 2018) models in our experiments."
Experiment Setup | Yes | "In the scalability experiments we used three types of datasets: the uniform distribution on the (0, 1) hypercube (denoted U_N, where N is the dimensionality of the cube); a multivariate Gaussian N_N in R^N, where N is the dimensionality of both the distribution and the data space; and N_N embedded in R^{2N}, where we embedded an N-dimensional Gaussian in 2N-dimensional space by duplicating each coordinate. In each experiment we used 11 δs between 0.025 and 0.1."
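The experiment setup quoted above can be sketched in code. The following is a minimal, hypothetical Python sketch (not the authors' implementation): it generates the three scalability datasets and the 11-δ grid, and illustrates LIDL's regression step — estimating LID(x) as D plus the slope of log ρ_δ(x) against log δ — on the duplicated-Gaussian benchmark, where the noise-perturbed log-density at the origin has a closed form. The linear spacing of the δ grid and the choice of the origin as the query point are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# The three scalability datasets described in the setup:
def uniform_cube(n, N):          # U_N: uniform on the (0, 1) hypercube
    return rng.uniform(0.0, 1.0, size=(n, N))

def gaussian(n, N):              # N_N in R^N: full-dimensional Gaussian
    return rng.standard_normal(size=(n, N))

def gaussian_duplicated(n, N):   # N_N in R^{2N}: duplicate each coordinate
    z = rng.standard_normal(size=(n, N))
    return np.concatenate([z, z], axis=1)

# 11 deltas between 0.025 and 0.1 (linear spacing is an assumption).
deltas = np.linspace(0.025, 0.1, 11)

def lidl_estimate(log_densities, log_deltas, D):
    """LIDL regression step: log rho_delta(x) ~ (LID - D) * log delta + const,
    so LID(x) = D + fitted slope."""
    slope, _ = np.polyfit(log_deltas, log_densities, deg=1)
    return D + slope

# Sanity check on the duplicated Gaussian: for x = (z, z) perturbed with
# N(0, delta^2 I_{2N}) noise, each coordinate pair (x_i, x_{N+i}) is jointly
# Gaussian with covariance [[1 + delta^2, 1], [1, 1 + delta^2]], so the exact
# log-density at the origin is available in closed form.
N = 5
D = 2 * N  # ambient dimension

def log_rho_at_origin(delta):
    block_logdet = np.log((1.0 + delta**2) ** 2 - 1.0)  # det of one 2x2 block
    return -0.5 * (D * np.log(2.0 * np.pi) + N * block_logdet)

log_densities = np.array([log_rho_at_origin(d) for d in deltas])
lid = lidl_estimate(log_densities, np.log(deltas), D)
print(lid)  # close to the true intrinsic dimension N = 5
```

The duplicated Gaussian makes a convenient check because the closed-form log-density scales as -N log δ for small δ, so the fitted slope is approximately d - D = -N and the estimate recovers d = N without training any density model; in the paper's actual pipeline the log-densities would instead come from normalizing flows (MAF, RQ-NSF, or Glow) trained on noise-perturbed data.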