LIDL: Local Intrinsic Dimension Estimation Using Approximate Likelihood

Authors: Piotr Tempczyk, Rafał Michaluk, Łukasz Garncarek, Przemysław Spurek, Jacek Tabor, Adam Goliński

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We carefully investigate the empirical properties of the proposed method, compare them with our theoretical predictions, show that LIDL yields competitive results on the standard benchmarks for this problem, and that it scales to thousands of dimensions."
Researcher Affiliation | Collaboration | "(1) Institute of Informatics, University of Warsaw; (2) Polish National Institute for Machine Learning (www.opium.sh); (3) deeptale.ai; (4) Applica; (5) GMUM, Jagiellonian University; (6) University of Oxford."
Pseudocode | Yes | "Algorithm 1: LIDL algorithm"
Open Source Code | Yes | "The code to reproduce our results is available at github.com/opium-sh/lidl."
Open Datasets | Yes | "We ran LIDL on MNIST, FMNIST, and Celeb-A (D = 1K, 1K, 12K, respectively) datasets using Glow as a density estimator."
Dataset Splits | No | "However, we observed in our experiments that choosing the hyperparameters leading to models minimizing negative log-likelihood on the validation set is a good strategy for minimizing the error of the LID estimate. We apply this approach in all our experiments; as density estimators we employ MAF (Papamakarios et al., 2017), RQ-NSF (Durkan et al., 2019) and Glow (Kingma & Dhariwal, 2018)."
Hardware Specification | No | "Some experiments were performed using the Entropy cluster at the Institute of Informatics, University of Warsaw, funded by NVIDIA, Intel, the Polish National Science Center grant UMO-2017/26/E/ST6/00622 and ERC Starting Grant TOTAL. Some experiments were performed using the GUŚLARZ 9000 workstation at the Polish National Institute for Machine Learning."
Software Dependencies | No | "We used MAF (Papamakarios et al., 2017), RQ-NSF (Durkan et al., 2019) and Glow (Kingma & Dhariwal, 2018) models in our experiments."
Experiment Setup | Yes | "In the scalability experiments we used three types of datasets: the uniform distribution on the (0, 1) hypercube (denoted U_N, where N is the dimensionality of the cube); a multivariate Gaussian N_N in R^N, where N is the dimensionality of both the distribution and the data space; and N_N embedded in R^{2N}, where we embedded an N-dimensional Gaussian in 2N-dimensional space by duplicating each coordinate. In each experiment we used 11 δs between 0.025 and 0.1."
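The experiment setup quoted above can be sketched in code. The following is a minimal, hypothetical Python sketch (not the authors' implementation): it generates the three scalability datasets and the 11-δ grid, and illustrates LIDL's regression step — estimating LID(x) as D plus the slope of log ρ_δ(x) against log δ — on the duplicated-Gaussian benchmark, where the noise-perturbed log-density at the origin has a closed form. The linear spacing of the δ grid and the choice of the origin as the query point are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# The three scalability datasets described in the setup:
def uniform_cube(n, N):          # U_N: uniform on the (0, 1) hypercube
    return rng.uniform(0.0, 1.0, size=(n, N))

def gaussian(n, N):              # N_N in R^N: full-dimensional Gaussian
    return rng.standard_normal(size=(n, N))

def gaussian_duplicated(n, N):   # N_N in R^{2N}: duplicate each coordinate
    z = rng.standard_normal(size=(n, N))
    return np.concatenate([z, z], axis=1)

# 11 deltas between 0.025 and 0.1 (linear spacing is an assumption).
deltas = np.linspace(0.025, 0.1, 11)

def lidl_estimate(log_densities, log_deltas, D):
    """LIDL regression step: log rho_delta(x) ~ (LID - D) * log delta + const,
    so LID(x) = D + fitted slope."""
    slope, _ = np.polyfit(log_deltas, log_densities, deg=1)
    return D + slope

# Sanity check on the duplicated Gaussian: for x = (z, z) perturbed with
# N(0, delta^2 I_{2N}) noise, each coordinate pair (x_i, x_{N+i}) is jointly
# Gaussian with covariance [[1 + delta^2, 1], [1, 1 + delta^2]], so the exact
# log-density at the origin is available in closed form.
N = 5
D = 2 * N  # ambient dimension

def log_rho_at_origin(delta):
    block_logdet = np.log((1.0 + delta**2) ** 2 - 1.0)  # det of one 2x2 block
    return -0.5 * (D * np.log(2.0 * np.pi) + N * block_logdet)

log_densities = np.array([log_rho_at_origin(d) for d in deltas])
lid = lidl_estimate(log_densities, np.log(deltas), D)
print(lid)  # close to the true intrinsic dimension N = 5
```

The duplicated Gaussian makes a convenient check because the closed-form log-density scales as -N log δ for small δ, so the fitted slope is approximately d - D = -N and the estimate recovers d = N without training any density model; in the paper's actual pipeline the log-densities would instead come from normalizing flows (MAF, RQ-NSF, or Glow) trained on noise-perturbed data.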