LIDL: Local Intrinsic Dimension Estimation Using Approximate Likelihood
Authors: Piotr Tempczyk, Rafał Michaluk, Łukasz Garncarek, Przemysław Spurek, Jacek Tabor, Adam Goliński
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We carefully investigate the empirical properties of the proposed method, compare them with our theoretical predictions, show that LIDL yields competitive results on the standard benchmarks for this problem, and that it scales to thousands of dimensions. |
| Researcher Affiliation | Collaboration | 1Institute of Informatics, University of Warsaw 2Polish National Institute for Machine Learning (www.opium.sh) 3deeptale.ai 4Applica 5GMUM, Jagiellonian University 6University of Oxford. |
| Pseudocode | Yes | Algorithm 1 LIDL algorithm |
| Open Source Code | Yes | The code to reproduce our results is available at github.com/opium-sh/lidl. |
| Open Datasets | Yes | We ran LIDL on MNIST, FMNIST, and Celeb-A (D = 1K, 1K, 12K respectively) datasets using Glow as a density estimator. |
| Dataset Splits | No | However, we observed in our experiments that choosing the hyperparameters leading to models minimizing negative log-likelihood on the validation set is a good strategy for minimizing the error of the LID estimate. We apply this approach in all our experiments; as density estimators we employ MAF (Papamakarios et al., 2017), RQ-NSF (Durkan et al., 2019) and Glow (Kingma & Dhariwal, 2018). |
| Hardware Specification | No | Some experiments were performed using the Entropy cluster at the Institute of Informatics, University of Warsaw, funded by NVIDIA, Intel, the Polish National Science Center grant UMO-2017/26/E/ST6/00622 and ERC Starting Grant TOTAL. Some experiments were performed using the GUŚLARZ 9000 workstation at the Polish National Institute for Machine Learning. |
| Software Dependencies | No | We used MAF (Papamakarios et al., 2017), RQ-NSF (Durkan et al., 2019) and Glow (Kingma & Dhariwal, 2018) models in our experiments. |
| Experiment Setup | Yes | In scalability experiments we used 3 types of datasets: a uniform distribution on the interval (0, 1) on a hypercube (denoted U_N, where N is the dimensionality of the cube); a multivariate Gaussian (N_N ⊂ R^N, where N is the dimensionality of both the distribution and the data space); and N_N ⊂ R^2N, where we embedded an N-dimensional Gaussian in 2N-dimensional space by duplicating each coordinate. In each experiment we used 11 δs between 0.025 and 0.1. |
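The quoted setup can be sketched end-to-end on the duplicated-coordinate Gaussian dataset, whose true LID (N) is half its ambient dimension (2N). LIDL (Algorithm 1) perturbs the data with Gaussian noise at several scales δ, estimates the log-density of a point at each scale, and regresses log-density against log δ; since log ρ_δ(x) ≈ (LID − D) log δ + const, the slope recovers the LID. The sketch below is an illustration under one simplifying assumption: a full-covariance Gaussian MLE stands in for the normalizing-flow density estimators (MAF, RQ-NSF, Glow) used in the paper, which is adequate here only because the data itself is Gaussian.

```python
# Minimal LIDL sketch on the N_N in R^2N scalability dataset: an N-dim
# Gaussian embedded in 2N-dim space by duplicating each coordinate,
# so the true LID is N. A full-covariance Gaussian MLE stands in for
# the flows (MAF/RQ-NSF/Glow) used in the paper -- an assumption that
# works here only because the data is Gaussian.
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
N, D = 2, 4                        # intrinsic dim N, ambient dim 2N
g = rng.normal(size=(20000, N))
data = np.repeat(g, 2, axis=1)     # duplicate each coordinate -> R^4

deltas = np.linspace(0.025, 0.1, 11)   # 11 deltas, as quoted above
x = data[0]                            # query point

log_liks = []
for delta in deltas:
    noisy = data + rng.normal(scale=delta, size=data.shape)
    mu, cov = noisy.mean(axis=0), np.cov(noisy.T)
    log_liks.append(multivariate_normal(mu, cov).logpdf(x))

# log rho_delta(x) ~ (LID - D) * log(delta) + const  =>  LID = D + slope
slope = np.polyfit(np.log(deltas), log_liks, 1)[0]
lid = D + slope
print(f"estimated LID = {lid:.2f}")    # close to the true value N = 2
```

Swapping in a flow trained on each perturbed dataset, as in the paper, changes only the density-estimation step; the δ-regression that turns log-likelihoods into a LID estimate is unchanged.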