On the Convergence Rate of Gaussianization with Random Rotations
Authors: Felix Draxler, Lars Kühmichel, Armand Rousselot, Jens Müller, Christoph Schnörr, Ullrich Köthe
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show analytically that the number of required layers scales linearly with the dimension for Gaussian input. We argue that this is because the model is unable to capture dependencies between dimensions. Empirically, we find the same linear increase in cost for arbitrary input p(x), but observe favorable scaling for some distributions. We explore potential speed-ups and formulate challenges for further research. |
| Researcher Affiliation | Academia | 1Heidelberg University, Germany. Correspondence to: Felix Draxler <felix.draxler@iwr.uni-heidelberg.de>. |
| Pseudocode | No | The paper describes mathematical operations and components of Gaussianization but does not provide pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code available at: https://github.com/vislearn/Gaussianization-Bound |
| Open Datasets | Yes | For our dataset, we construct the same dataset of Gaussians as Draxler et al. (2022). We now consider the scaling behavior of Gaussianization on a real dataset, the EMNIST digits (Cohen et al., 2017). |
| Dataset Splits | No | The paper mentions using N=60,000 training samples but does not specify a separate validation split or its size/percentage. |
| Hardware Specification | No | Experiments were performed on three workstations, each with a single high-end consumer GPU and CPU. This description is too general and lacks specific model numbers or detailed specifications of the hardware used. |
| Software Dependencies | No | We build our code upon the following Python libraries: PyTorch (Paszke et al., 2019), PyTorch Lightning (Falcon & The PyTorch Lightning team, 2019), Lightning Trainable (Kühmichel & Draxler, 2023), TensorFlow (Abadi et al., 2015) for FID score evaluation, NumPy (Harris et al., 2020), Matplotlib (Hunter, 2007) for plotting, and Pandas (Wes McKinney, 2010; The pandas development team, 2020) for data evaluation. While libraries are listed, specific version numbers are not provided. |
| Experiment Setup | Yes | We use an implementation of RQ splines based on (Dai & Seljak, 2021), where ψ(x, α) = (1 − α)·RQ(x) + α·x with a scalar regularization parameter α. We choose b = 128 bins, as well as α₁ = 0.9 for the spline and α₂ = 0.99 for the linear extrapolation... We use Adam with a learning rate of 10⁻³ and a batch size of 256. We train the normalizing flows for 30 epochs for D = 28 × 28, and 20 for the other scales... A hedged code sketch of this setup follows the table. |
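
As a companion to the quoted setup, here is a minimal, self-contained sketch (not the authors' implementation) of one Gaussianization block: a fixed random rotation followed by the regularized elementwise map ψ(x, α) = (1 − α)·f(x) + α·x, trained with the quoted Adam settings (learning rate 10⁻³, batch size 256). A simple learnable affine map stands in for the paper's 128-bin rational-quadratic spline, and all class and function names (`GaussianizationLayer`, `GaussianizationFlow`) are hypothetical.

```python
import math
import torch
import torch.nn as nn

def random_rotation(dim: int) -> torch.Tensor:
    # Haar-distributed orthogonal matrix via QR of a Gaussian matrix.
    q, r = torch.linalg.qr(torch.randn(dim, dim))
    return q * torch.sign(torch.diagonal(r))

class GaussianizationLayer(nn.Module):
    """Fixed random rotation followed by the regularized elementwise map
    psi(x, alpha) = (1 - alpha) * f(x) + alpha * x.
    NOTE: f is a stand-in learnable affine map here, NOT the paper's
    128-bin rational-quadratic spline."""

    def __init__(self, dim: int, alpha: float = 0.9):
        super().__init__()
        self.register_buffer("rotation", random_rotation(dim))
        self.alpha = alpha
        self.log_scale = nn.Parameter(torch.zeros(dim))
        self.shift = nn.Parameter(torch.zeros(dim))

    def forward(self, x):
        x = x @ self.rotation.T  # orthogonal, so it contributes log|det| = 0
        f_x = x * self.log_scale.exp() + self.shift
        z = (1 - self.alpha) * f_x + self.alpha * x
        # Elementwise derivative of psi; strictly positive by construction.
        deriv = (1 - self.alpha) * self.log_scale.exp() + self.alpha
        log_det = deriv.log().sum().expand(x.shape[0])
        return z, log_det

class GaussianizationFlow(nn.Module):
    def __init__(self, dim: int, n_layers: int):
        super().__init__()
        self.layers = nn.ModuleList(
            [GaussianizationLayer(dim) for _ in range(n_layers)]
        )

    def forward(self, x):
        total_log_det = x.new_zeros(x.shape[0])
        for layer in self.layers:
            x, log_det = layer(x)
            total_log_det = total_log_det + log_det
        return x, total_log_det

def nll(z, log_det):
    # Negative log-likelihood under a standard normal base density.
    d = z.shape[-1]
    log_pz = -0.5 * (z ** 2).sum(-1) - 0.5 * d * math.log(2 * math.pi)
    return -(log_pz + log_det).mean()

# Quoted training settings: Adam, learning rate 1e-3, batch size 256.
dim = 16
flow = GaussianizationFlow(dim, n_layers=4)
optimizer = torch.optim.Adam(flow.parameters(), lr=1e-3)
for step in range(100):
    x = torch.randn(256, dim) * 2.0 + 1.0  # toy non-standard Gaussian data
    loss = nll(*flow(x))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Because each rotation is orthogonal, it adds nothing to the log-determinant, so the loss reduces to the standard normalizing-flow negative log-likelihood accumulated over the elementwise maps.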