Fundamental Limits of Two-layer Autoencoders, and Achieving Them with Gradient Methods
Authors: Aleksandr Shevchenko, Kevin Kögler, Hamed Hassani, Marco Mondelli
ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Going beyond the Gaussian assumption on the data, we provide numerical validation to our theoretical predictions on standard datasets, both in the isotropic case (Figure 1) and for general covariance (Figure 2). Additional numerical results together with the details of the experimental setting are in Appendix I. |
| Researcher Affiliation | Academia | ISTA, Klosterneuburg, Austria; Department of Electrical and Systems Engineering, University of Pennsylvania, USA. |
| Pseudocode | No | The paper describes algorithms such as projected gradient descent in text and mathematical formulas but does not provide a clearly labeled pseudocode or algorithm block (a hedged sketch of the general technique follows the table). |
| Open Source Code | No | The paper does not contain an explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | Going beyond the Gaussian assumption on the data, we provide numerical validation to our theoretical predictions on standard datasets, both in the isotropic case (Figure 1) and for general covariance (Figure 2). Additional numerical results together with the details of the experimental setting are in Appendix I. ... Figures 1-2 (as well as those in Appendix I) show an excellent agreement between our predictions (using the empirical covariance of the data) and the performance of autoencoders trained on standard datasets (CIFAR-10, MNIST). |
| Dataset Splits | No | The paper mentions augmenting the data and preprocessing steps but does not specify explicit training, validation, and test splits (e.g., 80/10/10) or refer to standard predefined splits for the datasets used. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU or CPU models, memory, or cloud computing instance types. |
| Software Dependencies | No | The paper mentions its software setup only in passing and does not list the libraries, frameworks, or version numbers needed to replicate the experiments. |
| Experiment Setup | Yes | For the numerical experiments, we pick $\tau \in [0.01, 0.2]$, with the exact value depending on the specific setting. ... For the experiments on natural images, we augment the data of each class 15 times. ... The whitening procedure used in the experiments concerning isotropic data is performed as follows: given the centered augmented data $X \in \mathbb{R}^{n_{\mathrm{samples}} \times d}$, we compute its empirical covariance matrix $\hat{\Sigma} = \frac{1}{n_{\mathrm{samples}}-1} \sum_{i=1}^{n_{\mathrm{samples}}} X_{i,:} X_{i,:}^{\top}$, and then we multiply each input by its inverse square root, i.e., $\hat{X}_{i,:} = \hat{\Sigma}^{-1/2} X_{i,:}$ (sketched in code below). |
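The whitening step quoted in the Experiment Setup row is concrete enough to sketch. Below is a minimal NumPy illustration of that procedure; it is not the authors' code (none is released, per the Open Source Code row), and the small regularizer `eps` is an added assumption to keep the inverse square root numerically stable.

```python
import numpy as np

def whiten(X, eps=1e-8):
    """Whiten data X of shape (n_samples, d): center it, estimate the
    empirical covariance, and multiply each row by the covariance's
    inverse square root (computed via eigendecomposition)."""
    n_samples = X.shape[0]
    X = X - X.mean(axis=0)                 # center the data
    cov = (X.T @ X) / (n_samples - 1)      # empirical covariance, shape (d, d)
    eigvals, eigvecs = np.linalg.eigh(cov) # symmetric eigendecomposition
    inv_sqrt = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    return X @ inv_sqrt                    # row i becomes Sigma^{-1/2} X_i
```

Since `inv_sqrt` is symmetric, the row-wise product `X @ inv_sqrt` matches the quoted per-sample transform $\hat{X}_{i,:} = \hat{\Sigma}^{-1/2} X_{i,:}$.

The Pseudocode row notes that the paper describes projected gradient descent in text rather than as labeled pseudocode. As a rough illustration of the general technique, here is a generic PGD loop with a unit-sphere constraint; the projection set, learning rate, and step count are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def projected_gradient_descent(grad, w0, lr=0.05, n_steps=1000):
    """Minimize a loss by alternating gradient steps with a projection;
    the constraint set here is the unit sphere, so the projection is a
    renormalization of the iterate."""
    w = w0 / np.linalg.norm(w0)            # start on the sphere
    for _ in range(n_steps):
        w = w - lr * grad(w)               # unconstrained gradient step
        w = w / np.linalg.norm(w)          # project back onto the sphere
    return w

# Example: minimize w^T A w on the sphere; the iterate converges to the
# eigenvector of A with the smallest eigenvalue.
A = np.diag([3.0, 1.0, 0.5])
w_star = projected_gradient_descent(lambda w: 2 * A @ w, np.ones(3))
```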