Fundamental Limits of Two-layer Autoencoders, and Achieving Them with Gradient Methods

Authors: Aleksandr Shevchenko, Kevin Kögler, Hamed Hassani, Marco Mondelli

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Going beyond the Gaussian assumption on the data, we provide numerical validation to our theoretical predictions on standard datasets, both in the isotropic case (Figure 1) and for general covariance (Figure 2). Additional numerical results together with the details of the experimental setting are in Appendix I.
Researcher Affiliation | Academia | 1: ISTA, Klosterneuburg, Austria; 2: Department of Electrical and Systems Engineering, University of Pennsylvania, USA.
Pseudocode | No | The paper describes algorithms such as projected gradient descent in text and mathematical formulas but does not provide a clearly labeled pseudocode or algorithm block.
Open Source Code | No | The paper does not contain an explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | Going beyond the Gaussian assumption on the data, we provide numerical validation to our theoretical predictions on standard datasets, both in the isotropic case (Figure 1) and for general covariance (Figure 2). Additional numerical results together with the details of the experimental setting are in Appendix I. ... Figures 1-2 (as well as those in Appendix I) show an excellent agreement between our predictions (using the empirical covariance of the data) and the performance of autoencoders trained on standard datasets (CIFAR-10, MNIST).
Dataset Splits | No | The paper mentions augmenting the data and preprocessing steps but does not specify explicit training, validation, and test splits (e.g., 80/10/10) or refer to standard predefined splits for the datasets used.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU or CPU models, memory, or cloud computing instance types.
Software Dependencies | No | The paper mentions the software it uses only in passing and does not list specific dependencies or version numbers needed to reproduce the experiments.
Experiment Setup | Yes | For the numerical experiments, we pick $\tau \in [0.01, 0.2]$, with the exact value depending on the specific setting. ... For the experiments on natural images, we augment the data of each class 15 times. ... The whitening procedure used in the experiments concerning isotropic data is performed as follows: given the centered augmented data $X \in \mathbb{R}^{n_{\mathrm{samples}} \times d}$, we compute its empirical covariance matrix given by $\hat{\Sigma} = \frac{1}{n_{\mathrm{samples}}-1} \sum_{i=1}^{n_{\mathrm{samples}}} X_{i,:} X_{i,:}^{\top}$, and then we multiply each input by the inverse square root of it, i.e., $\hat{X}_{i,:} = \hat{\Sigma}^{-1/2} X_{i,:}$.
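
The whitening step quoted in the Experiment Setup row is straightforward to reproduce. Below is a minimal NumPy sketch, assuming the augmented data has already been flattened and centered; the function name `whiten` and the eigendecomposition route for the inverse square root are illustrative choices, since the paper's own code is not released and the exact numerics are not specified.

```python
import numpy as np

def whiten(X: np.ndarray) -> np.ndarray:
    """Whiten centered data X of shape (n_samples, d).

    Sketch of the procedure quoted above: form the empirical covariance
    Sigma_hat = (1 / (n_samples - 1)) * sum_i X_{i,:} X_{i,:}^T and multiply
    each row by Sigma_hat^{-1/2}. Computing the inverse square root via an
    eigendecomposition is an assumption, not the authors' stated method.
    """
    n_samples = X.shape[0]
    sigma_hat = (X.T @ X) / (n_samples - 1)       # empirical covariance, (d, d)
    eigvals, eigvecs = np.linalg.eigh(sigma_hat)  # symmetric PSD, so eigh applies
    # Assumes sigma_hat is positive definite (enough augmented samples);
    # otherwise small eigenvalues would need clipping or regularization.
    inv_sqrt = eigvecs @ np.diag(1.0 / np.sqrt(eigvals)) @ eigvecs.T
    return X @ inv_sqrt                           # row i becomes Sigma_hat^{-1/2} X_{i,:}
```

On MNIST or CIFAR-10, this would be applied to the centered, augmented image vectors before training the autoencoder, matching the isotropic-data experiments described in Appendix I.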