Compression of Structured Data with Autoencoders: Provable Benefit of Nonlinearities and Depth

Authors: Kevin Kögler, Aleksandr Shevchenko, Hamed Hassani, Marco Mondelli

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on synthetic data confirm our findings, and similar phenomena are displayed when running gradient descent to compress CIFAR-10/MNIST images. Taken together, our results show that, for the compression of structured data, a more expressive decoding architecture provably improves performance. This is in sharp contrast with the compression of unstructured, Gaussian data where, as discussed in Section 6 of (Shevchenko et al., 2023), multiple decoding layers do not help.
Researcher Affiliation | Academia | (1) ISTA, Klosterneuburg, Austria; (2) Department of Electrical and Systems Engineering, University of Pennsylvania, USA.
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described, nor does it explicitly state that code will be released or is available.
Open Datasets | Yes | We validate our findings on image datasets, such as CIFAR-10 and MNIST.
Dataset Splits | No | The paper mentions SGD training and evaluation on datasets but does not explicitly provide details about train/validation/test splits, percentages, or methodology for splitting.
Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU/CPU models, memory) used for running its experiments, only general mentions of gradient descent training.
Software Dependencies | No | The paper mentions using a "straight-through approximation" with a temperature τ fixed to 0.1 for the sign activation, and `scipy.special.hyp1f1` for numerical evaluation, but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | To overcome this issue for SGD training of the models described in the main body, we use a straight-through (see for example (Yin et al., 2019)) approximation of it... For the experiments we fix the temperature τ to the value of 0.1. ... Let the step size η be Θ(1/d).
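
The experiment setup quoted above (a straight-through approximation of the sign activation with temperature τ = 0.1, and a step size η of order 1/d) can be illustrated with a minimal sketch. This is not the authors' code, which is not available. The tanh(x/τ) surrogate used for the backward pass, the function names, and the constant `c` in the step size are all assumptions made for illustration; the source only states that a straight-through approximation with temperature τ is used.

```python
import math

TAU = 0.1  # temperature value quoted in the experiment setup


def sign_forward(x):
    """Forward pass: hard sign activation, with sign(0) taken as 0."""
    return float((x > 0) - (x < 0))


def sign_backward(x, upstream_grad, tau=TAU):
    """Backward pass: use the derivative of the surrogate tanh(x / tau).

    The tanh surrogate is an assumption of this sketch, not a detail
    confirmed by the source.
    """
    t = math.tanh(x / tau)
    return upstream_grad * (1.0 - t * t) / tau


def step_size(d, c=1.0):
    """Step size eta = c / d, one instance of Theta(1/d); c is hypothetical."""
    return c / d


if __name__ == "__main__":
    x = 0.05
    print("forward:", sign_forward(x))
    print("surrogate gradient:", sign_backward(x, 1.0))
    print("eta for d = 100:", step_size(100))
```

With a small temperature the surrogate gradient is large near zero (equal to 1/τ at x = 0) and vanishes quickly away from it, so only pre-activations close to the decision boundary receive a meaningful update.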