Towards Empirical Sandwich Bounds on the Rate-Distortion Function
Authors: Yibo Yang, Stephan Mandt
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We estimate R-D sandwich bounds for a variety of artificial and real-world data sources, in settings far beyond the feasibility of any known method, and shed light on the optimality of neural data compression (Ballé et al., 2021; Yang et al., 2022). Our R-D upper bound on natural images indicates theoretical room for improving state-of-the-art image compression methods by at least one dB in PSNR at various bitrates. Our data and code can be found here. |
| Researcher Affiliation | Academia | Yibo Yang, Stephan Mandt Department of Computer Science, UC Irvine {yibo.yang,mandt}@uci.edu |
| Pseudocode | Yes | We give a more detailed derivation and pseudocode in Appendix A.4. Algorithm 1: Example implementation of the proposed algorithm for estimating rate-distortion lower bound R_L(D). |
| Open Source Code | Yes | Our data and code can be found here. |
| Open Datasets | Yes | We consider the Z-boson decay dataset from Howard et al. (2021)... We then repeat the same experiments on speech data from the Free Spoken Digit Dataset (Jackson et al., 2018)... We used images from the COCO 2017 (Lin et al., 2014) training set... evaluated them on the Kodak (1993) and Tecnick (Asuni & Giachetti, 2014) datasets. |
| Dataset Splits | Yes | We also use a small subset of the training data as a validation set for model selection, typically using the most expressive model we can afford without overfitting. |
| Hardware Specification | Yes | Our experiments on images were run on Titan RTX GPUs, while the rest of the experiments were run on Intel(R) Xeon(R) CPUs. |
| Software Dependencies | No | The paper mentions software like "Tensorflow library" and "tensorflow-compression library", but does not specify their version numbers. |
| Experiment Setup | Yes | We used the Adam optimizer for gradient based optimization in all our experiments, typically setting the learning rate between 1e-4 and 5e-4. Training the β-VAEs for the upper bounds required from a few thousand gradient steps on the lower-dimension problems (under an hour), to a few million gradient steps on the image compression problem (a couple of days; similar to reported in Minnen & Singh (2020)). We fixed k = 1024 for the results with varying n in Figure 2a-bottom. We parameterize log u by an MLP with 2 hidden layers with 20n hidden units each and SELU activation. |
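
The Experiment Setup row quotes the paper's training of β-VAEs for the R-D upper bound, using Adam with learning rates between 1e-4 and 5e-4. The sketch below illustrates what such an upper-bound objective could look like in TensorFlow: a rate term given by the KL divergence of the variational posterior from the prior, a squared-error distortion term, and a weighted sum traded off by a Lagrange multiplier. The encoder/decoder architectures, latent dimension, distortion metric, and trade-off weight are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch (assumed architecture) of a beta-VAE style R-D upper-bound objective:
#   rate       = E_x KL( q(z|x) || N(0, I) )   -- upper-bounds the mutual information I(X; Z)
#   distortion = E_x E_q ||x - decoder(z)||^2
#   loss       = rate + lam * distortion, with lam swept to trace out an R-D upper bound.
import tensorflow as tf

data_dim, latent_dim, lam = 16, 8, 50.0  # toy sizes and trade-off weight (assumptions)

encoder = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="selu"),
    tf.keras.layers.Dense(2 * latent_dim),      # outputs posterior mean and log-variance
])
decoder = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="selu"),
    tf.keras.layers.Dense(data_dim),
])
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-4)  # lr in [1e-4, 5e-4] per the paper

def train_step(x):
    with tf.GradientTape() as tape:
        mu, log_var = tf.split(encoder(x), 2, axis=-1)
        z = mu + tf.exp(0.5 * log_var) * tf.random.normal(tf.shape(mu))
        # KL( N(mu, diag(exp(log_var))) || N(0, I) ) in nats, per example
        kl = 0.5 * tf.reduce_sum(tf.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)
        rate = tf.reduce_mean(kl)
        distortion = tf.reduce_mean(tf.reduce_sum((x - decoder(z))**2, axis=-1))
        loss = rate + lam * distortion
    variables = encoder.trainable_variables + decoder.trainable_variables
    optimizer.apply_gradients(zip(tape.gradient(loss, variables), variables))
    return rate, distortion

# Example: one gradient step on random toy data
rate, distortion = train_step(tf.random.normal([32, data_dim]))
```

Sweeping `lam` over a range of values yields one (rate, distortion) point per setting, which together trace out an empirical upper bound on the R-D curve.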
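The same row also describes how the lower-bound estimator's log u function is parameterized: an MLP with 2 hidden layers of 20n units each and SELU activation, where n is the data dimension. A hypothetical Keras construction matching that description is shown below; the scalar output head, input shape handling, and the choice of Keras are assumptions made for illustration.

```python
import tensorflow as tf

def make_log_u_mlp(n: int) -> tf.keras.Model:
    """MLP parameterizing log u(y) as quoted in the paper:
    2 hidden layers with 20*n SELU units each.
    The scalar output layer and input shape are assumptions."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(n,)),
        tf.keras.layers.Dense(20 * n, activation="selu"),
        tf.keras.layers.Dense(20 * n, activation="selu"),
        tf.keras.layers.Dense(1),  # scalar log u(y) per input point
    ])

# Usage example on a toy 4-dimensional source
log_u = make_log_u_mlp(n=4)
print(log_u(tf.random.normal([8, 4])).shape)  # (8, 1)
```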