RECOMBINER: Robust and Enhanced Compression with Bayesian Implicit Neural Representations
Authors: Jiajun He, Gergely Flamich, Zongyu Guo, José Miguel Hernández-Lobato
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments across several data modalities, showcasing that RECOMBINER achieves competitive results with the best INR-based methods and even outperforms autoencoder-based codecs on low-resolution images at low bitrates. |
| Researcher Affiliation | Academia | Jiajun He, University of Cambridge (jh2383@cam.ac.uk); Gergely Flamich, University of Cambridge (gf332@cam.ac.uk); Zongyu Guo, University of Science and Technology of China (guozy@mail.ustc.edu.cn); José Miguel Hernández-Lobato, University of Cambridge (jmh233@cam.ac.uk) |
| Pseudocode | Yes | We present the pseudocode of this prior learning algorithm in Algorithm 1. Then, our training step is a three-step coordinate descent process analogous to Guo et al. (2023)'s: ... Algorithm 1 Training RECOMBINER: the prior, the linear transform A and upsampling network ϕ (a hedged Python sketch of this coordinate-descent loop is given after the table) |
| Open Source Code | Yes | Our PyTorch implementation is available at https://github.com/cambridge-mlg/RECOMBINER/. |
| Open Datasets | Yes | We evaluate RECOMBINER on the CIFAR-10 (Krizhevsky et al., 2009) and Kodak (Kodak, 1993) image datasets... Following the experimental set-up of Guo et al. (2023), we evaluate our method on the LibriSpeech (Panayotov et al., 2015) dataset... We evaluate RECOMBINER on the UCF-101 action recognition dataset (Soomro et al., 2012)... We evaluate RECOMBINER on the Saccharomyces cerevisiae proteome from the AlphaFold DB v4. |
| Dataset Splits | No | The paper describes training and test sets but does not explicitly provide specific details for a separate validation split (percentages, counts, or predefined splits) for hyperparameter tuning or early stopping. For example, for CIFAR-10, it states: 'It has a training set of 50,000 images and a test set of 10,000 images. We randomly select 15,000 images from the training set for the training stage and evaluate RD performance on all test images.' |
| Hardware Specification | Yes | The encoding speed is measured on a single NVIDIA A100-SXM-80GB GPU. On CIFAR-10 and protein structures, we compress signals in batch, with a batch size of 500 images and 1,000 structures, respectively. On Kodak, audio, and video datasets, we compress each signal separately. We should note that the batch size does not influence the results. Posteriors of signals within one batch are optimized in parallel, and their gradients are not crossed. The decoding speed is measured per signal on CPU. |
| Software Dependencies | No | The paper mentions 'Our PyTorch implementation' and that 'Video compression baselines are implemented by ffmpeg (Tomar, 2006)', but it does not specify exact version numbers for PyTorch or ffmpeg, nor does it list versions of other key software components. |
| Experiment Setup | Yes | In Section C.1 (Datasets and More Details on Experiments), the paper provides 'Table 2: Hyperparameters for images, audio, video, and protein structure compression.' The table lists detailed settings for the INR architecture (layers, hidden units, Fourier embedding dimension, output dimension, number of parameters), the training stage (training size, epochs, optimizer, sample size, initial posterior variance, initial posterior mean, initial Arls values, ϵC, β adjustment details), and the posterior inference and compression stage (gradient descent iterations, optimizer, sample size, blocks per signal, bits per block). (A configuration skeleton mirroring these groups is given after the table.) |
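
The three-step coordinate-descent procedure quoted in the Pseudocode row can be made concrete with a small sketch. The toy dimensions, the stand-in linear decoder (in place of the actual INR, the transform A's full structure, and the upsampling network ϕ), the fixed β, and all helper names below are illustrative assumptions, not the authors' implementation; the real code is in the linked repository.

```python
# Minimal, self-contained sketch of the three-step coordinate-descent loop
# summarised in the Pseudocode row. The toy sizes, the stand-in linear
# "decoder" (replacing the actual INR and upsampling network), the fixed beta,
# and the closed-form prior update below are illustrative assumptions.
import torch

torch.manual_seed(0)
N, D_LATENT, D_WEIGHTS, D_SIGNAL = 32, 16, 64, 128   # assumed toy sizes
signals = torch.randn(N, D_SIGNAL)                    # stand-in training signals

# Per-signal Gaussian posteriors over latent INR weight representations.
post_mu = torch.zeros(N, D_LATENT, requires_grad=True)
post_logvar = torch.full((N, D_LATENT), -4.0, requires_grad=True)

# Shared Gaussian prior over the latents, refit by moment matching in step 2.
prior_mu = torch.zeros(D_LATENT)
prior_logvar = torch.zeros(D_LATENT)

# Shared linear transform A mapping latents to INR weights, plus a toy decoder
# standing in for evaluating the INR on a coordinate grid.
A = torch.nn.Linear(D_LATENT, D_WEIGHTS)
decoder = torch.nn.Linear(D_WEIGHTS, D_SIGNAL)

beta = 1e-3  # rate-distortion trade-off; the paper adjusts beta during training
opt_post = torch.optim.Adam([post_mu, post_logvar], lr=1e-2)
opt_shared = torch.optim.Adam(list(A.parameters()) + list(decoder.parameters()), lr=1e-3)

def neg_elbo():
    """Distortion plus beta-weighted KL(q || p), averaged over training signals."""
    std = (0.5 * post_logvar).exp()
    w = post_mu + std * torch.randn_like(post_mu)      # reparameterised sample
    recon = decoder(A(w))
    distortion = ((recon - signals) ** 2).sum(dim=1)
    kl = 0.5 * ((post_logvar.exp() + (post_mu - prior_mu) ** 2) / prior_logvar.exp()
                + prior_logvar - post_logvar - 1.0).sum(dim=1)
    return (distortion + beta * kl).mean()

for step in range(100):
    # Step 1: update each training signal's variational posterior.
    opt_post.zero_grad()
    neg_elbo().backward()
    opt_post.step()

    # Step 2: refit the shared prior to the current posteriors (moment matching).
    with torch.no_grad():
        prior_mu = post_mu.mean(dim=0)
        prior_logvar = (post_logvar.exp().mean(dim=0) + post_mu.var(dim=0)).log()

    # Step 3: update the shared parameters (A and the decoder) with posteriors fixed.
    opt_shared.zero_grad()
    neg_elbo().backward()
    opt_shared.step()
```

The ordering mirrors the quoted training step: per-signal posteriors first, then a closed-form prior refit, then a gradient update of the shared modules.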
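
To complement the Experiment Setup row, here is a hypothetical configuration skeleton whose groups and field names follow the Table 2 summary above. Every value is a placeholder to be filled in per data modality from the paper; none of them are taken from Table 2 itself.

```python
# Hypothetical hyperparameter skeleton mirroring the groups reported from
# Table 2 in the Experiment Setup row. All values are placeholders, not
# numbers from the paper.
RECOMBINER_HYPERPARAMS = {
    "inr_architecture": {
        "layers": None,
        "hidden_units": None,
        "fourier_embedding_dim": None,
        "output_dim": None,
        "num_parameters": None,
    },
    "training_stage": {
        "training_size": None,
        "epochs": None,
        "optimizer": None,
        "sample_size": None,
        "initial_posterior_variance": None,
        "initial_posterior_mean": None,
        "initial_arls_values": None,   # field name as listed in the row
        "epsilon_c": None,             # field name as listed in the row
        "beta_adjustment": None,
    },
    "posterior_inference_and_compression": {
        "gradient_descent_iterations": None,
        "optimizer": None,
        "sample_size": None,
        "blocks_per_signal": None,
        "bits_per_block": None,
    },
}
```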