RECOMBINER: Robust and Enhanced Compression with Bayesian Implicit Neural Representations

Authors: Jiajun He, Gergely Flamich, Zongyu Guo, José Miguel Hernández-Lobato

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct extensive experiments across several data modalities, showcasing that RECOMBINER achieves competitive results with the best INR-based methods and even outperforms autoencoder-based codecs on low-resolution images at low bitrates.
Researcher Affiliation | Academia | Jiajun He (University of Cambridge, jh2383@cam.ac.uk); Gergely Flamich (University of Cambridge, gf332@cam.ac.uk); Zongyu Guo (University of Science and Technology of China, guozy@mail.ustc.edu.cn); José Miguel Hernández-Lobato (University of Cambridge, jmh233@cam.ac.uk)
Pseudocode | Yes | We present the pseudocode of this prior learning algorithm in Algorithm 1. Then, our training step is a three-step coordinate descent process analogous to Guo et al. (2023)'s: ... Algorithm 1: Training RECOMBINER: the prior, the linear transform A and upsampling network ϕ. (A toy coordinate-descent sketch follows the table.)
Open Source Code | Yes | Our PyTorch implementation is available at https://github.com/cambridge-mlg/RECOMBINER/.
Open Datasets | Yes | We evaluate RECOMBINER on the CIFAR-10 (Krizhevsky et al., 2009) and Kodak (Kodak, 1993) image datasets... Following the experimental set-up of Guo et al. (2023), we evaluate our method on the LibriSpeech (Panayotov et al., 2015) dataset... We evaluate RECOMBINER on the UCF-101 action recognition dataset (Soomro et al., 2012)... We evaluate RECOMBINER on the Saccharomyces cerevisiae proteome from the AlphaFold DB v4.
Dataset Splits | No | The paper describes training and test sets but does not explicitly provide specific details for a separate validation split (percentages, counts, or predefined splits) for hyperparameter tuning or early stopping. For example, for CIFAR-10, it states: 'It has a training set of 50,000 images and a test set of 10,000 images. We randomly select 15,000 images from the training set for the training stage and evaluate RD performance on all test images.' (A split sketch follows the table.)
Hardware Specification | Yes | The encoding speed is measured on a single NVIDIA A100-SXM-80GB GPU. On CIFAR-10 and protein structures, we compress signals in batch, with a batch size of 500 images and 1,000 structures, respectively. On Kodak, audio, and video datasets, we compress each signal separately. We should note that the batch size does not influence the results. Posteriors of signals within one batch are optimized in parallel, and their gradients are not crossed. The decoding speed is measured per signal on CPU. (An illustration of this gradient independence follows the table.)
Software Dependencies | No | The paper mentions 'Our PyTorch implementation' and that 'Video compression baselines are implemented by ffmpeg (Tomar, 2006)', but it does not specify exact version numbers for PyTorch or ffmpeg, nor does it list versions for other key software components.
Experiment Setup | Yes | In Section C.1 (DATASETS AND MORE DETAILS ON EXPERIMENTS), the paper provides a table titled 'Table 2: Hyperparameters for images, audio, video, and protein structure compression.' This table lists detailed settings for the INR Architecture (layers, hidden units, Fourier embeddings dimension, output dimension, number of parameters), the Training Stage (training size, epochs, optimizer, sample size, initial posterior variance, initial posterior mean, initial Arls values, ϵC, β adjustment details), and the Posterior Inferring and Compression Stage (gradient descent iteration, optimizer, sample size, blocks per signal, bits per block). (A skeleton of these hyperparameter groups follows the table.)
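The Pseudocode row above refers to a three-step coordinate-descent training procedure (posteriors, then shared parameters, then the prior). The toy PyTorch sketch below illustrates only that pattern, using diagonal Gaussian posteriors, a plain linear decoder as a stand-in for the transform A, and a moment-matched Gaussian prior update. It is not the paper's Algorithm 1: it omits the INR, Fourier embeddings, upsampling network, and relative-entropy coding, and every dimension, learning rate, and the β value is an illustrative assumption.

```python
import torch
import torch.nn as nn

# Toy sizes; illustrative only, not the paper's settings.
N_SIGNALS, LATENT_DIM, SIGNAL_DIM = 8, 16, 32
signals = torch.randn(N_SIGNALS, SIGNAL_DIM)

# Per-signal variational posteriors q_i = N(mu_i, diag(sigma_i^2)).
post_mu = nn.Parameter(torch.zeros(N_SIGNALS, LATENT_DIM))
post_log_var = nn.Parameter(torch.full((N_SIGNALS, LATENT_DIM), -4.0))

# Shared component (a stand-in for the linear transform A) and a Gaussian prior.
A = nn.Linear(LATENT_DIM, SIGNAL_DIM)
prior_mu = torch.zeros(LATENT_DIM)
prior_log_var = torch.zeros(LATENT_DIM)

def neg_elbo(beta=1e-2):
    """Beta-weighted negative ELBO: distortion + beta * KL(q || p)."""
    std = (0.5 * post_log_var).exp()
    w = post_mu + std * torch.randn_like(std)  # reparameterised sample
    distortion = ((A(w) - signals) ** 2).sum()
    kl = 0.5 * (
        (post_log_var.exp() + (post_mu - prior_mu) ** 2) / prior_log_var.exp()
        - 1.0 + prior_log_var - post_log_var
    ).sum()
    return distortion + beta * kl

post_opt = torch.optim.Adam([post_mu, post_log_var], lr=1e-2)
model_opt = torch.optim.Adam(A.parameters(), lr=1e-3)

for _ in range(100):
    # Step 1: update the per-signal posteriors, holding shared parameters fixed.
    post_opt.zero_grad(); neg_elbo().backward(); post_opt.step()
    # Step 2: update the shared transform, holding the posteriors fixed.
    model_opt.zero_grad(); neg_elbo().backward(); model_opt.step()
    # Step 3: refit the prior by moment-matching the aggregate posterior.
    with torch.no_grad():
        prior_mu = post_mu.mean(0)
        prior_log_var = (post_log_var.exp().mean(0)
                         + post_mu.var(dim=0, unbiased=False)).log()
```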
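Regarding the Dataset Splits row: the quoted CIFAR-10 protocol (randomly selecting 15,000 of the 50,000 training images and evaluating on all 10,000 test images) could be reproduced along the lines below. The torchvision loading code and the seed are our assumptions consistent with the quote, not the authors' released code.

```python
import torch
from torch.utils.data import Subset
from torchvision import datasets, transforms

# CIFAR-10 ships with 50,000 training and 10,000 test images.
train_set = datasets.CIFAR10("data", train=True, download=True,
                             transform=transforms.ToTensor())
test_set = datasets.CIFAR10("data", train=False, download=True,
                            transform=transforms.ToTensor())

# Randomly pick 15,000 training images for the prior-learning stage
# (the seed is an arbitrary choice; the paper does not report one).
gen = torch.Generator().manual_seed(0)
subset_idx = torch.randperm(len(train_set), generator=gen)[:15_000]
prior_training_set = Subset(train_set, subset_idx.tolist())

# RD performance is then evaluated on all 10,000 test images;
# no separate validation split is described in the paper.
```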
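The Hardware Specification row notes that signals are compressed in batches but that "their gradients are not crossed", so batch size does not affect the results. The toy snippet below (a stand-in quadratic loss, not RECOMBINER's objective) shows why: when each signal's loss depends only on its own posterior parameters, the summed batch loss yields exactly the same per-signal gradients as compressing each signal separately.

```python
import torch

BATCH, DIM = 500, 64            # batch size is illustrative
signals = torch.randn(BATCH, DIM)

# One row of posterior parameters per signal, optimised in parallel.
post_mu = torch.zeros(BATCH, DIM, requires_grad=True)
opt = torch.optim.Adam([post_mu], lr=1e-2)

for _ in range(100):
    opt.zero_grad()
    # Summing per-signal losses keeps the gradient block-diagonal:
    # d(loss)/d(post_mu[i]) involves only signals[i], so the result is
    # identical to optimising each signal's posterior on its own.
    loss = ((post_mu - signals) ** 2).sum()
    loss.backward()
    opt.step()
```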
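Finally, the Experiment Setup row lists the hyperparameter groups of the paper's Table 2. Purely as a reading aid, the skeleton below mirrors those groups; all values are deliberately left as placeholders because the per-modality numbers live in the paper, and the key names are our paraphrases of the quoted headings.

```python
# Skeleton of the hyperparameter groups quoted from Table 2 of the paper.
# Values are left as None; consult the paper for each data modality.
hyperparameters = {
    "inr_architecture": {
        "layers": None,
        "hidden_units": None,
        "fourier_embedding_dim": None,
        "output_dim": None,
        "num_parameters": None,
    },
    "training_stage": {
        "training_size": None,
        "epochs": None,
        "optimizer": None,
        "sample_size": None,
        "initial_posterior_variance": None,
        "initial_posterior_mean": None,
        "epsilon_C": None,
        "beta_adjustment": None,
    },
    "posterior_inference_and_compression": {
        "gradient_descent_iterations": None,
        "optimizer": None,
        "sample_size": None,
        "blocks_per_signal": None,
        "bits_per_block": None,
    },
}
```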