Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Overfitting for Fun and Profit: Instance-Adaptive Data Compression
Authors: Ties van Rozendaal, Iris AM Huijben, Taco Cohen
ICLR 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate an image compression model on I-frames (sampled at 2 fps) from videos of the Xiph dataset, and demonstrate that full-model adaptation improves RD performance by 1 dB, with respect to encoder-only finetuning. |
| Researcher Affiliation | Collaboration | Ties van Rozendaal Qualcomm AI Research EMAIL Iris A.M. Huijben Qualcomm AI Research Department of Electrical Engineering Eindhoven University of Technology EMAIL Taco S. Cohen Qualcomm AI Research EMAIL |
| Pseudocode | Yes | The paper includes 'Algorithm 1 Encoding of x' and 'Algorithm 2 Decoding of x' sections on page 4. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing open-source code or links to a code repository. |
| Open Datasets | Yes | The CLIC19 dataset is referenced as a public dataset: 'CLIC19 1 The CLIC19 dataset contains a collection of natural high resolution images. [...] 1https://www.compression.cc/2019/challenge/'. The Xiph dataset is also referenced: 'Xiph-5N 2fps 2 The Xiph dataset contains a variety of videos of different formats. [...] 2https://media.xiph.org/video/derf/' |
| Dataset Splits | Yes | The paper states: 'The corresponding validation folds are used to validate the global model performance.' for the CLIC19 dataset, and 'Xiph-5N 2fps is used to validate our instance-adaptive data compression framework.' |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU or CPU models, or cloud instance types used for running experiments. |
| Software Dependencies | No | The paper mentions 'Adam optimizer' (Kingma & Ba, 2014) and 'PyTorch' is implied by the context of neural networks, but no specific version numbers are provided for any software dependencies or libraries. |
| Experiment Setup | Yes | For encoder-only tuning we use a constant learning rate of 1e-6, whereas for latent optimization a learning rate of 5e-4 is used for the low bitrate region (i.e. two highest β values), and 1e-3 for the high rate region. ... The training objective in eq. (2) is expressed in bits per pixel and optimized using a fixed learning rate of 1e-4. The parameters for the model prior were chosen as follows: quantization bin width t = 0.005, standard deviation σ = 0.05, and the multiplicative factor of the spike α = 1000. ... All finetuning experiments (both encoding-only and full-model) ran for 100k steps, each containing one mini-batch of a single, full resolution I-frame. We used the Adam optimizer (default settings) (Kingma & Ba, 2014), and best model selection was based on the RDM̄ loss over the set of I-frames. |
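
Because the paper releases no code, anyone reproducing the experiments has to re-enter the hyperparameters quoted in the Experiment Setup row by hand. The sketch below gathers them in one place. It is not the authors' implementation: the class name, field names, and the `latent_lr` helper are hypothetical, and the β values in the usage note are placeholders, since the table does not list the actual ones.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class FinetuneConfig:
    """Hyperparameters quoted in the Experiment Setup row (field names are hypothetical)."""

    encoder_only_lr: float = 1e-6      # constant learning rate for encoder-only tuning
    latent_lr_low_rate: float = 5e-4   # latent optimization, low bitrate region
    latent_lr_high_rate: float = 1e-3  # latent optimization, high rate region
    eq2_objective_lr: float = 1e-4     # fixed learning rate for the eq. (2) objective
    quant_bin_width: float = 0.005     # t, quantization bin width of the model prior
    prior_sigma: float = 0.05          # sigma, standard deviation of the model prior
    spike_alpha: float = 1000.0        # alpha, multiplicative factor of the spike
    num_steps: int = 100_000           # finetuning steps (encoder-only and full-model)
    frames_per_batch: int = 1          # one full-resolution I-frame per mini-batch


def latent_lr(cfg: FinetuneConfig, beta: float, betas: Sequence[float]) -> float:
    """Pick the latent-optimization learning rate for one rate-distortion trade-off.

    The quoted text treats the two highest beta values as the low bitrate region.
    """
    low_rate_betas = sorted(betas)[-2:]
    if beta in low_rate_betas:
        return cfg.latent_lr_low_rate
    return cfg.latent_lr_high_rate
```

With a placeholder list such as `betas = [1e-4, 1e-3, 1e-2, 1e-1]`, `latent_lr(FinetuneConfig(), 1e-1, betas)` selects the low-bitrate rate and `latent_lr(FinetuneConfig(), 1e-4, betas)` selects the high-rate one. Adam itself needs no entries here because the quoted setup uses its default settings.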