Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
MoRIC: A Modular Region-based Implicit Codec for Image Compression
Authors: Gen Li, Haotian Wu, Deniz Gunduz
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate MoRIC on DAVIS [39], Kodak [28] and CLIC2020 datasets [52], comparing it with classical codecs (VTM [9]), autoencoder-based neural codecs (EVC [54], MLIC++ [26]), and recent overfitted codecs (C3 [27], COOL-CHIC v4 [32], and Lottery Codec [56]). RD performance is assessed using peak signal-to-noise ratio (PSNR) on RGB channels and BD-rate [20], while MACs per pixel and coding latency are reported to evaluate decoding complexity and coding efficiency. |
| Researcher Affiliation | Academia | Department of Electrical and Electronic Engineering, Imperial College London, London SW7 2AZ, U.K. |
| Pseudocode | Yes | Algorithm 1: C coding of MoRIC; Algorithm 2: Encoding stage of the MoRIC; Algorithm 3: Decoding stage of the MoRIC |
| Open Source Code | Yes | We include an open-resource section to introduce the open-sourced components of our work. The code and model checkpoints are submitted as part of the supplementary materials. A more complete version of the codebase, including detailed documentation and usage instructions, will be released publicly after the review process. Code and checkpoints are released in the project page. |
| Open Datasets | Yes | We evaluate MoRIC on DAVIS [39], Kodak [28] and CLIC2020 datasets [52]. (a) Datasets. We open-source our region-based coding dataset, including the DAVIS set (18 images) used in our experiments and our region segmentation results for the Kodak (24 images) and CLIC (41 images) datasets. |
| Dataset Splits | No | The paper mentions specific datasets (DAVIS, Kodak, CLIC2020) and the number of images in some of them. It describes how regions were selected or used from these datasets. However, it does not provide specific training/test/validation splits (e.g., percentages, sample counts, or references to predefined splits with citations) for reproducing the model training or evaluation process beyond merely stating the datasets used. |
| Hardware Specification | Yes | Table 5: Average coding time for single-object compression on the DAVIS dataset using an NVIDIA RTX 3090 GPU and Intel Core i9-10980XE CPU @ 3.00GHz. Table 10: Average coding time for standard full image compression on Kodak dataset using an NVIDIA RTX 3090 GPU and Intel Core i9-10980XE CPU @ 3.00GHz. |
| Software Dependencies | Yes | For VTM, we use the CompressAI library [4] to implement VTM-19.1 (YUV 10 bits), with additional details (code and datapoints) provided in the supplementary materials. |
| Experiment Setup | Yes | Table 2: Hyper-parameter settings (where two values are given, they are initial → final). Values of λ: {2e-2, 8e-3, 4e-3, 2e-3, 1e-3, 5e-4, 2e-4}. Quantization Stage I: number of encoding steps 10^5; learning rate β 10^-2 → 0; scheduler for learning rate: cosine scheduler; temperature T for soft rounding 0.3 → 0.1; noise strength α for Kumaraswamy noise 2.0 → 1.0; scheduler for soft rounding and Kumaraswamy noise: linear scheduler. Quantization Stage II: number of encoding steps 10^4; learning rate 10^-4 → 10^-8; decay learning rate if loss has not improved for this many steps: 40; decay factor 0.8; temperature T for soft rounding 10^-4. Architecture, entropy model: alternative values of c for ARM-c model: {8/16/24/32}; activation function: GELU; log-scale of Laplace is shifted before exp: 4; scale parameter of Laplace is clipped to [10^-2, 150]. Architecture, latent modulations: number of latent vectors L: 7; initialization of z: 0. Architecture, Global Modulation Network (single-object vs. standard): upsampling kernel 4×4 vs. 8×8; output channels of the 1×1 convolutions {7 3/5 3} vs. {7 18/24/30 3}; output channels of the 3×3 convolutions {3 3}; modulation vector dimension {3 3}. Architecture, LSN: output dimensions of each layer {2 3 3 3}; output of each shared modulator (object) {6/8 3}; out-channels of each shared modulator (standard) {24/30/36 3}; {27/33/39 3}; {30/36/42 3}. |
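
The Stage I schedules quoted in the Experiment Setup row can be illustrated with a small sketch. It assumes the textbook forms of a cosine and a linear schedule and reads the garbled exponents as 10^5 steps and a learning rate going from 10^-2 to 0. The function names are hypothetical and the authors' exact implementation is not shown on this page, so treat this as an illustration rather than the paper's code.

```python
import math

# Assumption: "105" in the extracted table means 10^5 encoding steps.
STAGE1_STEPS = 100_000


def cosine_schedule(step, total_steps, initial, final):
    """Standard cosine interpolation from `initial` (step 0) to `final` (last step)."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return final + 0.5 * (initial - final) * (1.0 + math.cos(math.pi * frac))


def linear_schedule(step, total_steps, initial, final):
    """Straight-line interpolation from `initial` (step 0) to `final` (last step)."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return initial + (final - initial) * frac


def stage1_hyperparameters(step):
    """Hypothetical helper: Stage I values at a given encoding step."""
    return {
        # Learning rate: 10^-2 -> 0 with a cosine scheduler.
        "learning_rate": cosine_schedule(step, STAGE1_STEPS, 1e-2, 0.0),
        # Soft-rounding temperature T: 0.3 -> 0.1 with a linear scheduler.
        "soft_round_temperature": linear_schedule(step, STAGE1_STEPS, 0.3, 0.1),
        # Kumaraswamy noise strength alpha: 2.0 -> 1.0 with a linear scheduler.
        "kumaraswamy_noise_strength": linear_schedule(step, STAGE1_STEPS, 2.0, 1.0),
    }


if __name__ == "__main__":
    for s in (0, STAGE1_STEPS // 2, STAGE1_STEPS):
        print(s, stage1_hyperparameters(s))
```

Under these assumptions the learning rate falls slowly at first, fastest around the midpoint, and flattens out near zero, while the temperature and noise strength decrease at a constant rate.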