Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
OSCAR: One-Step Diffusion Codec Across Multiple Bit-rates
Authors: Jinpei Guo, Yifei Ji, Zheng Chen, Kai Liu, Min Liu, Wang Rao, Wenbo Li, Yong Guo, Yulun Zhang
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that OSCAR achieves superior performance in both quantitative and visual quality metrics. The code and models are available at https://github.com/jp-guo/OSCAR. |
| Researcher Affiliation | Collaboration | Jinpei Guo¹,², Yifei Ji², Zheng Chen², Kai Liu², Min Liu¹, Wang Rao¹, Wenbo Li³, Yong Guo⁴, Yulun Zhang². ¹Carnegie Mellon University, ²Shanghai Jiao Tong University, ³Joy Future Academy, ⁴South China University of Technology |
| Pseudocode | Yes | An overview of our OSCAR is shown in Fig. 3, and the corresponding pseudocode is provided in the Appendix. |
| Open Source Code | Yes | The code and models are available at https://github.com/jp-guo/OSCAR. |
| Open Datasets | Yes | We curate the training data from DF2K, which combines 800 images from DIV2K [2], 2,650 from Flickr2K [48], and an additional subset from LSDIR [33], resulting in 88,441 high-quality images in total. For evaluation, we benchmark OSCAR on three standard datasets: Kodak [18] (24 natural images, 768 × 512), DIV2K-val [2] (100 images), and CLIC2020 [49] (428 images) |
| Dataset Splits | Yes | We curate the training data from DF2K, which combines 800 images from DIV2K [2], 2,650 from Flickr2K [48], and an additional subset from LSDIR [33], resulting in 88,441 high-quality images in total. For evaluation, we benchmark OSCAR on three standard datasets: Kodak [18] (24 natural images, 768 × 512), DIV2K-val [2] (100 images), and CLIC2020 [49] (428 images), where all DIV2K-val and CLIC2020 images are center-cropped to 1024 × 1024. |
| Hardware Specification | Yes | Training is conducted for 10,000 iterations on a single NVIDIA RTX A6000 GPU. The second stage is trained for 150,000 iterations using eight NVIDIA RTX A6000 GPUs. |
| Software Dependencies | No | Our approach builds upon Stable Diffusion 2.1 [42], which comprises approximately 965.9M parameters. During the first training phase (Section 3.3), we optimize all hyper-encoders in parallel using the Adam optimizer [28]... In the second stage, we train our model with the AdamW optimizer [35]... We apply LoRA [25] with a rank of 16... |
| Experiment Setup | Yes | During the first training phase (Section 3.3), we optimize all hyper-encoders in parallel using the Adam optimizer [28], with a fixed learning rate of 2 × 10⁻⁵ and a batch size of 64. Training is conducted for 10,000 iterations on a single NVIDIA RTX A6000 GPU. In the second stage, we train our model with the AdamW optimizer [35], setting the learning rate to 5 × 10⁻⁵, weight decay to 1 × 10⁻⁵, and batch size to 64 for both OSCAR and the discriminator. We apply LoRA [25] with a rank of 16 for efficient fine-tuning. The discriminator follows the same training protocol as OSCAR. The perceptual loss weight λ1 is set to 1, and the adversarial loss weight λ2 is 5 × 10⁻³. This stage is trained for 150,000 iterations using eight NVIDIA RTX A6000 GPUs. |
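
The two-stage setup quoted in the Experiment Setup row can be collected into a small configuration sketch. This is only an illustration and is not taken from the OSCAR repository: the names `StageConfig`, `STAGE1`, `STAGE2`, `samples_seen`, and `approx_epochs` are hypothetical, the batch size of 64 is assumed to be the global batch size (the excerpt does not say whether it is per GPU), and the stage-1 weight decay is left unset because the excerpt does not report it.

```python
from dataclasses import dataclass
from typing import Optional

# Total number of training images, as stated in the Open Datasets row.
TRAIN_IMAGES = 88_441


@dataclass(frozen=True)
class StageConfig:
    """Hyperparameters for one training stage (hypothetical container)."""
    optimizer: str
    learning_rate: float
    batch_size: int        # assumed to be the global batch size
    iterations: int
    num_gpus: int
    weight_decay: Optional[float] = None  # None means "not reported"


# Stage 1: hyper-encoders trained with Adam on a single RTX A6000.
STAGE1 = StageConfig("Adam", 2e-5, 64, 10_000, 1)

# Stage 2: OSCAR and the discriminator trained with AdamW on eight RTX A6000s.
STAGE2 = StageConfig("AdamW", 5e-5, 64, 150_000, 8, weight_decay=1e-5)

# Remaining stage-2 settings quoted in the table.
LORA_RANK = 16
LAMBDA_PERCEPTUAL = 1.0    # lambda_1
LAMBDA_ADVERSARIAL = 5e-3  # lambda_2


def samples_seen(cfg: StageConfig) -> int:
    """Images processed in a stage, under the global-batch assumption."""
    return cfg.batch_size * cfg.iterations


def approx_epochs(cfg: StageConfig, dataset_size: int = TRAIN_IMAGES) -> float:
    """Rough number of passes over the training set."""
    return samples_seen(cfg) / dataset_size


if __name__ == "__main__":
    for name, cfg in (("stage 1", STAGE1), ("stage 2", STAGE2)):
        print(f"{name}: {samples_seen(cfg):,} samples, "
              f"about {approx_epochs(cfg):.1f} epochs")
```

Under the global-batch assumption, the second stage processes 15 times as many samples as the first. If 64 is instead a per-GPU batch size, the stage-2 figure would be eight times larger.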