Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
PocketSR: The Super-Resolution Expert in Your Pocket Mobiles
Authors: Haoze Sun, Linfeng Jiang, Fan Li, Renjing Pei, Zhixin Wang, Yong Guo, Jiaqi Xu, Haoyu Chen, Jin Han, Fenglong Song, Yujiu Yang, Wenbo Li
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through an in-depth analysis of each design component, we provide valuable insights for future research. PocketSR, with a model size of 146M parameters, processes 4K images in just 0.8 seconds, achieving a remarkable speedup over previous methods. Notably, it delivers performance on par with state-of-the-art single-step and even multi-step Real SR models, making it a highly practical solution for edge-device applications. Quantitative and Efficiency Comparison. Table 2 reports quantitative results on RealSR [53] and DRealSR [65], with efficiency metrics listed in the last five rows. The results show that our method achieves strong super-resolution performance with excellent computational efficiency. Ablation Study. We carefully analyze the effects of the proposed lite encoder and decoder, online annealing pruning, and multi-layer feature distillation, demonstrating an excellent trade-off between super-resolution quality and efficiency through extensive experiments. |
| Researcher Affiliation | Collaboration | Haoze Sun¹, Linfeng Jiang²*, Fan Li², Renjing Pei², Zhixin Wang², Yong Guo², Jiaqi Xu², Haoyu Chen⁴, Jin Han², Fenglong Song², Yujiu Yang¹✉, Wenbo Li³✉; ¹Tsinghua University, ²Huawei, ³Joy Future Academy, ⁴HKUST (GZ) |
| Pseudocode | No | The paper describes the methods and equations (e.g., Equation 1) but does not contain a clearly labeled 'Pseudocode' or 'Algorithm' block. The methodology is described through text and figures. |
| Open Source Code | No | Answer: [No] Justification: We are sorry, but due to company policy, we are unable to provide open source code to the community at this time. |
| Open Datasets | Yes | The training dataset comprises approximately 500K high-quality images from LSDIR [63] and 10K images from FFHQ [64]. For evaluation, we follow the protocols in [23, 27] and report results on the DRealSR [65] and RealSR [53] benchmarks. |
| Dataset Splits | Yes | The training dataset comprises approximately 500K high-quality images from LSDIR [63] and 10K images from FFHQ [64]. For evaluation, we follow the protocols in [23, 27] and report results on the DRealSR [65] and RealSR [53] benchmarks. |
| Hardware Specification | Yes | PocketSR, with a model size of 146M parameters, processes 4K images in just 0.8 seconds, achieving a remarkable speedup over previous methods. Notably, it delivers performance on par with state-of-the-art single-step and even multi-step Real SR models, making it a highly practical solution for edge-device applications. ... and processes a 512 × 512 image in 0.016 seconds on an A100 GPU |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used in the implementation (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | In the first training phase, we train the unpruned SD U-Net equipped with Lite ED for 80,000 steps. In the second phase, we first apply channel pruning over 80,000 steps, followed by module-wise online annealing pruning for an additional 8,000 steps. The total number of annealing steps is set to T = 8000. A fixed batch size of 64 is used throughout the entire training process. We employ the AdamW optimizer with a learning rate of 1 × 10⁻⁴, and the timestep is fixed at t = 999 for one-step diffusion. Additionally, the original text embedding is replaced with a learnable embedding vector. |
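
The schedule quoted in the Experiment Setup row can be written down as a small configuration sketch to make the step arithmetic explicit. This is only an illustration, not the authors' code (none is released): the phase names, the `Phase` class, and the helper functions are hypothetical, and the epoch estimate assumes every step draws a batch of 64 from the combined set of roughly 510K training images, which the quoted text does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    """One training stage; the names are illustrative labels, not the authors'."""
    name: str
    steps: int


# Step counts, batch size, learning rate, and timestep are quoted from the
# Experiment Setup row. The phase names and dataset total are assumptions.
PHASES = (
    Phase("unpruned_unet_with_lite_ed", 80_000),
    Phase("channel_pruning", 80_000),
    Phase("online_annealing_pruning", 8_000),
)
BATCH_SIZE = 64
LEARNING_RATE = 1e-4                    # used with AdamW
DIFFUSION_TIMESTEP = 999                # fixed for one-step diffusion
APPROX_TRAIN_IMAGES = 500_000 + 10_000  # LSDIR + FFHQ, both approximate


def total_steps(phases=PHASES):
    """Sum of optimizer steps over all phases."""
    return sum(p.steps for p in phases)


def samples_seen(phases=PHASES, batch_size=BATCH_SIZE):
    """Number of training samples consumed, assuming a fixed batch size."""
    return total_steps(phases) * batch_size


def approx_epochs(dataset_size=APPROX_TRAIN_IMAGES):
    """Rough number of passes over the data (assumes uniform sampling)."""
    return samples_seen() / dataset_size


def phase_at(step, phases=PHASES):
    """Return the name of the phase that a 0-based global step falls into."""
    if step < 0:
        raise ValueError("step must be non-negative")
    start = 0
    for phase in phases:
        if step < start + phase.steps:
            return phase.name
        start += phase.steps
    raise ValueError("step is past the end of the schedule")


if __name__ == "__main__":
    print("total steps:", total_steps())
    print("samples seen:", samples_seen())
    print("approx. epochs:", round(approx_epochs(), 1))
    print("phase at step 80000:", phase_at(80_000))
```

Under these assumptions the schedule totals 168,000 steps and about 10.75 million samples, or roughly 21 passes over the training images.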