Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Dr. RAW: Towards General High-Level Vision from RAW with Efficient Task Conditioning

Authors: Wenjun Huang, Ziteng Cui, Yinqiang Zheng, Yirui He, Tatsuya Harada, Mohsen Imani

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We demonstrate the effectiveness of Dr. RAW across 4 representative RAW-based high-level vision tasks under a total of 9 diverse conditions (see Fig. 1), including object detection [37; 20], semantic segmentation [15], instance segmentation [12] and pose estimation [30]. Our method not only outperforms previous SOTA approaches in accuracy but also achieves superior training efficiency.
Researcher Affiliation	Academia	Wenjun Huang University of California, Irvine Ziteng Cui The University of Tokyo Yinqiang Zheng The University of Tokyo Yirui He University of California, Irvine Tatsuya Harada The University of Tokyo, RIKEN AIP Mohsen Imani University of California, Irvine
Pseudocode	No	The paper describes the architecture of the re-mosaicing block and various components with equations and descriptions, but does not present any structured pseudocode or algorithm blocks. For instance, Appendix A details the 'Re-mosaicing Block Architecture' and includes equation (7), but it is not formatted as pseudocode.
Open Source Code	Yes	The source code is available here.
Open Datasets	Yes	We conducted experiments on semantic segmentation, object detection, instance segmentation, and pose estimation, utilizing a combination of various synthetic and real-world RAW image datasets. For object detection, we adopted 2 real-world datasets, PASCAL RAW [37; 15] and LOD [20]. For semantic segmentation, we utilized ADE20K RAW [15]. For instance segmentation, we utilized LIS [12]. As for pose estimation, we used ExLPose [30].
Dataset Splits	Yes	For object detection, we adopted 2 real-world datasets, PASCAL RAW [37; 15] and LOD [20]. LOD is a real-world dataset consisting of 2230 low-light condition RAW images taken by a Canon EOS 5D Mark IV camera with 8 object classes. We took 1800 images as the training set and the other 430 images as the test set. PASCAL RAW is a normal-light condition dataset with 4259 RAW images... The training and test split of ADE20K RAW is the same as ADE20K. ... As for pose estimation, we used ExLPose [30], which collected 2556 images of 251 scenes; 2,065 of 201 scenes are used for training, and the remaining 491 of 50 scenes are kept for testing.
Hardware Specification	Yes	All experiments were conducted on a server equipped with four NVIDIA RTX A6000 GPUs.
Software Dependencies	Yes	The software environment includes Python 3.8, PyTorch 1.12, MMDetection 3.3.0, MMSegmentation 1.2.1, and MMPose 1.3.2.
Experiment Setup	Yes	Implementation Details. Dr. RAW is built on the open-source computer vision toolboxes: mmdetection [11], mmsegmentation [13], and mmpose [14]. We conducted comparative experiments with the current SOTA methods. All comparison methods adopt the same data augmentation, mainly including random crop, random flip, multi-scale test, etc. We use mean Intersection over Union (mIoU) to evaluate semantic segmentation, and mean Average Precision (mAP) to evaluate instance segmentation, object detection, and pose estimation performance. The backbone of Dr. RAW is a Swin Transformer tiny (Swin-T) [35]. ... The experiments are repeated 5 times, and the reported number is the mean of all runs.
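
The split sizes quoted in the Dataset Splits row can be sanity-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the paper or of the index pipeline: the dictionary layout and the helper name check_splits are assumptions made for illustration, and only the numbers are taken from the row above. PASCAL RAW and ADE20K RAW are omitted because the quoted text does not give their train/test counts.

```python
# Split sizes copied from the "Dataset Splits" row above.
# The dictionary keys and structure are illustrative assumptions.
REPORTED_SPLITS = {
    "LOD (images)": {"total": 2230, "train": 1800, "test": 430},
    "ExLPose (images)": {"total": 2556, "train": 2065, "test": 491},
    "ExLPose (scenes)": {"total": 251, "train": 201, "test": 50},
}


def check_splits(splits):
    """Return {name: (consistent, train_fraction)} for each reported split.

    consistent is True when train + test equals the reported total;
    train_fraction is train / total rounded to three decimals.
    """
    report = {}
    for name, sizes in splits.items():
        consistent = sizes["train"] + sizes["test"] == sizes["total"]
        train_fraction = round(sizes["train"] / sizes["total"], 3)
        report[name] = (consistent, train_fraction)
    return report


if __name__ == "__main__":
    for name, (ok, frac) in check_splits(REPORTED_SPLITS).items():
        print(f"{name}: consistent={ok}, train fraction={frac}")
```

Under these assumptions, all three reported splits add up to their stated totals, and each uses roughly 80% of the data for training.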