Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Are Pixel-Wise Metrics Reliable for Computerized Tomography Reconstruction?

Authors: Tianyu Lin, Xinran Li, Chuntung Zhuang, Qi Chen, Yuanhao Cai, Kai Ding, Alan L. Yuille, Zongwei Zhou

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments show that applying CARE to preexisting sparse-view CT reconstruction methods produces striking gains (summarized in Figure 1; detailed in §4). Compared with nine preexisting methods, our CARE achieves up to +32% for large organs, +22% for small organs, +40% for intestinal structures, and +36% for vascular structures.
Researcher Affiliation	Academia	1Johns Hopkins University 2Yale University 3Johns Hopkins Medicine
Pseudocode	No	The paper describes the CARE framework in Section 3.3 and provides equations for the loss terms, but it does not include a clearly labeled pseudocode block or algorithm with structured steps formatted like code.
Open Source Code	Yes	Code, dataset, and models: https://github.com/MrGiovanni/CARE
Open Datasets	Yes	Code, dataset, and models: https://github.com/MrGiovanni/CARE
Dataset Splits	Yes	This dataset can be split into a training set of 3,151 CT scans and a testing set of 1,958 CT scans. For the anatomy-aware CT reconstruction framework CARE (3.3), we report the results of a subset of 36 CT scans (23 in arterial phase and 13 in portal venous phase), and the remaining 25 CT scans (13 in arterial phase and 12 in portal venous phase) were used for the training set.
Hardware Specification	Yes	All of these reconstruction experiments are run on an eight NVIDIA RTX 6000 GPU server, each with 48 GB of memory. All three training stages of CARE (A.3.3, A.3.4 and A.3.5) are done on an eight RTX 8000 GPUs server, each with 48 GB of memory.
Software Dependencies	Yes	We leverage nnU-Net (under Apache-2.0 license) [26] as an anatomy segmentator, trained on more than 3,000 voxel-wise annotated CT scans [33]... We use TIGRE [5, 6] package to generate synthetic projections following previous works [8, 61]. The autoencoder model is initialized with the checkpoints provided by Stable Diffusion v1.5 [42]. Our diffusion model is implemented by the diffusers [51] package with a backbone of Stable Diffusion v1.5 [42].
Experiment Setup	Yes	We set the weight of reconstruction loss and perceptual loss (detailed in D.1) to be λrec = λper = 1. The weight of the KL regularization term is set to β = 1 × 10^-6. The autoencoder model is trained on the JHH CT dataset (A.1.2) for 150,000 iterations. We use AdamW optimizer during training. The de-noising UNet is trained on the same JHH dataset as used in the autoencoder training, with AdamW optimizer and 50,000 training iterations. The model is finetuned on 25 CT scans (as mentioned in A.1.1) for 50,000 iterations with the AdamW optimizer for each given CT reconstruction method. We set the weights of the losses to λp = 1 and λs = 0.001.
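
For readers who want to carry the reported experiment setup into their own scripts, the hyperparameters quoted in the table can be collected as in the minimal Python sketch below. The weights, iteration counts, and optimizer name come from the quoted text. Everything else is an assumption made for illustration: the class and function names are hypothetical, and combining the loss terms as a plain weighted sum is assumed here rather than confirmed by the excerpt. It is not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AutoencoderConfig:
    """Values quoted in the Experiment Setup row; field names are hypothetical."""
    lambda_rec: float = 1.0      # weight of the reconstruction loss
    lambda_per: float = 1.0      # weight of the perceptual loss
    beta_kl: float = 1e-6        # weight of the KL regularization term
    iterations: int = 150_000    # autoencoder training iterations
    optimizer: str = "AdamW"


@dataclass(frozen=True)
class FinetuneConfig:
    """Fine-tuning values quoted in the Experiment Setup row; names are hypothetical."""
    lambda_p: float = 1.0        # weight reported as lambda_p
    lambda_s: float = 1e-3       # weight reported as lambda_s
    iterations: int = 50_000     # iterations per reconstruction method
    optimizer: str = "AdamW"


def autoencoder_loss(rec: float, per: float, kl: float,
                     cfg: AutoencoderConfig = AutoencoderConfig()) -> float:
    # Assumption: the three terms are combined as a weighted sum.
    return cfg.lambda_rec * rec + cfg.lambda_per * per + cfg.beta_kl * kl


def finetune_loss(p_term: float, s_term: float,
                  cfg: FinetuneConfig = FinetuneConfig()) -> float:
    # Assumption: the two fine-tuning terms are combined as a weighted sum.
    return cfg.lambda_p * p_term + cfg.lambda_s * s_term
```

With these weights the KL term and the λs term contribute very little unless their raw values are large, which is worth keeping in mind when comparing loss curves against a reimplementation.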