Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Information-Driven Design of Imaging Systems

Authors: Henry Pinkard, Leyla Kabuli, Eric Markley, Tiffany Chien, Jiantao Jiao, Laura Waller

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our information estimates accurately captured system performance differences across four imaging domains (color photography, radio astronomy, lensless imaging, and microscopy). Systems designed with IDEAL matched the performance of those designed with end-to-end optimization, the prevailing approach that jointly optimizes hardware and image processing algorithms. These results establish mutual information as a universal performance metric for imaging systems that enables both computationally efficient design optimization and evaluation in real-world conditions.
Researcher Affiliation	Academia	Department of Electrical Engineering and Computer Sciences, University of California, Berkeley
Pseudocode	No	The paper describes methods and mathematical frameworks, but does not contain any explicitly labeled "Pseudocode" or "Algorithm" blocks.
Open Source Code	Yes	A video summary of this work can be found at: https://waller-lab.github.io/EncodingInformationWebsite/ Project website https://waller-lab.github.io/EncodingInformationWebsite/ https://github.com/Waller-Lab/EncodingInformation.
Open Datasets	Yes	Color photography. Digital cameras encode color on monochrome sensors using color filter arrays in front of their pixels... Using natural images [62, 63] with simulated photon shot noise, we estimated information content for each design... Lensless imaging... We tested a traditional lens, random microlens array [67], and Gaussian diffuser [68] using natural images with simulated photon shot noise at various light levels. Higher information estimates consistently correlated with better reconstruction accuracy across all designs and noise conditions (Fig. S20). Coded illumination microscopy... Our information estimates correlated with protein prediction accuracy across three illumination patterns, enabling evaluation of microscopy designs without time-consuming and expensive protein labeling experiments (Fig. 2d). We tested brightfield, differential phase contrast [73, 74], and single-LED illumination measurements of white blood cells [75] with added simulated photon shot noise to equalize photon counts. S6.5.1 Color imaging dataset We conducted experiments using the Gehler-Shi dataset [62, 63] (implied license via explicit permission to use), which comprises 568 high-quality natural images. S6.5.3 Natural image dataset The CIFAR10 dataset [153] (implied license via permission to use) was used for the lensless imaging experiments. S3.3 Failures of stationary Gaussian estimates on highly non-Gaussian data... MNIST handwritten digits dataset [116] (GNU General Public License) S6.5.4 Cell imaging dataset We analyzed single leukocyte images and corresponding protein expression measurements from the Berkeley Single Cell Computational Microscopy (BSCCM) dataset [75] (CC0 1.0 Universal license).
Dataset Splits	Yes	S6.5.1 Color Filter Array: ...The dataset was partitioned into 461 training, 51 validation, and 56 test images, with each image subdivided into 24 × 24 pixel patches. S6.6 Information-Driven Encoder Analysis Learning (IDEAL): Using the Gehler-Shi dataset [62, 63] partitioned into training (461 images), validation (51 images), and test (56 images) sets, we extracted 100,000 patches of size 24 × 24 pixels for training, along with 10,000 patches from the validation set.
Hardware Specification	Yes	Table 1: Training times measured on 20 × 20 patches using an NVIDIA RTX A6000 GPU. S3.1 Fitting stationary Gaussian processes: ...taking 3 seconds to complete on an NVIDIA GeForce RTX 3090 GPU. S6.5.1 Color Filter Array: ...All replicates were trained in 5 hours on an Nvidia RTX A6000 GPU. S6.5.3 Image classification: ...trained on an NVIDIA TITAN XP GPU using <5GB memory, 10 minutes training time per model. S6.5.4 LED array microscopy: ...Typical trainings took 1-2 days on NVIDIA TITAN Xp GPUs with 12GB memory.
Software Dependencies	No	S6.1 Model training: All models were implemented in JAX/Flax for efficient training on GPUs. (No version numbers are provided for JAX/Flax or other libraries.)
Experiment Setup	Yes	S6.5.1 Color Filter Array: Neural network architecture and training Color image reconstruction was implemented using a bifurcated network architecture... Training used the Adam optimizer (β1 = 0.9, β2 = 0.999) with a learning rate of 1 × 10⁻⁵ and a batch size of 128 patches, running for a maximum of 100,000 training steps with checkpoints every 5,000 steps. S6.6 Information-Driven Encoder Analysis Learning (IDEAL): ...optimized using the AdamW optimizer (learning rate = 10⁻⁴, β1 = 0.9, β2 = 0.999) until convergence. S6.5.4 LED array microscopy: ...Training used the Adam optimizer with a learning rate of 5 × 10⁻⁵ and batches of 16 images.
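
The Dataset Splits row reports a 461 / 51 / 56 partition of the 568 Gehler-Shi images. A minimal sketch of how such a partition could be reproduced deterministically is given below. The function name `split_indices`, the shuffle, and the seed are assumptions made for illustration only; the excerpts above do not state how images were assigned to each split.

```python
import random

# Split sizes reported in the Dataset Splits row (568 images in total).
N_TRAIN, N_VAL, N_TEST = 461, 51, 56


def split_indices(n_images, n_train, n_val, n_test, seed=0):
    """Return disjoint (train, val, test) lists of image indices.

    Hypothetical helper: the seeded shuffle is an assumption, not the
    assignment procedure used in the paper.
    """
    if n_train + n_val + n_test != n_images:
        raise ValueError("split sizes must sum to the number of images")
    indices = list(range(n_images))
    random.Random(seed).shuffle(indices)  # deterministic for a fixed seed
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train_idx, val_idx, test_idx = split_indices(568, N_TRAIN, N_VAL, N_TEST)
print(len(train_idx), len(val_idx), len(test_idx))
```

Fixing the seed makes the partition repeatable across runs, which is the property a reader would need in order to re-create the same training, validation, and test sets.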