Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Robust Integrated Learning and Pauli Noise Mitigation for Parametrized Quantum Circuits

Authors: Md Mobasshir Arshed Naved, Wenbo Xie, Wojciech Szpankowski, Ananth Grama

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	4 Numerical Results 4.1 Experimental Setup and Evaluation We evaluate our learning algorithm (Algorithm 1) on a binary classification task on the standard MNIST dataset as in prior related efforts [3, 30], focusing on digits 3 and 6. Figure 4: Training performance of the binary classification using MNIST dataset in the quantum noise context. Each epoch summarizes 50 iterations of the training process. Table 1: Classification accuracies across Different methods
Researcher Affiliation	Academia	Md Mobasshir Arshed Naved Department of Computer Science Purdue University West Lafayette, USA EMAIL Wenbo Xie Department of Computer Science Purdue University West Lafayette, USA EMAIL Wojciech Szpankowski Department of Computer Science Purdue University West Lafayette, USA EMAIL Ananth Grama Department of Computer Science Purdue University West Lafayette, USA EMAIL
Pseudocode	Yes	Algorithm 1: Learning Algorithm; Algorithm 2: Universal Estimation Algorithm (universal_estimator); Algorithm 3: Estimation of ∂(y_t − tr(MUR(ρ_t)))² / ∂σ_{j,q} (sigma_grad_est); Algorithm 4: Estimation of ∂(y_t − tr(MUR(ρ_t)))² / ∂θ_j (theta_grad_est); Algorithm 5: Gradient Estimator (gradient_estimator)
Open Source Code	Yes	4. Experimental result reproducibility Question: Does the paper fully disclose all the information needed to reproduce the main experimental results of the paper to the extent that it affects the main claims and/or conclusions of the paper (regardless of whether the code and data are provided or not)? Answer: [Yes] Justification: Source code with reproduction instruction is provided as supplementary materials. 5. Open access to data and code Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] Justification: The source code is made available along with the submission.
Open Datasets	Yes	We evaluate our learning algorithm (Algorithm 1) on a binary classification task on the standard MNIST dataset as in prior related efforts [3, 30], focusing on digits 3 and 6. Additional results. We repeat the same protocol on the Fashion-MNIST dataset (Pullover vs Shirt) [32]; trends mirror MNIST under static (Figure 5a) and dynamic (Figure 5b) noise, and probabilistic subsampling behaves similarly (Figure 5c).
Dataset Splits	Yes	A total of 5000 input-label pairs are generated, which are randomly shuffled and split into 80% for training and 20% for testing. (for MNIST) We evaluate our approach on the Fashion-MNIST dataset [32] (Pullover vs. Shirt; 5000 training samples; standard test split; three seeds).
Hardware Specification	No	The framework is simulated on a multi-core CPU computing cluster, and the simulation is implemented using Qiskit [10].
Software Dependencies	No	The framework is simulated on a multi-core CPU computing cluster, and the simulation is implemented using Qiskit [10].
Experiment Setup	Yes	For the nm PQC, we employ a noisy 6-qubit hardware-efficient ansatz (HEA), as illustrated in Figure 3, consisting of two layers of parameterized single-qubit rotation gates RX, RY, and RZ, combined with circular entangling CNOT gates as the base PQC. At the end of the circuit, all qubits are measured in Z-basis. We determine the initial learning rate, η(1), via a coarse-to-fine search: a wide log-spaced grid sweep to localize a promising interval, followed by a binary search within that interval to refine the value. During training, we maintain a monotonically decreasing learning-rate schedule. For training we use proximal SGD to optimize the model parameters along with the inverse noise parameter (for our method). Model performance is evaluated by accuracy, while noise mitigation efficacy is assessed using mean squared error (MSE) over training epochs. Each experiment is repeated three times per setting, and the standard deviation is used to report variability.
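
Illustrative sketch (not from the paper): the Dataset Splits row reports 5000 input-label pairs that are randomly shuffled and split 80% / 20% for MNIST. The snippet below shows one way such a split could be done. The function name shuffle_split, the seed value, and the placeholder data are assumptions made for illustration; the authors' actual code may differ.

```python
import random


def shuffle_split(pairs, train_fraction=0.8, seed=0):
    """Shuffle (input, label) pairs and split them into train and test lists.

    The seed is an assumption; the quoted excerpt does not report one.
    """
    rng = random.Random(seed)
    shuffled = list(pairs)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Placeholder data standing in for the 5000 MNIST pairs (digits 3 and 6).
pairs = [(i, 3 if i % 2 == 0 else 6) for i in range(5000)]
train, test = shuffle_split(pairs)
print(len(train), len(test))  # 4000 1000
```

Fixing the seed makes the split repeatable, which matters when each setting is rerun several times and the standard deviation is reported.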