Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Online Experimental Design With Estimation-Regret Trade-off Under Network Interference

Authors: Zhiheng Zhang, Zichen Wang

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The simulation results are provided in the Appendix E to validate its effectiveness. ... E Experiments Setup. We consider a network consisting of 101 units. ... Results. The simulation results are shown in Fig. 3(a) and Fig. 3(b). As seen in Fig. 3(a), both the Standard method and UCB-TSN achieve the lowest cumulative regret, while Uniform exhibits the highest cumulative regret. Fig. 3(b) presents a box plot of the maximum ATE estimation error, eʹ(T, ^ ), where the green line represents the median.
Researcher Affiliation	Academia	Zhiheng Zhang School of Statistics and Data Science, Shanghai University of Finance and Economics, Shanghai 200433, P.R. China Institute of Data Science and Statistics, Shanghai University of Finance and Economics, Shanghai 200433, P.R. China Zichen Wang Department of ECE and CSL UIUC
Pseudocode	Yes	The pseudo code is provided in the appendix due to the space limitation. ... J Algorithm UCB-Two Stage-Network ... Algorithm 1 UCB-Two Stage-Network (UCB-TSN) ... Algorithm 2 Sampling
Open Source Code	Yes	Our code is available at: https://github.com/ZHzhang01/NeurIPS2025-Online-ABtest.
Open Datasets	No	The paper describes a simulation environment (
Dataset Splits	No	The paper describes a simulation setup, not a dataset with predefined splits. It specifies 'a network consisting of 101 units' and 'Each algorithm is executed 1000 times', which refers to simulation runs rather than dataset splits.
Hardware Specification	No	The paper does not mention any specific hardware used for running the simulations.
Software Dependencies	No	The paper does not list any specific software dependencies or their versions.
Experiment Setup	Yes	Setup. We consider a network consisting of 101 units. Specifically, there is a central cluster C1 = {1} that contains a single unit, which is connected to every unit in the five peripheral clusters C2, ..., C6 (namely, C2 = {2, ..., 21}, C3 = {22, ..., 41}, C4 = {42, ..., 61}, C5 = {62, ..., 81}, and C6 = {82, ..., 101}, with each outer cluster containing 20 units, as shown in Fig. 2). We set the action set as K = {0, 1}. ... We evaluate the performance of UCB-TSN (T1 = √(|UE| T)) against two baseline methods: Standard (i.e., UCB-TSN with T1 = 0) and Uniform (i.e., UCB-TSN with T1 = T). Each algorithm is executed 1000 times, and we report the averaged results.
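
The network topology quoted in the Experiment Setup row can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' code: the function names `build_clusters` and `build_adjacency` are hypothetical, and the sketch assumes the only edges are those between the central unit and each peripheral unit, since the quoted text does not say whether units within a peripheral cluster are connected to one another.

```python
# Hypothetical reconstruction of the 101-unit network from the quoted setup.
# Assumption: edges exist only between the central unit and peripheral units.

def build_clusters():
    """Return the six clusters as a dict: cluster index -> list of 1-based unit ids."""
    clusters = {1: [1]}  # central cluster C1 = {1}
    for c in range(2, 7):  # peripheral clusters C2..C6, 20 units each
        start = 2 + 20 * (c - 2)
        clusters[c] = list(range(start, start + 20))
    return clusters


def build_adjacency(clusters):
    """Return an adjacency dict: unit id -> set of neighbouring unit ids."""
    adjacency = {u: set() for units in clusters.values() for u in units}
    hub = clusters[1][0]
    for c, units in clusters.items():
        if c == 1:
            continue
        for u in units:
            # The central unit is connected to every peripheral unit.
            adjacency[hub].add(u)
            adjacency[u].add(hub)
    return adjacency


clusters = build_clusters()
adjacency = build_adjacency(clusters)
ACTIONS = (0, 1)  # action set K = {0, 1}
```

Under these assumptions the central unit has 100 neighbours and every peripheral unit has exactly one. If the actual simulation also links units inside each peripheral cluster, `build_adjacency` would need the corresponding extra edges.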