Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Data-Adaptive Exposure Thresholds under Network Interference
Authors: Vydhourie Thiyageswaran, Tyler H. McCormick, Jennifer Brennan
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present simulations illustrating that our method improves upon non-adaptive threshold choices, and an adapted Lepski's method. We further illustrate the performance of our estimator by running experiments with synthetic outcomes on a real village network dataset, and on a publicly-available Amazon product similarity graph. Furthermore, we demonstrate that our method remains robust to deviations from the linear potential outcomes model. |
| Researcher Affiliation | Collaboration | Vydhourie Thiyageswaran, Department of Statistics, University of Washington, Seattle, WA, USA, EMAIL; Tyler H. McCormick, Department of Statistics, University of Washington, Seattle, WA, USA, EMAIL; Jennifer Brennan, Google Research, Kirkland, WA, USA, EMAIL |
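| Pseudocode | Yes | Algorithm 1 AdaThresh. Require: graph adjacency matrix $W$, outcome vector $Y$, treatment vector $z$. 1: Compute exposure: $e \leftarrow D^{-1} W z$. 2: Fit linear model: $Y = \beta z + \gamma e + c$. 3: Let $\hat{\gamma}$ be the estimated coefficient for $e$. 4: for each threshold $h \in H$ do 5: Estimate bias: $\widehat{\mathrm{Bias}}(h)$ using $\hat{\gamma}$, $Y$, $z$, $W$, and $h$ (see Eq. (6)). 6: Estimate variance: $\widehat{\mathrm{Var}}(h)$ using $Y$, $z$, $W$, and $h$ (see Appendix A.1). 7: Compute $\widehat{\mathrm{MSE}}(h) \leftarrow \widehat{\mathrm{Bias}}^2(h) + \widehat{\mathrm{Var}}(h)$. 8: end for 9: $\hat{h} \leftarrow \arg\min_{h \in H} \widehat{\mathrm{MSE}}(h)$. 10: return $\hat{\tau}_{\hat{h}}$ (see (4)). |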
| Open Source Code | Yes | The code is available at: https://github.com/Vydhourie/AdaThresh.git |
| Open Datasets | Yes | We evaluate the performance of our estimator on village (No.6) network data from [Banerjee et al., 2013]... For larger n and smaller dmax, performance improves further, supporting our theoretical findings, as demonstrated in Appendix A.7 on the Amazon (DVD) products similarity network [Leskovec et al., 2007] (see Figure 6), and on various circulant graphs. ... The graph data is available at: https://snap.stanford.edu/data/amazon-meta.html |
| Dataset Splits | No | The paper does not explicitly describe training/test/validation dataset splits. It describes experimental setups with synthetic outcomes, randomizations (unit-level, cluster-level Bernoulli), and Monte-Carlo trials for exposure probabilities, but not a division of a single dataset into distinct training, validation, and test subsets. |
| Hardware Specification | Yes | All synthetic graph simulations were run on a machine of Intel Xeon processors with 48 CPU cores, and 50GB of RAM. |
| Software Dependencies | No | The paper mentions that code is available and describes the experimental setup but does not specify software dependencies with version numbers (e.g., Python, specific libraries, or solvers with versions). |
| Experiment Setup | Yes | We ran experiments with synthetic potential outcomes, averaging over 1000 trials, under unit-level and cluster-level Bernoulli randomizations... We generate simulated data using the linear model with $\psi(z_i, e_i) = g(z_i) + f(e_i)$, $\alpha_i = 10$, $g(z_i) = \beta z_i = 10 z_i$, $f(e_i) = \gamma e_i$, with fixed $\epsilon_i$ generated from $N(0, 1)$. To compute the exposure probabilities, we used $2 \times 10^4$ Monte-Carlo trials. We focus on varying the ratio $\gamma/\beta$ as we consider a fixed graph. |
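
The pseudocode and experiment setup quoted in the table can be illustrated with a minimal sketch. It is not the authors' implementation (that is in the linked repository). It covers only the steps the quoted text specifies: the exposure computation, the linear fit, and the threshold-selection loop. The bias and variance estimators (Eq. (6) and Appendix A.1 of the paper) are not reproduced in this summary, so `bias_fn` and `var_fn` are hypothetical placeholders the caller must supply. The function names, the ring graph, the treatment probability of 0.5, and the value `gamma = 5.0` are likewise assumptions made for illustration.

```python
import numpy as np


def exposure(W, z):
    """Step 1: e = D^{-1} W z, the fraction of treated neighbors of each unit."""
    deg = W.sum(axis=1)
    # Assumption: units with no neighbors are given exposure 0.
    return np.divide(W @ z, deg, out=np.zeros(len(z), dtype=float), where=deg > 0)


def fit_linear_model(Y, z, e):
    """Steps 2-3: ordinary least squares fit of Y = beta*z + gamma*e + c.

    Returns (beta_hat, gamma_hat, c_hat).
    """
    X = np.column_stack([z, e, np.ones_like(e)])
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return coef


def select_threshold(thresholds, bias_fn, var_fn):
    """Steps 4-9: return the threshold h minimizing bias_fn(h)**2 + var_fn(h).

    bias_fn and var_fn are hypothetical stand-ins for the paper's estimators.
    """
    mse = [bias_fn(h) ** 2 + var_fn(h) for h in thresholds]
    return thresholds[int(np.argmin(mse))]


# Synthetic data following the quoted linear model (alpha_i = 10, beta = 10).
rng = np.random.default_rng(0)
n = 400

# Assumption: a circulant ring graph, each unit linked to 2 neighbors per side.
W = np.zeros((n, n))
idx = np.arange(n)
for k in (1, 2):
    W[idx, (idx + k) % n] = 1.0
    W[idx, (idx - k) % n] = 1.0

# Assumption: unit-level Bernoulli randomization with probability 0.5.
z = rng.binomial(1, 0.5, size=n).astype(float)
e = exposure(W, z)

beta, gamma = 10.0, 5.0  # gamma is an assumed value; the paper varies gamma/beta
Y = 10.0 + beta * z + gamma * e + rng.normal(0.0, 1.0, size=n)

beta_hat, gamma_hat, c_hat = fit_linear_model(Y, z, e)
```

With real estimators plugged in, `select_threshold(H, bias_fn, var_fn)` would return the selected threshold from a candidate list `H`; the final estimate at that threshold (the paper's Eq. (4)) is outside the scope of this sketch.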