Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Cluster-Adaptive Network A/B Testing: From Randomization to Estimation

Authors: Yang Liu, Yifan Zhou, Ping Li, Feifang Hu

JMLR 2024

Reproducibility variables (each entry gives the variable, the classified result, and the LLM's supporting response):
Research Type: Experimental
LLM response: "Extensive numerical studies are conducted to demonstrate the finite sample property of the proposed network A/B testing procedure." "Numerical studies with a hypothetical network and a real data example are conducted in Section 5 and Section 6 to demonstrate the finite sample properties of our proposed procedure."
Researcher Affiliation: Collaboration
LLM response: "Yang Liu (EMAIL), Institute of Statistics and Big Data, Renmin University of China, Beijing, 100872, China; Yifan Zhou (EMAIL), Department of Statistics, George Washington University, Washington, DC 22202, USA; Ping Li (EMAIL), Vec ML Inc., Bellevue, WA 98004, USA; Feifang Hu (EMAIL), Department of Statistics, George Washington University, Washington, DC 22202, USA"
Pseudocode: Yes
LLM response: "Algorithm 1 Cluster-Adaptive Randomization (CLAR)
1: Input: baseline covariates {ξ_j}_{j=1}^m; probability of the biased coin 1/2 < ρ < 1;
2: Compute S_m based on {ξ_j}_{j=1}^m;
3: Assign Z_1 ∼ Bernoulli(1/2) and set Z_2 = 1 - Z_1;
4: for j = 2 to m/2 do ..."
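The quoted Algorithm 1 is truncated, but its shape (mirrored assignment within pairs of clusters plus a biased coin favoring the imbalance-reducing arm) can be sketched as runnable code. The imbalance measure below, a treated-minus-control covariate-sum norm, is an illustrative stand-in for the paper's S_m, whose exact definition is not quoted here; the pairing scheme and function name are likewise assumptions.

```python
import numpy as np

def clar(xi, rho=0.8, rng=None):
    """Sketch of a biased-coin, pairwise cluster randomization (after Algorithm 1).

    xi  : (m, p) array of cluster-level baseline covariates, m even
    rho : biased-coin probability, 1/2 < rho < 1

    Returns a 0/1 treatment vector over the m clusters; mirrored pairs
    guarantee exactly m/2 treated clusters.
    """
    rng = np.random.default_rng() if rng is None else rng
    m = xi.shape[0]
    assert m % 2 == 0 and 0.5 < rho < 1
    z = np.empty(m, dtype=int)
    # First pair: one fair coin flip, mirrored within the pair.
    z[0] = rng.binomial(1, 0.5)
    z[1] = 1 - z[0]
    for j in range(2, m, 2):
        # Current treated-minus-control covariate imbalance (stand-in for S_m).
        signs = 2 * z[:j] - 1
        d = signs @ xi[:j]
        # Imbalance if cluster j is treated (j+1 control), and vice versa.
        imb_t = np.linalg.norm(d + xi[j] - xi[j + 1])
        imb_c = np.linalg.norm(d - xi[j] + xi[j + 1])
        better = 1 if imb_t < imb_c else 0
        # Biased coin: pick the imbalance-reducing assignment with prob. rho.
        z[j] = better if rng.random() < rho else 1 - better
        z[j + 1] = 1 - z[j]
    return z
```

Because every pair is mirrored, the design is always exactly balanced in the number of treated clusters, regardless of the coin flips.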
Open Source Code: No
LLM response: "The paper does not contain any explicit statement about releasing source code for the methodology described, nor does it provide a link to a code repository."
Open Datasets: Yes
LLM response: "This dataset is available at https://www.icpsr.umich.edu/web/civicleads/studies/37070/versions/V2."
Dataset Splits: No
LLM response: "The paper describes generating hypothetical networks and using a real-world social network dataset, but it does not specify any training/validation/test splits for the data. For example, Section 5 states 'We conduct 100 experiments with 100 networks, each containing m clusters, for m ∈ {50, 100, 200}', which describes data generation parameters, not experimental splits."
Hardware Specification: No
LLM response: "The acknowledgments section mentions support from 'Public Computing Cloud, Renmin University of China.' However, this refers to a general computing environment rather than the specific hardware details (e.g., GPU/CPU models, processor types, memory) needed for a hardware specification."
Software Dependencies: No
LLM response: "The paper does not provide specific software dependencies, such as library names with version numbers (e.g., Python 3.8, PyTorch 1.9, CUDA 11.1), that would be necessary to replicate the experiment."
Experiment Setup: Yes
LLM response: "The following model is assumed to generate the outcome:
Y_i = T_i µ1 + (1 - T_i) µ0 + α1 d_i^{-1} Σ_{k ∈ N_i} T_k + α0 d_i^{-1} Σ_{k ∈ N_i} (1 - T_k) + Σ_{j=1}^m X_{j,CL} β_CL I{i ∈ C_j} + X_{i,IN} β_IN + ε_i,
where µ1 = 2 and µ0 = 1 are the direct effects and α1 = 2 and α0 = 1 are the spillover effects. Therefore, the ATE is τ(1, 0) = µ1 - µ0 + α1 - α0 = 2. The cluster-level covariates X_{j,CL} = (X_{j,1,CL}, X_{j,2,CL}) and the individual-level covariates X_{i,IN} = (X_{i,1,IN}, X_{i,2,IN}, X_{i,3,IN}) are generated as follows: X_{j,1,CL}, the scaled cluster size c_j/E[c_j]; X_{j,2,CL}, the density of the j-th cluster; X_{i,1,IN}, the indicator of being an outer node; X_{i,2,IN}, the number of edges connecting with nodes in C_{j′} if i ∈ C_j and j ≠ j′; and X_{i,3,IN}, the number of edges connecting nodes in C_j if i ∈ C_j. The associated effects of the cluster-level and individual-level covariates are β_CL = (1, 0.8) and β_IN = (1, 0.5, 0.5). The random errors ε_i are i.i.d. N(0, 2^2) and are independent of {X_{j,CL}}_{j=1}^m and {X_{i,IN}}_{i=1}^n. We conduct 100 experiments with 100 networks, each containing m clusters, for m ∈ {50, 100, 200}. ... All of the simulation studies are based on 10,000 replications."
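The quoted outcome model can be made concrete with a short simulation sketch. The snippet below generates Y_i for a given adjacency matrix and treatment vector; covariates are passed in rather than constructed as in Section 5, and the function and argument names are illustrative, not the authors' implementation.

```python
import numpy as np

def outcomes(A, T, X_cl, cluster, X_in, beta_cl=(1.0, 0.8),
             beta_in=(1.0, 0.5, 0.5), mu1=2.0, mu0=1.0,
             alpha1=2.0, alpha0=1.0, sigma=2.0, rng=None):
    """Draw Y_i from the quoted outcome model (simplified sketch).

    A       : (n, n) 0/1 adjacency matrix (the model assumes degree d_i > 0)
    T       : (n,) 0/1 individual treatment indicators
    X_cl    : (m, 2) cluster-level covariates
    cluster : (n,) cluster index of each node (selects X_{j,CL} via I{i in C_j})
    X_in    : (n, 3) individual-level covariates
    """
    rng = np.random.default_rng() if rng is None else rng
    d = np.maximum(A.sum(axis=1), 1)            # degrees d_i (guard d_i = 0)
    direct = T * mu1 + (1 - T) * mu0            # T_i mu1 + (1 - T_i) mu0
    spill = alpha1 * (A @ T) / d + alpha0 * (A @ (1 - T)) / d
    cov = X_cl[cluster] @ np.asarray(beta_cl) + X_in @ np.asarray(beta_in)
    eps = rng.normal(0.0, sigma, size=len(T))   # i.i.d. N(0, sigma^2)
    return direct + spill + cov + eps
```

As a sanity check, with sigma = 0 the all-treated and all-control outcomes differ by exactly (µ1 - µ0) + (α1 - α0) = 2 on any graph with no isolated nodes, matching τ(1, 0) = 2.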