Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

ABC3: Active Bayesian Causal Inference with Cohn Criteria in Randomized Experiments

Authors: Taehun Cha, Donghun Lee

AAAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Through extensive experiments on real-world data sets, ABC3 achieves the highest efficiency, while empirically showing the theoretical results hold. In this section, we empirically analyze the theoretical results introduced in Section 4. For the comparison, we utilize IHDP (Brooks-Gunn, Liaw, and Klebanov (1992) and Hill (2011)), Boston (Harrison and Rubinfeld 1978), ACIC (Gruber et al. 2019), and Lalonde (LaLonde 1986) data sets.
Researcher Affiliation Academia Taehun Cha and Donghun Lee* Korea University EMAIL
Pseudocode Yes Algorithm 1: ABC3
Input: current time step t, whole covariate set X_Ω, covariate distribution P, previous observations X_t^1 and X_t^0, kernel k, noise parameter σ_ε
Output: x_{t+1}, a_{t+1}
1: V^0, V^1 = ∅, ∅
2: for x ∈ X_Ω \ (X_t^0 ∪ X_t^1) do
3:   compute k_{t+1}^1 and k_{t+1}^0 assuming x_{t+1} = x
4:   v^0, v^1 = Equation (1) for each a ∈ {0, 1}
5:   V^0 = V^0 ∪ {v^0}, V^1 = V^1 ∪ {v^1}
6: end for
7: i, a_{t+1} = arg max(V^0 ∥ V^1)
8: x_{t+1} = (X_Ω \ (X_t^0 ∪ X_t^1))[i]
9: return x_{t+1}, a_{t+1}
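The acquisition loop in Algorithm 1 can be sketched in Python. Here `variance_term` is a hypothetical stand-in for the Equation (1) quantity computed from the kernel vectors; the function name and signature are assumptions for illustration, not the authors' implementation:

```python
import numpy as np

def abc3_step(candidates, X0, X1, variance_term):
    """One acquisition step in the style of Algorithm 1.

    `variance_term(X_a, x)` is a hypothetical callable standing in for
    the Equation-(1) computation on the kernel vectors (an assumption).
    """
    V0, V1 = [], []
    for x in candidates:                      # x in X_Omega \ (X0 U X1)
        V0.append(variance_term(X0, x))       # v0, assuming x_{t+1} = x
        V1.append(variance_term(X1, x))       # v1, assuming x_{t+1} = x
    scores = np.concatenate([V0, V1])         # V0 || V1
    i = int(np.argmax(scores))                # joint arg max over both arms
    a_next = 0 if i < len(candidates) else 1  # which arm produced the max
    x_next = candidates[i % len(candidates)]
    return x_next, a_next
```

The `i % len(candidates)` step recovers the covariate index from the concatenated score vector, mirroring line 8 of the pseudocode.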
Open Source Code Yes Code https://github.com/AIML-K/ActiveBayesianCausal/
Open Datasets Yes For the comparison, we utilize IHDP (Brooks-Gunn, Liaw, and Klebanov (1992) and Hill (2011)), Boston (Harrison and Rubinfeld 1978), ACIC (Gruber et al. 2019), and Lalonde (LaLonde 1986) data sets.
Dataset Splits Yes We randomly divide each data set in half for every trial to construct train and test data sets.
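The per-trial half split described in the quote can be sketched with NumPy; the `half_split` helper and its seeding are assumptions for illustration, not the paper's code:

```python
import numpy as np

def half_split(X, y, seed=0):
    # Randomly divide the data set in half into train/test for one trial;
    # a fresh seed per trial reproduces the "for every trial" resampling.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    half = len(X) // 2
    train, test = idx[:half], idx[half:]
    return (X[train], y[train]), (X[test], y[test])
```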
Hardware Specification No No specific hardware details (like GPU/CPU models or specific machine configurations) are provided in the paper. The paper only mentions that "most policies require less than 1 second to sample the whole data set" which refers to performance, not specific hardware.
Software Dependencies No The paper mentions "We optimize the kernel hyperparameters using scikit-learn package." but does not provide a specific version number for scikit-learn or any other software dependencies.
Experiment Setup Yes We apply feature-wise normalization and y-standardization for all regressors (except Leverage, which requires item-wise normalization). We fit two Gaussian process models with Constant Kernel * Radial Basis Function (RBF) Kernel + White Kernel. We optimize the kernel hyperparameters using the scikit-learn package. All the uncertainty-aware policies (ABC3, MacKay and ACE) use the Gaussian process to quantify the uncertainty. For the uncertainty-quantifying kernels, we use an RBF kernel with length scale 1.0 and σ²_ε = 1. (We check the hyperparameter sensitivity in Section 5.4.)
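The quoted setup maps directly onto scikit-learn's Gaussian process API. A minimal sketch: the kernel composition, RBF length scale 1.0, and noise σ²_ε = 1 follow the quote, while `normalize_y=True` is assumed here as the y-standardization step:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, RBF, WhiteKernel

# Constant * RBF + White, as quoted; length_scale=1.0 and noise_level=1.0
# match the stated hyperparameters. normalize_y is an assumption standing
# in for the paper's y-standardization.
kernel = ConstantKernel() * RBF(length_scale=1.0) + WhiteKernel(noise_level=1.0)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
```

Fitting the regressor optimizes the kernel hyperparameters by maximizing the marginal likelihood, which is what "optimize the kernel hyperparameters using scikit-learn" amounts to in this API.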