Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Doubly Robust Counterfactual Classification
Authors: Kwangho Kim, Edward Kennedy, Jose Zubizarreta
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We study the empirical performance of our methods by simulation and apply them for recidivism risk prediction. |
| Researcher Affiliation | Academia | Kwangho Kim, Harvard Medical School; Edward H. Kennedy, Carnegie Mellon University; José R. Zubizarreta, Harvard University |
| Pseudocode | Yes | Algorithm 1: Doubly robust estimator for counterfactual classification |
| Open Source Code | No | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [No] |
| Open Datasets | Yes | Next we apply our method for recidivism risk prediction using the Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) dataset (footnote 2: https://github.com/propublica/compas-analysis). |
| Dataset Splits | Yes | We use sample splitting as described in Algorithm 1 with K = 2 splits. |
| Hardware Specification | No | Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [N/A] |
| Software Dependencies | No | The paper mentions 'SUPERLEARNER R package' and 'NLOPTR R package' as well as the 'StoGo' and 'BOBYQA' algorithms, but it does not specify version numbers for these software components. |
| Experiment Setup | Yes | For nuisance estimation we use the cross-validation-based Super Learner ensemble via the SUPERLEARNER R package to combine generalized additive models, multivariate adaptive regression splines, and random forests. We use sample splitting as described in Algorithm 1 with K = 2 splits... To solve $\widehat{P}$, we first use the StoGo algorithm [40] via the NLOPTR R package as it has shown the best performance in terms of accuracy in the survey study of [35]. After running the StoGo, we then use the global optimum as a starting point for the BOBYQA local optimization algorithm [41] to further polish the optimum to a greater accuracy. We use sample sizes n = 1k, 2.5k, 5k, 7.5k, 10k and repeat the simulation 100 times for each n. |
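
The sample splitting with K = 2 quoted in the table can be illustrated with a minimal sketch. It is not the paper's Algorithm 1, which is not reproduced on this page. It only shows the general cross-fitting pattern: nuisance quantities are fit on one part of the sample and evaluated on the held-out part. The function names `make_folds` and `cross_fit` are hypothetical, and the training-fold mean is a toy stand-in for the Super Learner ensemble.

```python
import random
from statistics import mean


def make_folds(n, k=2, seed=0):
    """Randomly partition indices 0..n-1 into k disjoint, near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_fit(y, k=2, seed=0):
    """For each fold, fit a toy nuisance on the other folds and
    predict it for the held-out fold (assumed stand-in, not Super Learner)."""
    n = len(y)
    preds = [None] * n
    for held_out in make_folds(n, k, seed):
        held = set(held_out)
        # Training data: every observation outside the held-out fold.
        train = [y[i] for i in range(n) if i not in held]
        fitted = mean(train)  # toy nuisance estimate
        for i in held_out:
            preds[i] = fitted  # out-of-fold prediction
    return preds


y = [0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0]
folds = make_folds(len(y), k=2)
preds = cross_fit(y, k=2)
```

Each observation receives a prediction from a model that never saw it, which is the property sample splitting is meant to provide. Any real nuisance learner would replace the mean.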