Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Using representation balancing to learn conditional-average dose responses from clustered data
Authors: Christopher Bockel-Rickermann, Toon Vanderschueren, Jeroen Berrevoets, Tim Verdonck, Wouter Verbeke
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We run extensive experiments to illustrate the workings of our method and compare it with the state of the art in ML for CADR estimation. On a novel benchmarking dataset, we show the impacts of clustered data on model performance. Additionally, we propose an estimator, CBRNet, that enables the application of representation balancing for CADR estimation through clustering the covariate space and a novel loss function. CBRNet learns cluster-agnostic and hence dose-agnostic covariate representations for unbiased CADR inference. ... Section 5 Experimental Evaluation ... Section 6 Empirical Results |
| Researcher Affiliation | Academia | Christopher Bockel-Rickermann (KU Leuven); Toon Vanderschueren (KU Leuven, University of Antwerp); Jeroen Berrevoets (University of Cambridge); Tim Verdonck (University of Antwerp, imec, KU Leuven); Wouter Verbeke (KU Leuven) |
| Pseudocode | No | The paper describes the architecture of CBRNet and its training process in Section 4, including the loss function and components. However, it does not contain a clearly labeled 'Pseudocode' or 'Algorithm' block with structured, step-by-step instructions formatted like code. |
| Open Source Code | Yes | Our implementation of CBRNet is available online for practitioners and fellow researchers to build upon (cf. Appendix E). Our implementation uses a basic scikit-learn syntax (Pedregosa et al., 2011), enabling efficient model setup, training and inference: `from src.methods.neural import CBRNet; model = CBRNet(); model.fit(X_train, Y_train, S_train); model.predict(X_test, S_test)` The code to reproduce all experiments, results, and figures in the paper can be found online via https://github.com/christopher-br/CBRNet. |
| Open Datasets | Yes | We evaluate CBRNet empirically by comparing it to several benchmarking methods on a novel semi-synthetic dataset, the Dry bean-DR data. The following paragraphs will discuss the creation of this data, the benchmarking methods, and the metrics used for evaluation. The dataset is available publicly and has been provided with the code for this manuscript. ... Our data generation starts by taking the covariates of the dry bean dataset (Koklu & Ozkan, 2020). ... We run experiments on previously established benchmarking datasets proposed by Bica et al. (2020) and Nie et al. (2021). |
| Dataset Splits | Yes | For a total of 17 different combinations of α and β, we generate 10 random instances of the Dry bean-DR data. We use 70% of the data for training, 10% as a validation set for hyperparameter tuning, and 20% as a test set for calculating performance metrics. |
| Hardware Specification | Yes | Our experiments are written in Python 3.10 (Van Rossum et al., 1995) and were executed on an Apple M2 Pro SoC with 10 CPU cores, 16 GPU cores, and 16 GB of shared memory. |
| Software Dependencies | No | The paper mentions 'Python 3.10' with its version. However, for other key software components like 'PyTorch', 'Lightning', 'xgboost', 'Scikit-Learn', and 'statsmodels', it only provides citations to their respective papers (e.g., 'PyTorch (Paszke et al., 2017)') rather than specific version numbers used in the implementation. |
| Experiment Setup | Yes | Hyperparameter optimization. Results are not to be compared to the original papers, as the optimization scheme and parameter search ranges differ from the original records. If not specified differently, the remaining hyperparameters are set to match the specifications of the original authors. Table 7: Hyperparameter search range for Linear Regression:... Table 8: Hyperparameter search range for CART:... Table 9: Hyperparameter search range for xgboost:... Table 10: Hyperparameter search range for MLP:... Table 11: Hyperparameter search range for DRNet:... Table 12: Hyperparameter search range for VCNet:... Table 13: Hyperparameter search range for CBRNet:... |
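The 70%/10%/20% train/validation/test split reported under Dataset Splits can be reproduced generically; the sketch below is a hypothetical illustration (random arrays stand in for the Dry bean-DR covariates `X`, doses `S`, and outcomes `Y`), not the authors' code.

```python
# Hypothetical sketch of a 70/10/20 train/validation/test split,
# as described in the report; random data stand in for Dry bean-DR.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
X = rng.normal(size=(n, 16))   # stand-in covariates
S = rng.uniform(size=n)        # stand-in continuous doses
Y = rng.normal(size=n)         # stand-in outcomes

# Shuffle indices, then carve out 70% train, 10% validation, 20% test.
idx = rng.permutation(n)
n_train = int(0.7 * n)
n_val = int(0.1 * n)
train, val, test = np.split(idx, [n_train, n_train + n_val])

X_train, X_val, X_test = X[train], X[val], X[test]
print(len(train), len(val), len(test))  # 700 100 200
```

The validation split is the one used for hyperparameter tuning; the test split is held out for the reported performance metrics.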
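The scikit-learn-style interface quoted under Open Source Code (`fit(X, Y, S)` / `predict(X, S)`) can be mimicked with a minimal stand-in estimator. The sketch below is hypothetical and is NOT the authors' CBRNet: it simply regresses outcomes on the covariates concatenated with the dose via ordinary least squares, to show the call signature only.

```python
# Hypothetical stand-in mirroring the CBRNet call signature from the
# repository README; NOT the authors' implementation.
import numpy as np

class DoseResponseStub:
    """Minimal estimator exposing fit(X, Y, S) and predict(X, S)."""

    def fit(self, X, Y, S):
        # Append the dose S as an extra input column, plus an intercept.
        Z = np.column_stack([np.ones(len(X)), X, S])
        self.coef_, *_ = np.linalg.lstsq(Z, Y, rcond=None)
        return self

    def predict(self, X, S):
        Z = np.column_stack([np.ones(len(X)), X, S])
        return Z @ self.coef_

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
S = rng.uniform(size=200)
Y = X[:, 0] + np.sin(3 * S) + 0.1 * rng.normal(size=200)

stub = DoseResponseStub().fit(X, Y, S)
preds = stub.predict(X, S)
print(preds.shape)  # (200,)
```

The real model is imported as `from src.methods.neural import CBRNet` and used with the same two-step `fit`/`predict` pattern.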
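The Experiment Setup row lists per-method hyperparameter search ranges (Tables 7 to 13). A generic grid search over a validation objective can be sketched as follows; the search space and objective below are illustrative placeholders, NOT the ranges from the paper's tables.

```python
# Hypothetical sketch of grid search over validation error. The ranges
# and objective are placeholders, not the paper's Tables 7-13.
import itertools

search_space = {
    "learning_rate": [1e-3, 1e-2],   # placeholder range
    "hidden_units": [16, 32],        # placeholder range
}

def validation_error(params):
    # Stand-in objective; a real run would train a model and score it
    # on the 10% validation split described in the report.
    return (params["learning_rate"] - 1e-2) ** 2 + 1.0 / params["hidden_units"]

configs = [dict(zip(search_space, values))
           for values in itertools.product(*search_space.values())]
best = min(configs, key=validation_error)
print(best)  # {'learning_rate': 0.01, 'hidden_units': 32}
```

As the report notes, results under such a scheme are not directly comparable to the original papers when the optimization scheme and search ranges differ.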