Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Using representation balancing to learn conditional-average dose responses from clustered data

Authors: Christopher Bockel-Rickermann, Toon Vanderschueren, Jeroen Berrevoets, Tim Verdonck, Wouter Verbeke

TMLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We run extensive experiments to illustrate the workings of our method and compare it with the state of the art in ML for CADR estimation. On a novel benchmarking dataset, we show the impacts of clustered data on model performance. Additionally, we propose an estimator, CBRNet, that enables the application of representation balancing for CADR estimation through clustering the covariate space and a novel loss function. CBRNet learns cluster-agnostic and hence dose-agnostic covariate representations for unbiased CADR inference. ... Section 5 Experimental Evaluation ... Section 6 Empirical Results
Researcher Affiliation Academia Christopher Bockel-Rickermann EMAIL KU Leuven; Toon Vanderschueren EMAIL KU Leuven University of Antwerp; Jeroen Berrevoets EMAIL University of Cambridge; Tim Verdonck EMAIL University of Antwerp imec KU Leuven; Wouter Verbeke EMAIL KU Leuven
Pseudocode No The paper describes the architecture of CBRNet and its training process in Section 4, including the loss function and components. However, it does not contain a clearly labeled 'Pseudocode' or 'Algorithm' block with structured, step-by-step instructions formatted like code.
Open Source Code Yes Our implementation of CBRNet is available online for practitioners and fellow researchers to build upon (cf. Appendix E). Our implementation uses a basic scikit-learn syntax (Pedregosa et al., 2011), enabling efficient model setup, training, and inference: from src.methods.neural import CBRNet; model = CBRNet(); model.fit(X_train, Y_train, S_train); model.predict(X_test, S_test). The code to reproduce all experiments, results, and figures in the paper can be found online via https://github.com/christopher-br/CBRNet.
Open Datasets Yes We evaluate CBRNet empirically by comparing it to several benchmarking methods on a novel semi-synthetic dataset, the Dry bean-DR data. The following paragraphs will discuss the creation of this data, the benchmarking methods, and the metrics used for evaluation. The dataset is available publicly and has been provided with the code for this manuscript. ... Our data generation starts by taking the covariates of the dry bean dataset (Koklu & Ozkan, 2020). ... We run experiments on previously established benchmarking datasets proposed by Bica et al. (2020) and Nie et al. (2021).
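The semi-synthetic recipe quoted above — real covariates, generated doses and outcomes — can be sketched as follows. Everything here is illustrative: the covariate matrix stands in for the dry bean features, and the Beta-based dose assignment and the outcome surface are assumptions for demonstration, not the paper's actual generation process.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the dry bean covariates: 500 samples, 16 features
# (matching the dimensionality of the Koklu & Ozkan, 2020 data).
X = rng.normal(size=(500, 16))

# Hypothetical dose assignment: a Beta(alpha, beta) draw blended with a
# covariate-dependent term, so doses depend on X (selection bias).
alpha, beta = 2.0, 2.0
mode = 1.0 / (1.0 + np.exp(-(X @ rng.normal(size=16)) * 0.1))  # in (0, 1)
S = np.clip(rng.beta(alpha, beta, size=500) * 0.5 + mode * 0.5, 0.0, 1.0)

# Hypothetical outcome surface: nonlinear in dose, modulated by covariates.
Y = np.sin(np.pi * S) * (1.0 + X[:, 0]) + rng.normal(scale=0.1, size=500)

print(X.shape, S.shape, Y.shape)  # (500, 16) (500,) (500,)
```

Because the outcome function is known, the true dose-response curve is available for every unit, which is what makes semi-synthetic benchmarks usable for evaluating CADR estimators.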
Dataset Splits Yes For a total of 17 different combinations of α and β, we generate 10 random instances of the Dry bean-DR data. We use 70% of the data for training, 10% as a validation set for hyperparameter tuning, and 20% as a test set for calculating performance metrics.
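The 70/10/20 split reported above can be sketched with a random permutation of indices; the sample size and seed here are placeholders, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000
idx = rng.permutation(n)

# 70% train, 10% validation (hyperparameter tuning), 20% test,
# mirroring the proportions reported in the paper.
n_train, n_val = int(0.7 * n), int(0.1 * n)
train_idx = idx[:n_train]
val_idx = idx[n_train:n_train + n_val]
test_idx = idx[n_train + n_val:]

print(len(train_idx), len(val_idx), len(test_idx))  # 700 100 200
```

Repeating this over 10 random instances per (α, β) combination, as the paper does, would amount to drawing a fresh permutation (or dataset instance) per run.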
Hardware Specification Yes Our experiments are written in Python 3.10 (Van Rossum et al., 1995) and were executed on an Apple M2 Pro SoC with 10 CPU cores, 16 GPU cores, and 16 GB of shared memory.
Software Dependencies No The paper mentions 'Python 3.10' with its version. However, for other key software components like 'PyTorch', 'Lightning', 'xgboost', 'Scikit-Learn', and 'statsmodels', it only provides citations to their respective papers (e.g., 'PyTorch (Paszke et al., 2017)') rather than specific version numbers used in the implementation.
Experiment Setup Yes Hyperparameter optimization. Results are not to be compared to the original papers, as the optimization scheme and parameter search ranges differ from the original records. If not specified differently, the remaining hyperparameters are set to match the specifications of the original authors. Table 7: Hyperparameter search range for Linear Regression:... Table 8: Hyperparameter search range for CART:... Table 9: Hyperparameter search range for xgboost:... Table 10: Hyperparameter search range for MLP:... Table 11: Hyperparameter search range for DRNet:... Table 12: Hyperparameter search range for VCNet:... Table 13: Hyperparameter search range for CBRNet:...
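Hyperparameter optimization over tabulated search ranges, as described above, amounts to enumerating the grid and selecting the configuration with the lowest validation error. The ranges and the scoring function below are hypothetical stand-ins; the paper's actual per-model ranges are in its Tables 7–13.

```python
import itertools

# Hypothetical search ranges (the paper's tables list the real ones).
grid = {
    "learning_rate": [1e-3, 1e-2],
    "hidden_units": [32, 64, 128],
    "num_clusters": [2, 5, 10],
}

def val_error(config):
    # Placeholder for training a model and scoring it on the
    # validation split; here a deterministic dummy score.
    return (config["learning_rate"]
            + 1.0 / config["hidden_units"]
            + 0.01 * config["num_clusters"])

# Enumerate every combination in the grid and pick the best by
# validation error.
configs = [dict(zip(grid, values)) for values in itertools.product(*grid.values())]
best = min(configs, key=val_error)
print(best)  # {'learning_rate': 0.001, 'hidden_units': 128, 'num_clusters': 2}
```

Tuning on a held-out validation split rather than the test set, as the paper's 70/10/20 protocol does, keeps the reported test metrics unbiased by the search.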