NESTER: An Adaptive Neurosymbolic Method for Causal Effect Estimation

Authors: Abbavaram Gowtham Reddy, Vineeth N Balasubramanian

AAAI 2024

Reproducibility assessment: each variable, its result, and the supporting excerpt (LLM response):
Research Type: Experimental
  "We perform comprehensive empirical studies on multiple benchmark datasets where NESTER outperforms existing state-of-the-art models. The results in Table 2 show the superior performance of NESTER over existing methods."
Researcher Affiliation: Academia
  "Abbavaram Gowtham Reddy, Vineeth N Balasubramanian; Indian Institute of Technology Hyderabad, India; cs19resch11002@iith.ac.in, vineethnb@iith.ac.in"
Pseudocode: Yes
  "We outline our overall algorithm in Algorithm 1 of Appendix B."
Open Source Code: Yes
  "Our code and instructions to reproduce the results are included in the supplementary material and will be made publicly available."
Open Datasets: Yes
  "Thus, following (Shalit, Johansson, and Sontag 2017; Yoon, Jordon, and van der Schaar 2018; Shi, Blei, and Veitch 2019; Farajtabar et al. 2020), we experiment on two semi-synthetic datasets, Twins (Almond, Chay, and Lee 2005) and IHDP (Hill 2011), that are derived from real-world RCTs (see Appendix C for details). We also experiment on one real-world dataset, Jobs (LaLonde 1986)."
Dataset Splits: Yes
  "For both datasets (IHDP, Twins), following prior works (Shalit, Johansson, and Sontag 2017; Shi, Blei, and Veitch 2019; Yoon, Jordon, and van der Schaar 2018), we use the standard train-test split for 100 random instances of the data. The number of samples is 747 for IHDP and 118,400 for Twins; in both cases, the train-test split is 80-20. For the Jobs dataset, following (Shalit, Johansson, and Sontag 2017), we consider 10 random train-test splits of the 445 samples with a split ratio of 80-20."
Hardware Specification: No
  The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies: No
  The paper does not provide specific ancillary software details with version numbers (e.g., Python, PyTorch, TensorFlow) needed to replicate the experiments.
Experiment Setup: Yes
  "To permit efficient learning (and, to some degree, interpretability of the learned program, as discussed in Appendix G), we limit the program depth to at most 5 for the main experiments. For both NESTER-NEAR and NESTER-dPads, we use a constant learning rate of 1e-3, and we set the temperature parameter β = 1.0 (for the smooth approximation of the if-then-else primitive) during training. We set the maximum number of epochs to 250, and we use a mini-batch size of 64."
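The evaluation protocol quoted under Dataset Splits (100 random instances of the data, each with an 80-20 train-test split) can be sketched as follows. This is a minimal illustration, not the authors' code; the function name and the use of NumPy are assumptions.

```python
import numpy as np

def random_split(n_samples, train_frac=0.8, seed=0):
    """One random 80-20 train/test split over sample indices (illustrative)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_train = int(train_frac * n_samples)
    return idx[:n_train], idx[n_train:]

# 100 random instances for IHDP (747 samples), as the quoted protocol describes.
splits = [random_split(747, seed=s) for s in range(100)]
train_idx, test_idx = splits[0]
print(len(train_idx), len(test_idx))  # 597 150
```

Averaging metrics over the 100 (or, for Jobs, 10) random splits is what makes the reported numbers comparable to the cited prior works.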
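The Experiment Setup row mentions a temperature parameter β used to smoothly approximate the if-then-else primitive. A common sigmoid-gated form is sketched below together with the quoted hyperparameters; the exact gate is an assumption here, since the excerpt does not spell it out.

```python
import numpy as np

# Hyperparameters quoted in the setup above.
LEARNING_RATE = 1e-3
BETA = 1.0            # temperature of the soft if-then-else gate
MAX_EPOCHS = 250
BATCH_SIZE = 64
MAX_PROGRAM_DEPTH = 5

def soft_if_then_else(cond, then_val, else_val, beta=BETA):
    """Smooth approximation of `then_val if cond > 0 else else_val`.
    Smaller beta gives a sharper, more discrete gate (gate form assumed)."""
    gate = 1.0 / (1.0 + np.exp(-cond / beta))
    return gate * then_val + (1.0 - gate) * else_val

print(soft_if_then_else(8.0, 1.0, 0.0, beta=0.5))   # close to 1.0
print(soft_if_then_else(-8.0, 1.0, 0.0, beta=0.5))  # close to 0.0
```

Because the gate is differentiable in `cond`, gradients can flow through branch decisions during training, which is what makes the program parameters learnable end to end.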