NESTER: An Adaptive Neurosymbolic Method for Causal Effect Estimation

Authors: Abbavaram Gowtham Reddy, Vineeth N Balasubramanian

AAAI 2024

Reproducibility assessment: each variable, its result, and the supporting excerpt (LLM response):
Research Type: Experimental
  "We perform comprehensive empirical studies on multiple benchmark datasets where NESTER outperforms existing state-of-the-art models. The results in Table 2 show the superior performance of NESTER over existing methods."
Researcher Affiliation: Academia
  "Abbavaram Gowtham Reddy, Vineeth N Balasubramanian; Indian Institute of Technology Hyderabad, India; cs19resch11002@iith.ac.in, vineethnb@iith.ac.in"
Pseudocode: Yes
  "We outline our overall algorithm in Algorithm 1 of Appendix B."
Open Source Code: Yes
  "Our code and instructions to reproduce the results are included in the supplementary material and will be made publicly available."
Open Datasets: Yes
  "Thus, following (Shalit, Johansson, and Sontag 2017; Yoon, Jordon, and van der Schaar 2018; Shi, Blei, and Veitch 2019; Farajtabar et al. 2020), we experiment on two semi-synthetic datasets, Twins (Almond, Chay, and Lee 2005) and IHDP (Hill 2011), that are derived from real-world RCTs (see Appendix C for details). We also experiment on one real-world dataset, Jobs (LaLonde 1986)."
Dataset Splits: Yes
  "For both datasets (IHDP, Twins), following prior works (Shalit, Johansson, and Sontag 2017; Shi, Blei, and Veitch 2019; Yoon, Jordon, and van der Schaar 2018), we use the standard train-test split for 100 random instances of the data. The number of samples is 747 for IHDP and 118,400 for Twins; in both cases, the train-test split is 80-20. For the Jobs dataset, following (Shalit, Johansson, and Sontag 2017), we consider 10 random train-test splits of the 445 samples with a split ratio of 80-20."
Hardware Specification: No
  The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies: No
  The paper does not provide specific ancillary software details with version numbers (e.g., Python, PyTorch, TensorFlow) needed to replicate the experiments.
Experiment Setup: Yes
  "To permit efficient learning (and, to some degree, interpretability of the learned program, as discussed in Appendix G), we limit the program depth to at most 5 for the main experiments. For both NESTER-NEAR and NESTER-dPads, we use a constant learning rate of 1e-3, and we set the temperature parameter β = 1.0 (for the smooth approximation of the if-then-else primitive) during training. We set the maximum number of epochs to 250, and we use a mini-batch size of 64."
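The evaluation protocol quoted under Dataset Splits (100 random instances of the data, each with an 80-20 train-test split) can be sketched as follows. This is a minimal illustration, not the authors' code; the function name and the use of NumPy are assumptions.

```python
import numpy as np

def random_split(n_samples, train_frac=0.8, seed=0):
    """One random 80-20 train/test split over sample indices (illustrative)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_train = int(train_frac * n_samples)
    return idx[:n_train], idx[n_train:]

# 100 random instances for IHDP (747 samples), as the quoted protocol describes.
splits = [random_split(747, seed=s) for s in range(100)]
train_idx, test_idx = splits[0]
print(len(train_idx), len(test_idx))  # 597 150
```

Averaging metrics over the 100 (or, for Jobs, 10) random splits is what makes the reported numbers comparable to the cited prior works.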
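The Experiment Setup row mentions a temperature parameter β used to smoothly approximate the if-then-else primitive. A common sigmoid-gated form is sketched below together with the quoted hyperparameters; the exact gate is an assumption here, since the excerpt does not spell it out.

```python
import numpy as np

# Hyperparameters quoted in the setup above.
LEARNING_RATE = 1e-3
BETA = 1.0            # temperature of the soft if-then-else gate
MAX_EPOCHS = 250
BATCH_SIZE = 64
MAX_PROGRAM_DEPTH = 5

def soft_if_then_else(cond, then_val, else_val, beta=BETA):
    """Smooth approximation of `then_val if cond > 0 else else_val`.
    Smaller beta gives a sharper, more discrete gate (gate form assumed)."""
    gate = 1.0 / (1.0 + np.exp(-cond / beta))
    return gate * then_val + (1.0 - gate) * else_val

print(soft_if_then_else(8.0, 1.0, 0.0, beta=0.5))   # close to 1.0
print(soft_if_then_else(-8.0, 1.0, 0.0, beta=0.5))  # close to 0.0
```

Because the gate is differentiable in `cond`, gradients can flow through branch decisions during training, which is what makes the program parameters learnable end to end.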