CASTLE: Regularization via Auxiliary Causal Graph Discovery
Authors: Trent Kyono, Yao Zhang, Mihaela van der Schaar
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide a theoretical generalization bound for our approach and conduct experiments on a plethora of synthetic and real publicly available datasets demonstrating that CASTLE consistently leads to better out-of-sample predictions as compared to other popular benchmark regularizers. |
| Researcher Affiliation | Academia | Trent Kyono, University of California, Los Angeles (tmkyono@ucla.edu); Yao Zhang, University of Cambridge (yz555@cam.ac.uk); Mihaela van der Schaar, University of Cambridge, University of California, Los Angeles, and The Alan Turing Institute (mv472@cam.ac.uk) |
| Pseudocode | Yes | We provide further details on our synthetic DGP and pseudocode in Appendix B. |
| Open Source Code | Yes | Code is provided at https://bitbucket.org/mvdschaar/mlforhealthlabpub. |
| Open Datasets | Yes | We perform regression and classification experiments on a spectrum of publicly available datasets from [50] including Boston Housing (BH), Wine Quality (WQ), Facebook Metrics (FB), Bioconcentration (BC), Student Performance (SP), Community (CM), Contraception Choice (CC), Pima Diabetes (PD), Las Vegas Ratings (LV), Statlog Heart (SH), and Retinopathy (RP). [50] Dheeru Dua and Casey Graff. UCI machine learning repository, 2020. |
| Dataset Splits | Yes | For each dataset, we randomly reserve 20% of the samples for a testing set. We perform 10-fold cross-validation on the remaining 80%. (An illustrative split sketch follows the table.) |
| Hardware Specification | No | The paper does not provide specific details on the hardware used for experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | We implemented CASTLE in Tensorflow2. The paper mentions Tensorflow2 but does not provide version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | Each model is trained using the Adam optimizer with a learning rate of 0.001 for up to a maximum of 200 epochs. An early stopping regime halts training with a patience of 30 epochs. (An illustrative training sketch follows the table.) |
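
The split protocol reported in the "Dataset Splits" row is straightforward to reproduce. The sketch below is a minimal illustration of that protocol using scikit-learn; it is not the authors' released code, and the placeholder array shapes and random seeds are assumptions made only for the example.

```python
# Minimal sketch of the stated split protocol (not the authors' code):
# hold out 20% of samples for testing, then run 10-fold CV on the remaining 80%.
import numpy as np
from sklearn.model_selection import KFold, train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))  # placeholder features (shape is an assumption)
y = rng.normal(size=1000)        # placeholder targets

# Reserve 20% of the samples as the held-out test set.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# 10-fold cross-validation on the remaining 80%.
kfold = KFold(n_splits=10, shuffle=True, random_state=0)
for fold, (tr_idx, val_idx) in enumerate(kfold.split(X_train)):
    X_tr, X_val = X_train[tr_idx], X_train[val_idx]
    y_tr, y_val = y_train[tr_idx], y_train[val_idx]
    # ... fit the model on (X_tr, y_tr) and validate on (X_val, y_val) ...
```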
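Likewise, the optimizer and early-stopping settings in the "Experiment Setup" row map onto standard TensorFlow 2 / Keras calls. The block below is a hedged sketch of that training regime, assuming a Keras workflow; the toy architecture and loss are placeholders, not the CASTLE network with its reconstruction heads and DAG penalty described in the paper.

```python
# Sketch of the stated training regime in TensorFlow 2 (not the authors' code):
# Adam with learning rate 0.001, up to 200 epochs, early stopping with patience 30.
import tensorflow as tf

def build_model(input_dim: int) -> tf.keras.Model:
    # Placeholder architecture for illustration only.
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(input_dim,)),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001), loss="mse")
    return model

early_stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=30)
# model = build_model(X_tr.shape[1])
# model.fit(X_tr, y_tr, validation_data=(X_val, y_val),
#           epochs=200, callbacks=[early_stop])
```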