Cause-Effect Inference in Location-Scale Noise Models: Maximum Likelihood vs. Independence Testing
Authors: Xiangyu Sun, Oliver Schulte
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | However, through an extensive empirical evaluation, we demonstrate that the accuracy deteriorates sharply when the form of the noise distribution is misspecified by the user. Our analysis shows that the failure occurs mainly when the conditional variance in the anti-causal direction is smaller than that in the causal direction. As an alternative, we find that causal model selection through residual independence testing is much more robust to noise misspecification and misleading conditional variance. |
| Researcher Affiliation | Academia | Xiangyu Sun, Simon Fraser University, xiangyu_sun@sfu.ca; Oliver Schulte, Simon Fraser University, oschulte@cs.sfu.ca |
| Pseudocode | Yes | Algorithm 1 CAREFL-H ... Algorithm 2 CAREFL-M ... Algorithm 3 CAREFL-H (Between Residuals) |
| Open Source Code | Yes | The code and scripts to reproduce all the results are given online: https://github.com/xiangyu-sun-789/CAREFL-H |
| Open Datasets | Yes | Experiments with 580 synthetic and 99 real-world datasets are given in Section 7. The code and scripts to reproduce all the results are given online. ... We compare CAREFL-M and CAREFL-H against the SIM benchmark suite [14]. ... The Tübingen Cause-Effect Pairs benchmark [14] is commonly used to evaluate cause-effect inference algorithms [11, 30, 9]. |
| Dataset Splits | Yes | We use both splitting methods: (i) CAREFL(0.8): 80% as training and 20% as testing. (ii) CAREFL(1.0): training = testing = 100%. |
| Hardware Specification | Yes | The running time is measured on a computer running Ubuntu 20.04.5 LTS with Intel Core i7-6850K 3.60GHz CPU and 32 GB memory. No GPUs are used. |
| Software Dependencies | No | The paper mentions 'Adam optimizer [12]' but does not provide specific version numbers for programming languages (e.g., Python), frameworks (e.g., PyTorch, TensorFlow), or other libraries used for implementation. |
| Experiment Setup | Yes | The flow estimator T is parameterized with 4 sub-flows (alternatively: 1, 7 and 10). For each sub-flow, f, g, h and k are modelled as four-layer MLPs with 5 hidden neurons in each layer (alternatively: 2, 10 and 20). Prior distribution is Laplace (alternatively: Gaussian prior). Adam optimizer [12] is used to train each model for 750 epochs (alternatively: 500, 1000 and 2000). L2-penalty strength is 0 by default (alternatively: 0.0001, 0.001, 0.1). |
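
The paper's central empirical finding, quoted in the Research Type row above, is that selecting the causal direction by residual independence testing is more robust than maximum likelihood when the noise distribution or conditional variance is misleading. The following is a minimal illustrative sketch of that independence-testing idea, using a biased HSIC estimator with Gaussian kernels and a k-NN regressor as a stand-in for the location-scale flow; the function names, kernel bandwidth, and use of additive rather than standardized residuals are simplifying assumptions, not the CAREFL-H algorithm itself.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

def hsic_score(a, b, sigma=1.0):
    """Biased HSIC estimate between two 1-D samples (larger = more dependent)."""
    a, b = a.reshape(-1, 1), b.reshape(-1, 1)
    K = np.exp(-(a - a.T) ** 2 / (2 * sigma ** 2))   # Gaussian kernel matrix for a
    L = np.exp(-(b - b.T) ** 2 / (2 * sigma ** 2))   # Gaussian kernel matrix for b
    n = len(a)
    H = np.eye(n) - np.ones((n, n)) / n              # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

def infer_direction(x, y):
    """Pick the direction whose regression residual is more independent of its input."""
    reg_xy = KNeighborsRegressor(n_neighbors=10).fit(x.reshape(-1, 1), y)
    reg_yx = KNeighborsRegressor(n_neighbors=10).fit(y.reshape(-1, 1), x)
    r_xy = y - reg_xy.predict(x.reshape(-1, 1))      # residual when assuming x -> y
    r_yx = x - reg_yx.predict(y.reshape(-1, 1))      # residual when assuming y -> x
    return "x->y" if hsic_score(x, r_xy) < hsic_score(y, r_yx) else "y->x"

# Toy usage: y generated from x with location-scale (heteroscedastic) noise.
rng = np.random.default_rng(0)
x = rng.laplace(size=500)
y = x ** 2 + (0.5 + 0.5 * np.abs(x)) * rng.laplace(size=500)
print(infer_direction(x, y))  # typically "x->y" on this toy data
```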
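For contrast with the Pseudocode row (Algorithm 2, CAREFL-M, selects the direction with the higher likelihood), here is a hedged sketch of likelihood-based direction selection under a simple homoscedastic Gaussian conditional on held-out data; the Gaussian fits and helper names are assumptions standing in for the affine flow's density, not the released implementation.

```python
import numpy as np
from scipy.stats import norm
from sklearn.neighbors import KNeighborsRegressor

def joint_loglik(cause_tr, effect_tr, cause_te, effect_te):
    """Held-out log p(cause) + log p(effect | cause) under simple Gaussian fits."""
    # Marginal of the putative cause: Gaussian with the training mean and std.
    marg = norm.logpdf(cause_te, loc=cause_tr.mean(), scale=cause_tr.std() + 1e-8).sum()
    # Conditional of the effect: k-NN regression mean with a constant residual std.
    reg = KNeighborsRegressor(n_neighbors=10).fit(cause_tr.reshape(-1, 1), effect_tr)
    sigma = np.std(effect_tr - reg.predict(cause_tr.reshape(-1, 1))) + 1e-8
    cond = norm.logpdf(effect_te, loc=reg.predict(cause_te.reshape(-1, 1)), scale=sigma).sum()
    return marg + cond

def infer_direction_ml(x_tr, y_tr, x_te, y_te):
    """Select the direction with the higher held-out joint log-likelihood."""
    forward = joint_loglik(x_tr, y_tr, x_te, y_te)
    backward = joint_loglik(y_tr, x_tr, y_te, x_te)
    return "x->y" if forward >= backward else "y->x"
```

As the paper argues, this kind of likelihood comparison can be misled when the assumed noise form is wrong or the anti-causal conditional variance is smaller than the causal one, which is exactly where the independence-test selection above is more robust.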
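The two splitting schemes in the Dataset Splits row can be reproduced in a few lines; the function name and seed handling below are illustrative, not taken from the released code.

```python
import numpy as np

def make_split(data, scheme="CAREFL(0.8)", seed=0):
    """CAREFL(0.8): 80% train / 20% test.  CAREFL(1.0): train = test = 100% of the data."""
    if scheme == "CAREFL(1.0)":
        return data, data
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(data))
    cut = int(0.8 * len(data))
    return data[idx[:cut]], data[idx[cut:]]
```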
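The Experiment Setup row maps naturally onto a hyperparameter dictionary plus a standard Adam optimizer call. The dictionary keys and MLP construction below are assumptions for illustration, not the configuration interface of the released CAREFL-H code; in PyTorch, the quoted L2-penalty strength corresponds to Adam's `weight_decay` argument.

```python
import torch

# Illustrative hyperparameter grid mirroring the quoted setup.
config = {
    "n_sub_flows": 4,      # alternatives tried: 1, 7, 10
    "mlp_layers": 4,       # f, g, h, k are four-layer MLPs
    "hidden_units": 5,     # alternatives tried: 2, 10, 20
    "prior": "laplace",    # alternative: Gaussian prior
    "epochs": 750,         # alternatives tried: 500, 1000, 2000
    "weight_decay": 0.0,   # L2-penalty strength; alternatives: 1e-4, 1e-3, 0.1
}

# A four-layer MLP with 5 hidden neurons per layer, as one of f, g, h, k might be built.
h = config["hidden_units"]
mlp = torch.nn.Sequential(
    torch.nn.Linear(1, h), torch.nn.ReLU(),
    torch.nn.Linear(h, h), torch.nn.ReLU(),
    torch.nn.Linear(h, h), torch.nn.ReLU(),
    torch.nn.Linear(h, 1),
)

# The L2 penalty is passed to Adam as weight_decay.
optimizer = torch.optim.Adam(mlp.parameters(), lr=1e-3, weight_decay=config["weight_decay"])
```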