Meta Learning for Causal Direction
Authors: Jean-François Ton, Dino Sejdinovic, Kenji Fukumizu
AAAI 2021, pp. 9897-9905 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate our method on various synthetic as well as real-world data and show that it is able to maintain high accuracy in detecting directions across varying dataset sizes. ... Experiments ... Synthetic Datasets ... Tuebingen Cause-Effect dataset ... We measure the performance of distinguishing the causal direction using the Area Under the Precision Recall Curve (AUPRC). (A minimal AUPRC sketch follows this table.) |
| Researcher Affiliation | Academia | Jean-François Ton (1), Dino Sejdinovic (1), Kenji Fukumizu (2). (1) University of Oxford, (2) The Institute of Statistical Mathematics. ton@stats.ox.ac.uk, dino.sejdinovic@stats.ox.ac.uk, fukumizu@ism.ac.jp |
| Pseudocode | Yes | See Figure 1 and Algorithm 1 (Appendix) for a detailed breakdown of meta-CGNN. |
| Open Source Code | No | The paper mentions, "We use the implementation by (Kalainathan and Goudet 2020) which provides a GitHub repository toolbox for the above mentioned methods." This refers to the code for *other* methods they are comparing against, not the open-source code for their *own* proposed meta-CGNN method. |
| Open Datasets | Yes | For the synthetic experiments we use three different types of datasets taken from Goudet et al. (2017)... As a real-world example, we use the popular Tuebingen benchmark (Mooij et al. 2016)... |
| Dataset Splits | Yes | For meta-CGNN, we use 100 datasets for training and the remaining 200 for testing. ... We employ 5-fold cross-validation for training and testing... we solely cross-validated over the number of decoder nodes [5, 40] (Goudet et al. 2017) by leaving a few datasets aside at training time for validation. |
| Hardware Specification | No | The paper mentions that CGNN requires "high-performance computing environments" and uses "GPU" (Table 1), but it does not provide any specific details about the hardware used for their experiments, such as specific GPU models, CPU types, or memory amounts. |
| Software Dependencies | No | The paper mentions using a "Gaussian kernel", an "Adam optimizer (Kingma and Ba 2014)", and refers to an "implementation by (Kalainathan and Goudet 2020) which provides a GitHub repository toolbox". However, it does not provide specific version numbers for any of these software components, which is necessary for reproducibility. |
| Experiment Setup | Yes | Regarding architectures, we use 2 hidden layers with ReLU activation function for FiLM, amortization and encoder networks. For the decoder we use 1 hidden layer with ReLU. ... we solely cross-validated over the number of decoder nodes [5, 40] ... For our loss function, we use a sum over Gaussian kernel MMDs with bandwidth η ∈ {0.005, 0.05, 0.25, 0.5, 1, 5, 50} together with an Adam optimizer (Kingma and Ba 2014) and a fixed learning rate of 0.01. For the mini-batch size we fix q to 10. (A minimal sketch of this loss follows the table.) |
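
The loss in the "Experiment Setup" row is described only in words. Below is a minimal PyTorch sketch of a Gaussian-kernel MMD loss summed over the quoted bandwidths; the kernel parameterization exp(-d²/(2η²)), the biased estimator, and the helper names `gaussian_kernel` and `multi_bandwidth_mmd` are assumptions for illustration, not the authors' implementation.

```python
import torch

# η values quoted in the "Experiment Setup" row.
BANDWIDTHS = [0.005, 0.05, 0.25, 0.5, 1.0, 5.0, 50.0]

def gaussian_kernel(a: torch.Tensor, b: torch.Tensor, eta: float) -> torch.Tensor:
    """Gaussian kernel matrix between rows of a and b with bandwidth eta."""
    d2 = torch.cdist(a, b).pow(2)  # pairwise squared Euclidean distances
    return torch.exp(-d2 / (2.0 * eta ** 2))

def multi_bandwidth_mmd(generated: torch.Tensor, observed: torch.Tensor) -> torch.Tensor:
    """Biased MMD^2 estimate between two samples, summed over all bandwidths.

    generated: samples from the fitted causal model, shape (n, d).
    observed:  samples from the dataset, shape (m, d).
    """
    loss = generated.new_zeros(())
    for eta in BANDWIDTHS:
        k_gg = gaussian_kernel(generated, generated, eta).mean()
        k_oo = gaussian_kernel(observed, observed, eta).mean()
        k_go = gaussian_kernel(generated, observed, eta).mean()
        loss = loss + k_gg + k_oo - 2.0 * k_go
    return loss

# Optimizer settings quoted in the table: Adam with a fixed learning rate of 0.01.
# model = ...  # encoder / FiLM / decoder networks, not reconstructed here
# optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
```

Summing MMD terms over a fixed grid of bandwidths is a common way to avoid tuning a single kernel width per dataset.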
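
The "Research Type" row quotes AUPRC as the metric for distinguishing causal directions. A minimal sketch, assuming scikit-learn is available; the labels and scores below are toy values invented purely for illustration:

```python
from sklearn.metrics import average_precision_score

# 1 if the ground-truth direction is X -> Y, 0 if Y -> X (toy labels).
true_direction = [1, 0, 1, 1, 0]

# Higher score = stronger evidence for X -> Y (e.g. a difference of model losses).
direction_scores = [0.8, -0.3, 0.5, 0.1, -0.6]

# Average precision is a standard summary of the area under the precision-recall curve.
auprc = average_precision_score(true_direction, direction_scores)
print(f"AUPRC: {auprc:.3f}")
```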