Interpretable Models for Granger Causality Using Self-explaining Neural Networks
Authors: Ričards Marcinkevičs, Julia E Vogt
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In comprehensive experiments on simulated data, we show that our framework performs on par with several powerful baseline methods at inferring Granger causality and that it achieves better performance at inferring interaction signs. The results suggest that our framework is a viable and more interpretable alternative to sparse-input neural networks for inferring Granger causality. |
| Researcher Affiliation | Academia | Ričards Marcinkevičs, Department of Computer Science, ETH Zürich, Universitätstrasse 6, 8092 Zürich, Switzerland, ricards.marcinkevics@inf.ethz.ch; Julia E. Vogt, Department of Computer Science, ETH Zürich, Universitätstrasse 6, 8092 Zürich, Switzerland, julia.vogt@inf.ethz.ch |
| Pseudocode | Yes | Algorithm 1 summarises the proposed stability-based thresholding procedure. During inference, two separate GVAR models are trained: one on the original time series data, and another on time-reversed data (lines 3-4 in Algorithm 1). Consequently, we estimate strengths of GC relationships with these two models, as in Equation 7, and choose a threshold for matrix S which yields the highest agreement between thresholded GC strengths estimated on original and time-reversed data (lines 5-9 in Algorithm 1). |
| Open Source Code | Yes | The code is available in the GitHub repository: https://github.com/i6092467/GVAR. |
| Open Datasets | Yes | A standard benchmark for the evaluation of GC inference techniques is the Lorenz 96 model (Lorenz, 1995). This continuous time dynamical system in $p$ variables is given by the following nonlinear differential equations: $\frac{dx_i}{dt} = (x_{i+1} - x_{i-2})\,x_{i-1} - x_i + F$, for $1 \le i \le p$, (8) Another dataset we consider consists of rich and realistic simulations of blood-oxygen-level-dependent (BOLD) time series (Smith et al., 2011) that were generated using the dynamic causal modelling functional magnetic resonance imaging (fMRI) forward model. To this end, we consider the Lotka-Volterra model with multiple species (Bacaër (2011) provides a definition of the original two-species system), given by the following differential equations: |
| Dataset Splits | No | The paper does not explicitly describe separate training, validation, and test dataset splits with specific percentages or counts. While hyperparameter tuning implies the use of a validation set, the text mentions tuning on data that is 'held out to perform prediction', which is typically the test set, and does not define a distinct validation split. |
| Hardware Specification | Yes | This experiment was performed on an Intel Core i7-7500U CPU (2.70 GHz × 4) with a GeForce GTX 950M GPU. |
| Software Dependencies | Yes | We use R (R Core Team, 2020) package dbnR (Quesada, 2020) to fit DBNs on all datasets considered in Section 4. We use two structure learning algorithms: the max-min hill-climbing (MMHC) (Tsamardinos et al., 2006) and the particle swarm optimisation (Xing-Chen et al., 2007). Table 9 contains average balanced accuracies achieved by DBNs and GVAR for inferring the GC structure. |
| Experiment Setup | Yes | Relevant hyperparameters of all models are tuned to maximise the BA score or AUPRC (if a model fails to shrink any weights to zeros) by performing a grid search (see Appendix H for details about hyperparameter tuning). All models were trained for 1000 epochs with a mini-batch size of 64. In each dataset, the same numbers of hidden layers and hidden units were used across all models. When applicable, models were restricted to the same order (K). |
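
The stability-based thresholding quoted in the Pseudocode row can be illustrated with a minimal sketch. It is not the authors' Algorithm 1: it assumes the two GC strength matrices (from the original and the time-reversed data) have already been estimated, and it makes three further assumptions that the excerpt does not confirm: candidate thresholds are taken as quantiles of the strengths, agreement is measured with balanced accuracy, and the time-reversed structure is transposed before comparison because reversing time flips the direction of each relationship. All function names are hypothetical.

```python
def quantile(values, q):
    """Crude empirical quantile: the element at position floor(q * n)."""
    ordered = sorted(values)
    index = min(int(q * len(ordered)), len(ordered) - 1)
    return ordered[index]


def threshold(strengths, q):
    """Binarise a strength matrix at its own q-quantile."""
    flat = [v for row in strengths for v in row]
    cutoff = quantile(flat, q)
    return [[1 if v >= cutoff else 0 for v in row] for row in strengths]


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def balanced_accuracy(reference, candidate):
    """Mean of sensitivity and specificity over all matrix entries."""
    tp = tn = fp = fn = 0
    for ref_row, cand_row in zip(reference, candidate):
        for r, c in zip(ref_row, cand_row):
            if r and c:
                tp += 1
            elif r and not c:
                fn += 1
            elif not r and c:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return 0.5 * (sensitivity + specificity)


def select_stable_structure(strengths, strengths_reversed, quantiles):
    """Return (agreement, quantile, structure) for the best-agreeing threshold.

    Assumption: the reversed-data structure is transposed before comparison.
    Ties are resolved in favour of the earliest quantile in `quantiles`.
    """
    best = None
    for q in quantiles:
        original = threshold(strengths, q)
        reversed_structure = transpose(threshold(strengths_reversed, q))
        score = balanced_accuracy(original, reversed_structure)
        if best is None or score > best[0]:
            best = (score, q, original)
    return best
```

Calling `select_stable_structure(S, S_rev, [0.5, 0.75, 0.9])` on two square matrices of non-negative strengths returns the agreement score, the chosen quantile, and the thresholded structure for the original data.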
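
The Lorenz 96 system quoted in the Open Datasets row can be simulated in a few lines. The sketch below is only an illustration of Equation 8 with cyclic indices; the integrator (classical fourth-order Runge-Kutta), the step size, the initial condition, and the default values of `p` and `forcing` are assumptions chosen for the example and are not taken from the paper. The helper `lorenz96_structure` (a hypothetical name) reads the dependency structure directly off the equation: the derivative of $x_i$ involves $x_{i-2}$, $x_{i-1}$, $x_i$, and $x_{i+1}$.

```python
import random


def lorenz96_rhs(x, forcing):
    """Right-hand side of Equation 8 with indices taken modulo p."""
    p = len(x)
    return [
        (x[(i + 1) % p] - x[(i - 2) % p]) * x[(i - 1) % p] - x[i] + forcing
        for i in range(p)
    ]


def rk4_step(x, forcing, dt):
    """One classical Runge-Kutta step (an assumed choice of integrator)."""
    def shifted(base, slope, scale):
        return [b + scale * s for b, s in zip(base, slope)]

    k1 = lorenz96_rhs(x, forcing)
    k2 = lorenz96_rhs(shifted(x, k1, dt / 2), forcing)
    k3 = lorenz96_rhs(shifted(x, k2, dt / 2), forcing)
    k4 = lorenz96_rhs(shifted(x, k3, dt), forcing)
    return [
        xi + dt / 6 * (a + 2 * b + 2 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    ]


def simulate_lorenz96(p=20, forcing=10.0, steps=500, dt=0.01, seed=0):
    """Return a list of steps + 1 states, each a list of p floats."""
    rng = random.Random(seed)
    # Start near the fixed point x_i = F with a small random perturbation.
    x = [forcing + rng.gauss(0.0, 0.01) for _ in range(p)]
    trajectory = [x]
    for _ in range(steps):
        x = rk4_step(x, forcing, dt)
        trajectory.append(x)
    return trajectory


def lorenz96_structure(p):
    """adjacency[i][j] == 1 if x_j appears in the equation for x_i."""
    adjacency = [[0] * p for _ in range(p)]
    for i in range(p):
        for offset in (-2, -1, 0, 1):
            adjacency[i][(i + offset) % p] = 1
    return adjacency
```

For `p >= 4`, every row of `lorenz96_structure(p)` contains exactly four ones, which gives a reference structure against which inferred GC relationships could be compared.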