Temporal Dependencies in Feature Importance for Time Series Prediction
Authors: Kin Kwan Leung, Clayton Rooke, Jonathan Smith, Saba Zuberi, Maksims Volkovs
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct an extensive empirical study on synthetic and real-world data, compare against a wide range of leading explainability methods, and explore the impact of various evaluation strategies. Our results show that WinIT achieves significant gains over existing methods, with more consistent performance across different evaluation metrics. |
| Researcher Affiliation | Collaboration | Kin Kwan Leung (Layer 6 AI), Clayton Rooke (Univ. Waterloo), Jonathan Smith (Meta), Saba Zuberi (Layer 6 AI), Maksims Volkovs (Layer 6 AI) |
| Pseudocode | Yes | Algorithm 1 WinIT (an illustrative sketch of the windowed-removal idea appears after the table) |
| Open Source Code | Yes | The code for our work is publicly available at https://github.com/layer6ai-labs/WinIT, which includes the detailed settings for experiments. |
| Open Datasets | Yes | MIMIC-III is a multivariate clinical time series dataset with a range of vital and lab measurements taken over time for around 40,000 patients at the Beth Israel Deaconess Medical Center in Boston, MA (Johnson et al., 2016). Spike is a benchmark dataset presented in Tonekaboni et al. (2020). |
| Dataset Splits | Yes | All evaluations are conducted over 5-fold cross-validation and averaged. We measure stability by splitting the training set into 5 folds and report results averaged across the folds with corresponding standard deviation error bars. (A sketch of this protocol follows the table.) |
| Hardware Specification | Yes | All experiments were performed with 40 Intel Xeon CPU@2.20GHz cores and Nvidia Titan V GPU. |
| Software Dependencies | No | The paper mentions software components like GRU models, Adam optimizer, and the Captum library, but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | For all experiments we use 1-layer GRU models for f with 200 hidden units. We use the Adam optimizer with learning rate 10^-3 (10^-4 for MIMIC-III) and weight decay 10^-3 to train the model. We also use a 1-layer GRU model for the generator with 50 hidden units, and train it by fitting a Gaussian distribution with diagonal covariance to reconstruct each feature N time steps forward. We use the Adam optimizer with learning rate 10^-4, weight decay 10^-3 and 300 epochs with early stopping. (A minimal PyTorch sketch of this setup follows the table.) |
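
The Experiment Setup row quotes the predictor's configuration. Below is a minimal PyTorch sketch of that setup; the names `GRUPredictor`, `n_features`, and `n_classes` are illustrative and not taken from the authors' repository.

```python
# A minimal sketch of the reported setup: a 1-layer GRU predictor with
# 200 hidden units, trained with Adam (lr 1e-3, weight decay 1e-3).
import torch
import torch.nn as nn

class GRUPredictor(nn.Module):
    def __init__(self, n_features: int, n_classes: int, hidden: int = 200):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden, num_layers=1, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):  # x: (batch, time, features)
        out, _ = self.gru(x)
        return self.head(out[:, -1])  # predict from the last time step

model = GRUPredictor(n_features=8, n_classes=2)  # feature/class counts are placeholders
# lr 1e-3 in general; the paper uses 1e-4 for MIMIC-III.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-3)
```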
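The Pseudocode row points to Algorithm 1 (WinIT). The following is an illustrative sketch of the windowed feature-removal idea behind it, not a reproduction of the authors' algorithm: the importance of feature `i` at time `t` for the prediction at time `T` is taken as the difference between the prediction shift from masking the window `[t, T]` and the shift from masking `[t+1, T]`. The carry-forward masking and L1 prediction shift used here are simplifying stand-ins; the paper instead imputes counterfactuals with the learned generative model described in the Experiment Setup row.

```python
# Illustrative windowed-removal importance, assuming `model` maps a sequence
# x of shape (batch, T+1, features) to class logits (e.g. GRUPredictor above).
import torch

def masked(x, feature, start, end):
    """Copy of x with `feature` over time steps [start, end] replaced by the
    last value observed before the window (zero if the window starts at t=0)."""
    x = x.clone()
    baseline = x[:, start - 1, feature] if start > 0 else torch.zeros(x.shape[0])
    x[:, start:end + 1, feature] = baseline.unsqueeze(1)
    return x

def prediction_shift(model, x, feature, start, end):
    """L1 distance between predictions with and without the masked window."""
    with torch.no_grad():
        p_full = torch.softmax(model(x), dim=-1)
        p_masked = torch.softmax(model(masked(x, feature, start, end)), dim=-1)
    return (p_full - p_masked).abs().sum(dim=-1)

def winit_style_importance(model, x, feature, t, T):
    """Importance of `feature` at time t for the prediction at time T, as the
    difference of two windowed removal scores."""
    shift_from_t = prediction_shift(model, x, feature, t, T)
    if t == T:
        return shift_from_t
    return shift_from_t - prediction_shift(model, x, feature, t + 1, T)

# Example: scores = winit_style_importance(model, x, feature=0, t=10, T=47)
```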
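The Dataset Splits row describes 5-fold evaluation with averaged results and standard-deviation error bars. A minimal sketch of that protocol, assuming scikit-learn's `KFold` and a placeholder linear classifier on synthetic data (the paper trains the GRU predictor on each fold instead):

```python
# 5-fold cross-validation with mean +/- std reporting, mirroring the
# evaluation protocol quoted above. Data and model are placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 48, 8))  # (samples, time steps, features)
y = rng.integers(0, 2, 100)            # binary labels

scores = []
for train_idx, val_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    # Placeholder model: flatten each time series and fit a linear classifier.
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X[train_idx].reshape(len(train_idx), -1), y[train_idx])
    probs = clf.predict_proba(X[val_idx].reshape(len(val_idx), -1))[:, 1]
    scores.append(roc_auc_score(y[val_idx], probs))

# Average across folds and report standard-deviation error bars.
print(f"AUROC: {np.mean(scores):.3f} +/- {np.std(scores):.3f}")
```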