Temporal Dependencies in Feature Importance for Time Series Prediction

Authors: Kin Kwan Leung, Clayton Rooke, Jonathan Smith, Saba Zuberi, Maksims Volkovs

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct an extensive empirical study on synthetic and real-world data, compare against a wide range of leading explainability methods, and explore the impact of various evaluation strategies. Our results show that WinIT achieves significant gains over existing methods, with more consistent performance across different evaluation metrics.
Researcher Affiliation | Collaboration | Kin Kwan Leung (Layer 6 AI), Clayton Rooke (Univ. Waterloo), Jonathan Smith (Meta), Saba Zuberi (Layer 6 AI), Maksims Volkovs (Layer 6 AI)
Pseudocode | Yes | Algorithm 1: WinIT
Open Source Code | Yes | The code for our work is publicly available at https://github.com/layer6ai-labs/WinIT, which includes the detailed settings for experiments.
Open Datasets | Yes | MIMIC-III is a multivariate clinical time series dataset with a range of vital and lab measurements taken over time for around 40,000 patients at the Beth Israel Deaconess Medical Center in Boston, MA (Johnson et al., 2016). The Spike dataset is a benchmark experiment presented in Tonekaboni et al. (2020).
Dataset Splits | Yes | All evaluations are conducted over 5-fold cross-validation and averaged. We measure stability by splitting the training set into 5 folds and report results averaged across the folds with corresponding standard deviation error bars.
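The 5-fold protocol described above (disjoint folds, per-fold scores averaged, standard deviation reported as error bars) can be sketched in plain Python; the function names below are illustrative and do not come from the authors' code:

```python
import random

def five_fold_splits(n_samples, n_folds=5, seed=0):
    """Partition sample indices into n_folds disjoint folds; each round
    holds out one fold for evaluation and trains on the remaining four."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    splits = []
    for k in range(n_folds):
        val = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        splits.append((train, val))
    return splits

def mean_and_std(scores):
    """Average the per-fold scores and compute the standard deviation
    used for the error bars."""
    m = sum(scores) / len(scores)
    var = sum((s - m) ** 2 for s in scores) / len(scores)
    return m, var ** 0.5
```

Averaging across folds with a standard deviation, rather than reporting a single split, is what lets the paper quantify stability of the explanations.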
Hardware Specification | Yes | All experiments were performed with 40 Intel Xeon CPU @ 2.20GHz cores and an Nvidia Titan V GPU.
Software Dependencies | No | The paper mentions software components like GRU models, the Adam optimizer, and the Captum library, but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | For all experiments we use 1-layer GRU models for f with 200 hidden units. We use the Adam optimizer with learning rate 1e-3 (1e-4 for MIMIC-III) and weight decay 1e-3 to train the model. We also use a 1-layer GRU model for the generator with 50 hidden units, and train it by fitting a Gaussian distribution with diagonal covariance to reconstruct each feature N time steps forward. We use the Adam optimizer with learning rate 1e-4, weight decay 1e-3, and 300 epochs with early stopping.
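The reported hyperparameters can be collected into a plain config sketch for reference; the key names below are illustrative assumptions, not the authors' actual configuration (see the linked repository for the real settings):

```python
# Hypothetical summary of the hyperparameters reported above.
PREDICTOR = {
    "model": "GRU",
    "num_layers": 1,
    "hidden_units": 200,
    "optimizer": "Adam",
    "learning_rate": 1e-3,  # 1e-4 for MIMIC-III
    "weight_decay": 1e-3,
}

GENERATOR = {
    "model": "GRU",
    "num_layers": 1,
    "hidden_units": 50,
    "output": "diagonal-covariance Gaussian, N steps forward",
    "optimizer": "Adam",
    "learning_rate": 1e-4,
    "weight_decay": 1e-3,
    "epochs": 300,
    "early_stopping": True,
}
```

Note that the predictor and the generator use the same optimizer family but different learning rates and model sizes, since the generator only needs to reconstruct individual features rather than predict the target.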