Sparse Deep Learning for Time Series Data: Theory and Applications
Authors: Mingxuan Zhang, Yan Sun, Faming Liang
NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our numerical results show that sparse deep learning outperforms state-of-the-art methods, such as conformal predictions, in prediction uncertainty quantification for time series data. Furthermore, our results indicate that the proposed method can consistently identify the autoregressive order for time series data and outperform existing methods in large-scale model compression. |
| Researcher Affiliation | Academia | Mingxuan Zhang, Department of Statistics, Purdue University, zhan3692@purdue.edu; Yan Sun, Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, yan.sun@pennmedicine.upenn.edu; Faming Liang, Department of Statistics, Purdue University, fmliang@purdue.edu |
| Pseudocode | Yes | Algorithm 1 gives the prior annealing procedure [21]. |
| Open Source Code | No | The paper does not provide a direct link to its source code or explicitly state that the code for its methodology is released. |
| Open Datasets | Yes | We conduct experiments using three publicly available real-world datasets: Medical Information Mart for Intensive Care (MIMIC-III), electroencephalography (EEG) data, and daily COVID-19 case numbers within the United Kingdom's local authority districts (COVID-19) [33]. The EEG dataset, available here, served as the primary source for the EEG signal time series. [...] The dataset can be found here. [referring to COVID-19] [...] MIMIC-III dataset requires PhysioNet credentialing. |
| Dataset Splits | Yes | Each dataset consists of training, validation, and test sequences. The training sequence has 10000 samples, while the validation and test sequences each contain 1000 samples. |
| Hardware Specification | No | The paper describes the prediction models and training settings but does not specify the hardware (e.g., GPU, CPU models) used for the experiments. |
| Software Dependencies | No | The paper mentions optimizers like SGD and Adam, but it does not specify software dependencies with version numbers (e.g., Python, PyTorch, or specific library versions). |
| Experiment Setup | Yes | For all the methods considered, we use the same underlying neural network model: an MLP with one hidden layer of size 100 and the sigmoid activation function. More details on the training process are provided in Appendix G.1. For all CP baselines, we train the MLP for 300 epochs using SGD with a constant learning rate of 0.001 and momentum of 0.9; the batch size is set to 100. [...] For our method PA, we train for a total of 300 epochs with the same learning rate, momentum, and batch size. We use T1 = 150, T2 = 160, T3 = 260. We run SGD with momentum for t < T1 and SGHMC with temperature = 1 for t >= T1. For the mixture Gaussian prior, we fix σ²_{1,n} = 0.01, (σ^{init}_{1,n})² = 1e-5, (σ^{end}_{1,n})² = 1e-6, and λ_n = 1e-7. Tables 6, 7, 9, 10, 12 and 13 also provide detailed hyperparameters. [Illustrative sketches of this setup follow the table.] |
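
The Experiment Setup row above pins down the shared backbone and the baseline optimizer settings (MLP with one hidden layer of 100 sigmoid units; SGD with learning rate 0.001, momentum 0.9, batch size 100, 300 epochs). Below is a minimal sketch of that configuration, assuming a PyTorch-style training loop; the helper name `make_mlp`, the placeholder data, and the input dimension are illustrative assumptions, not the authors' code.

```python
# Minimal sketch of the shared backbone and baseline training settings quoted above.
# All names and the random placeholder dataset are illustrative, not the authors' code.
import torch
import torch.nn as nn

def make_mlp(in_dim: int, out_dim: int, hidden: int = 100) -> nn.Module:
    # One hidden layer of size 100 with the sigmoid activation, as stated in the paper.
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.Sigmoid(), nn.Linear(hidden, out_dim))

model = make_mlp(in_dim=20, out_dim=1)  # input dimension is a placeholder
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
loader = torch.utils.data.DataLoader(   # placeholder data standing in for a training sequence
    torch.utils.data.TensorDataset(torch.randn(10000, 20), torch.randn(10000, 1)),
    batch_size=100, shuffle=True,
)

loss_fn = nn.MSELoss()
for epoch in range(300):                # constant learning rate throughout
    for x, y in loader:
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()
        optimizer.step()
```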
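
For the paper's PA method, the same row specifies a three-phase schedule with T1 = 150, T2 = 160, T3 = 260: SGD with momentum before T1, then SGHMC at temperature 1, together with the quoted mixture Gaussian prior hyperparameters. The sketch below only illustrates how such a schedule could be wired up; the linear interpolation of the annealed prior variance over [T2, T3] is an assumption, and the paper's Algorithm 1 and Appendix G.1 define the actual procedure.

```python
# Illustrative sketch of the three-phase prior-annealing schedule and the quoted
# mixture Gaussian prior hyperparameters. The linear decay over [T2, T3] is an
# assumption; consult Algorithm 1 and Appendix G.1 of the paper for the exact steps.
T1, T2, T3 = 150, 160, 260
SIGMA1_SQ = 1e-2           # sigma^2_{1,n}, the larger (slab) variance
SIGMA_ANNEAL_INIT = 1e-5   # (sigma^init_{1,n})^2, annealed variance at the start
SIGMA_ANNEAL_END = 1e-6    # (sigma^end_{1,n})^2, annealed variance at the end
LAMBDA_N = 1e-7            # mixture weight lambda_n

def annealed_variance(epoch: int) -> float:
    """Assumed linear decay of the annealed prior variance between epochs T2 and T3."""
    if epoch < T2:
        return SIGMA_ANNEAL_INIT
    if epoch >= T3:
        return SIGMA_ANNEAL_END
    frac = (epoch - T2) / (T3 - T2)
    return SIGMA_ANNEAL_INIT + frac * (SIGMA_ANNEAL_END - SIGMA_ANNEAL_INIT)

def sampler_for(epoch: int) -> str:
    """SGD with momentum before T1, SGHMC at temperature 1 afterwards (as quoted)."""
    return "SGD+momentum" if epoch < T1 else "SGHMC (temperature = 1)"

for epoch in (0, 149, 150, 200, 260, 299):
    print(epoch, sampler_for(epoch), annealed_variance(epoch))
```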