Memory-Gated Recurrent Networks

Authors: Yaquan Zhang, Qi Wu, Nanbo Peng, Min Dai, Jing Zhang, Hu Wang

AAAI 2021, pp. 10956-10963 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Through a combination of comprehensive simulation studies and empirical experiments on a range of public datasets, we show that our proposed mGRN architecture consistently outperforms state-of-the-art architectures targeting multivariate time series."
Researcher Affiliation | Collaboration | Yaquan Zhang (3), Qi Wu (2*), Nanbo Peng (1), Min Dai (4), Jing Zhang (2), Hu Wang (1). Affiliations: (1) JD Digits; (2) City University of Hong Kong; (3) National University of Singapore, Department of Mathematics and Risk Management Institute; (4) National University of Singapore, Department of Mathematics, Risk Management Institute, and Chongqing & Suzhou Research Institutes.
Pseudocode | No | The paper describes the model with mathematical equations but does not include structured pseudocode or algorithm blocks. (An illustrative sketch of the marginal/joint layout appears after this table.)
Open Source Code | Yes | https://github.com/yaquanzhang/mGRN
Open Datasets | Yes | "All data sets are publicly available. All results, except for those of mGRN, are taken from the two papers... MIMIC-III data set (Johnson et al. 2016)... UEA multivariate time series data sets (Bagnall et al. 2018)."
Dataset Splits | Yes | "The first 70,000 observations are used to train models. The next 15,000 observations are used for validation. The final performance is evaluated in the last 15,000 observations." (See the split sketch after this table.)
Hardware Specification | Yes | "Experiments are coded with Pytorch (Paszke et al. 2019) and performed on NVIDIA TITAN Xp GPUs with 12 GB memory."
Software Dependencies | No | Experiments are coded with PyTorch (Paszke et al. 2019), but no specific version number for PyTorch or any other software dependency is provided.
Experiment Setup | Yes | "To make predictions, we include observations of the past 5 steps in neural networks... we limit the total number of trainable parameters to be around 1.8 thousand for all models... We also tune the learning rates via grid searches within {10^-4, 5×10^-4, 10^-3}." The authors also perform grid searches on hyperparameters such as variable grouping, the dimensions of the marginal and joint components, learning rates, and dropout rates. (See the grid-search sketch after this table.)
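
The paper specifies its model only through equations, so the sketch below illustrates just the general layout the quoted hyperparameters refer to: one recurrent "marginal" component per variable group whose states feed a shared "joint" component. The class name, the use of plain GRUs, and the wiring are assumptions for exposition; the actual mGRN cell uses its own memory gating as defined in the paper's equations.

```python
# Hypothetical marginal/joint layout, NOT the authors' mGRN cell.
import torch
import torch.nn as nn

class MarginalJointRNN(nn.Module):
    def __init__(self, group_sizes, marginal_dim, joint_dim):
        super().__init__()
        self.group_sizes = group_sizes
        # One "marginal" GRU per variable group.
        self.marginals = nn.ModuleList(
            nn.GRU(g, marginal_dim, batch_first=True) for g in group_sizes
        )
        # A "joint" GRU reads the concatenated marginal states.
        self.joint = nn.GRU(marginal_dim * len(group_sizes), joint_dim,
                            batch_first=True)

    def forward(self, x):  # x: (batch, time, sum(group_sizes))
        outputs, start = [], 0
        for gru, size in zip(self.marginals, self.group_sizes):
            out, _ = gru(x[:, :, start:start + size])
            outputs.append(out)
            start += size
        joint_out, _ = self.joint(torch.cat(outputs, dim=-1))
        return joint_out

# Eight variables in two groups, observed over the past 5 steps.
model = MarginalJointRNN(group_sizes=[3, 5], marginal_dim=8, joint_dim=16)
states = model(torch.randn(4, 5, 8))  # -> (4, 5, 16)
```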
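
A minimal sketch of the chronological 70,000/15,000/15,000 split quoted in the "Dataset Splits" row, assuming the observations are already time-ordered; the names are illustrative, not taken from the authors' code.

```python
# Hypothetical chronological train/validation/test split.
import numpy as np

def chronological_split(series: np.ndarray):
    """Split a time-ordered array into train/validation/test blocks."""
    train = series[:70_000]        # first 70,000 observations: training
    val = series[70_000:85_000]    # next 15,000: validation
    test = series[85_000:100_000]  # last 15,000: final evaluation
    return train, val, test

data = np.random.randn(100_000, 8)  # placeholder multivariate series
train, val, test = chronological_split(data)
```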
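
A minimal sketch of the learning-rate grid search under the roughly 1.8-thousand-parameter budget quoted in the "Experiment Setup" row; the stand-in model, the budget check, and the training placeholder are assumptions for illustration.

```python
# Hypothetical grid search over the quoted learning rates.
import torch
import torch.nn as nn

def count_trainable(model: nn.Module) -> int:
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

def make_model() -> nn.Module:
    # Tiny GRU stand-in; the hidden size keeps the parameter count near
    # the ~1.8k budget so that all compared architectures stay comparable.
    return nn.GRU(input_size=8, hidden_size=16, batch_first=True)

best = None
for lr in (1e-4, 5e-4, 1e-3):  # grid quoted from the paper
    model = make_model()
    assert count_trainable(model) < 2_000, "over the rough parameter budget"
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    # ... train on the 70k block, evaluate on the 15k validation block ...
    val_loss = float("inf")  # placeholder for the real validation metric
    if best is None or val_loss < best[0]:
        best = (val_loss, lr)
print("selected learning rate:", best[1])
```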