Do RNN and LSTM have Long Memory?
Authors: Jingyu Zhao, Feiqing Huang, Jia Lv, Yanjie Duan, Zhen Qin, Guodong Li, Guangjian Tian
ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This section reports several numerical experiments. We first compare the models using time series forecasting tasks on four long memory datasets and one short memory dataset. Then, we investigate the effect of the model parameter K on the forecasting performance. Lastly, we apply the proposed models to two sentiment analysis tasks. |
| Researcher Affiliation | Collaboration | ¹Department of Statistics and Actuarial Science, The University of Hong Kong, Hong Kong, China; ²Huawei Noah's Ark Lab, Hong Kong, China. |
| Pseudocode | No | Explanation: The paper provides mathematical equations and descriptions of the model structures (e.g., in Section 3.1 for MRNN) but does not include formal pseudocode or an algorithm block. |
| Open Source Code | Yes | Our implementation in PyTorch is available at https://github.com/huawei-noah/noah-research/tree/master/mRNN-mLSTM. |
| Open Datasets | Yes | Metro interstate traffic volume: the raw dataset contains hourly Interstate 94 Westbound traffic volume for MnDOT ATR station 301, roughly midway between Minneapolis and St Paul, MN, obtained from the MN Department of Transportation (UCI). We convert it to de-seasoned daily data with length 1860 (1400 + 200 + 259). Reference: UCI Machine Learning Repository, Metro Interstate Traffic Volume Data Set, https://archive.ics.uci.edu/ml/datasets/Metro+Interstate+Traffic+Volume, 2019. Accessed: 2019-12-28. (A hedged preprocessing sketch follows the table.) |
| Dataset Splits | Yes | We split the datasets into training, validation and test sets, and report their lengths below using the notation (n_train + n_val + n_test). MSE is the target loss function for training. We perform one-step rolling forecasts and calculate test RMSE, MAE, and MAPE. ARFIMA series: we generated a series of length 4001 (2000 + 1200 + 800) using the model (1 − 0.7B + 0.4B²)(1 − B)^0.4 Y_t = (1 − 0.2B)ε_t, which exhibits an obvious long memory effect. (A simulation sketch follows the table.) |
| Hardware Specification | No | Explanation: The paper does not provide specific details about the hardware used, such as exact GPU or CPU models, memory specifications, or processor types. |
| Software Dependencies | No | All the networks are implemented in PyTorch. |
| Experiment Setup | Yes | We use the Adam algorithm with learning rate 0.01 for optimization. The optimization is stopped when the loss function drops by less than 10⁻⁵, has been increasing for 100 steps, or has reached 1000 steps in total. The learned model is chosen to be the one with the smallest loss on the validation set. (A training-loop sketch follows the table.) |
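
The three sketches below expand on rows of the table; none of them is taken from the authors' released code. First, the Open Datasets row says only that the hourly UCI traffic series is converted to de-seasoned daily data. The pandas sketch below shows one plausible reading of that step: the file name, the aggregation by daily sum, and the day-of-week de-seasoning are all assumptions, while the `date_time` and `traffic_volume` column names come from the UCI release.

```python
import pandas as pd

# Hypothetical file name for the UCI "Metro Interstate Traffic Volume" data.
df = pd.read_csv("Metro_Interstate_Traffic_Volume.csv", parse_dates=["date_time"])

# Aggregate the hourly counts to daily totals (assumed aggregation).
daily = df.set_index("date_time")["traffic_volume"].resample("D").sum()

# Remove weekly seasonality by subtracting each day-of-week mean
# (assumed de-seasoning; the paper does not describe its procedure).
deseasoned = daily - daily.groupby(daily.index.dayofweek).transform("mean")
```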
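
Second, the Dataset Splits row quotes the ARFIMA model used to generate the synthetic long memory series. A minimal NumPy sketch of that simulation, assuming a truncated binomial expansion for the fractional integration (1 − B)^(−0.4), Gaussian innovations, and a 500-point burn-in (none of which is specified in the paper):

```python
import numpy as np

def frac_integration_weights(d, n):
    """Binomial weights of (1 - B)^(-d): psi_0 = 1, psi_j = psi_{j-1} * (j - 1 + d) / j."""
    psi = np.empty(n)
    psi[0] = 1.0
    for j in range(1, n):
        psi[j] = psi[j - 1] * (j - 1 + d) / j
    return psi

def simulate_arfima(n, d=0.4, burn=500, seed=0):
    """Simulate (1 - 0.7B + 0.4B^2)(1 - B)^d Y_t = (1 - 0.2B) eps_t."""
    rng = np.random.default_rng(seed)
    total = n + burn
    eps = rng.standard_normal(total)
    # ARMA(2, 1) part: x_t = 0.7 x_{t-1} - 0.4 x_{t-2} + eps_t - 0.2 eps_{t-1}.
    x = np.zeros(total)
    for t in range(total):
        x[t] = eps[t]
        if t >= 1:
            x[t] += 0.7 * x[t - 1] - 0.2 * eps[t - 1]
        if t >= 2:
            x[t] -= 0.4 * x[t - 2]
    # Fractional integration: y_t = sum_j psi_j * x_{t-j}, truncated at lag t.
    psi = frac_integration_weights(d, total)
    y = np.array([np.dot(psi[:t + 1][::-1], x[:t + 1]) for t in range(total)])
    return y[burn:]

series = simulate_arfima(4001)
# Split following the reported (n_train + n_val + n_test) = (2000 + 1200 + 800).
train, val, test = series[:2000], series[2000:3200], series[3200:]
```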
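
Third, the Experiment Setup row gives the optimizer and stopping rules. The PyTorch sketch below is one way to wire those rules together; it assumes the 10⁻⁵ improvement tolerance and the "increasing for 100 steps" counter are both evaluated on the training loss at every step, which the paper does not state explicitly.

```python
import torch

def train(model, train_loader, x_val, y_val, max_steps=1000, patience=100, tol=1e-5):
    opt = torch.optim.Adam(model.parameters(), lr=0.01)  # Adam, learning rate 0.01
    loss_fn = torch.nn.MSELoss()                         # MSE is the training target
    best_val, best_state = float("inf"), None
    prev_loss, rising, step = float("inf"), 0, 0
    while step < max_steps:
        for x, y in train_loader:
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
            step += 1

            # Keep the parameters with the smallest validation loss.
            with torch.no_grad():
                val_loss = loss_fn(model(x_val), y_val).item()
            if val_loss < best_val:
                best_val = val_loss
                best_state = {k: v.clone() for k, v in model.state_dict().items()}

            # Stopping rules: training loss drops by less than 1e-5, has been
            # rising for 100 consecutive steps, or 1000 steps have elapsed.
            rising = rising + 1 if loss.item() > prev_loss else 0
            improvement = prev_loss - loss.item()
            if 0 <= improvement < tol or rising >= patience or step >= max_steps:
                model.load_state_dict(best_state)
                return model
            prev_loss = loss.item()
    if best_state is not None:
        model.load_state_dict(best_state)
    return model
```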