MC-LSTM: Mass-Conserving LSTM
Authors: Pieter-Jan Hoedt, Frederik Kratzert, Daniel Klotz, Christina Halmich, Markus Holzleitner, Grey S. Nearing, Sepp Hochreiter, Günter Klambauer
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | MC-LSTMs set a new state-of-the-art for neural arithmetic units at learning arithmetic operations, such as addition tasks, which have a strong conservation law, as the sum is constant over time. Further, MC-LSTM is applied to traffic forecasting, modeling a damped pendulum, and a large benchmark dataset in hydrology, where it sets a new state-of-the-art for predicting peak flows. (Section 5, Experiments) In the following, we demonstrate the broad applicability and high predictive performance of MC-LSTM in settings where mass conservation is required. |
| Researcher Affiliation | Collaboration | (1) ELLIS Unit Linz, LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria; (2) Google Research, Mountain View, CA, USA; (3) Institute of Advanced Research in Artificial Intelligence (IARAI). |
| Pseudocode | No | The paper describes the architecture using mathematical equations and a schematic diagram (Figure 1) but does not provide pseudocode or a clearly labeled algorithm block. |
| Open Source Code | Yes | Code for the experiments can be found at https://github.com/ml-jku/mc-lstm |
| Open Datasets | Yes | An ensemble of 10 MC-LSTMs was trained on 10 years of data from 447 basins using the publicly-available CAMELS dataset (Newman et al., 2015; Addor et al., 2017). |
| Dataset Splits | Yes | For training, we sampled sequences of length 100 with two random numbers (between 0 and 0.5) that had to be summed up. For testing, we used 100 000 sequences that were not used during training. (Appendix B.1.1) The training and validation sets were constructed by randomly sampling 10 000 input sequences of length between 1 and 20. (Appendix B.1.2) We used the same split into calibration and validation periods as in (Kratzert et al., 2019b): The period 1980–1999 for calibration and 2000–2009 for validation. (Appendix B.4.2) |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or detailed computer specifications used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library names with version numbers, needed to replicate the experiment. |
| Experiment Setup | Yes | For training, we sampled sequences of length 100 with two random numbers (between 0 and 0.5) that had to be summed up. We used batch sizes of 128. We trained for 50 epochs using the Adam optimizer (Kingma & Ba, 2015) with a learning rate of 0.001. The LSTM had 20 hidden units and was initialized using orthogonal initialization (Saxe et al., 2013). We used a high initial forget gate bias of 1.0 (Gers & Schmidhuber, 2000). (Appendix B.1.1) |
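
The addition-task configuration quoted in the "Dataset Splits" and "Experiment Setup" rows (sequences of length 100 with two random values in [0, 0.5] to be summed, an LSTM with 20 hidden units, orthogonal initialization, a high initial forget-gate bias of 1.0, Adam with learning rate 0.001, batch size 128, 50 epochs) can be sketched as below. This is a minimal PyTorch reconstruction of the described LSTM baseline setup, not the authors' released code (see https://github.com/ml-jku/mc-lstm); the two-channel input encoding (value plus marker), the number of steps per epoch, and all names are assumptions for illustration.

```python
# Hypothetical sketch of the quoted addition-task training setup; not the
# authors' implementation (their code is at github.com/ml-jku/mc-lstm).
import torch
import torch.nn as nn

SEQ_LEN, HIDDEN, BATCH, EPOCHS = 100, 20, 128, 50
STEPS_PER_EPOCH = 100  # assumption; the quoted setup does not specify this

def make_batch(batch_size=BATCH, seq_len=SEQ_LEN):
    """Addition task: channel 0 carries random values in [0, 0.5), channel 1
    marks the two positions whose values must be summed (assumed encoding)."""
    values = torch.rand(batch_size, seq_len, 1) * 0.5
    marks = torch.zeros(batch_size, seq_len, 1)
    idx = torch.stack([torch.randperm(seq_len)[:2] for _ in range(batch_size)])
    marks.scatter_(1, idx.unsqueeze(-1), 1.0)
    targets = (values * marks).sum(dim=1)          # sum of the two marked values
    return torch.cat([values, marks], dim=-1), targets

class LSTMBaseline(nn.Module):
    def __init__(self, hidden=HIDDEN):
        super().__init__()
        self.lstm = nn.LSTM(input_size=2, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)
        for name, p in self.lstm.named_parameters():
            if "weight" in name:
                nn.init.orthogonal_(p)             # orthogonal initialization
            else:
                nn.init.zeros_(p)
        # PyTorch packs LSTM biases as [input, forget, cell, output] gates;
        # set the forget-gate slice to the high initial bias of 1.0.
        self.lstm.bias_hh_l0.data[hidden:2 * hidden] = 1.0

    def forward(self, x):
        out, _ = self.lstm(x)
        return self.head(out[:, -1])               # predict from the last step

model = LSTMBaseline()
optim = torch.optim.Adam(model.parameters(), lr=1e-3)
for epoch in range(EPOCHS):
    for _ in range(STEPS_PER_EPOCH):
        x, y = make_batch()
        optim.zero_grad()
        loss = nn.functional.mse_loss(model(x), y)
        loss.backward()
        optim.step()
```

Freshly sampling each training batch mirrors the quoted description of randomly sampled training sequences; evaluation on the 100 000 held-out test sequences mentioned in the "Dataset Splits" row would follow the same `make_batch` pattern with a fixed, pre-generated set.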