HiPPO: Recurrent Memory with Optimal Polynomial Projections
Authors: Albert Gu, Tri Dao, Stefano Ermon, Atri Rudra, Christopher Ré
NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 Empirical Validation: The HiPPO dynamics are simple recurrences that can be easily incorporated into various models. We validate three claims that suggest that, when incorporated into a simple RNN, these methods (especially HiPPO-LegS) yield a recurrent architecture with improved memory capability. In Section 4.1, the HiPPO-LegS RNN outperforms other RNN approaches on benchmark long-term dependency tasks for RNNs. Section 4.2 shows that the HiPPO-LegS RNN is much more robust to timescale shifts than other RNN and neural ODE models. Section 4.3 validates the distinct theoretical advantages of the HiPPO-LegS memory mechanism, allowing fast and accurate online function reconstruction over millions of time steps. (A minimal sketch of this recurrence follows the table.) |
| Researcher Affiliation | Academia | Department of Computer Science, Stanford University; Department of Computer Science and Engineering, University at Buffalo, SUNY. {albertgu,trid}@stanford.edu, ermon@cs.stanford.edu, atri@buffalo.edu, chrismre@cs.stanford.edu |
| Pseudocode | No | The paper includes mathematical equations for continuous and discrete time dynamics and an illustration of the framework, but it does not contain explicit pseudocode or algorithm blocks labeled as such. |
| Open Source Code | Yes | Code for reproducing our experiments is available at https://github.com/HazyResearch/hippo-code. |
| Open Datasets | Yes | On the benchmark permuted MNIST dataset, our hyperparameter-free HiPPO-LegS method achieves a new state-of-the-art accuracy of 98.3%. |
| Dataset Splits | No | Table 1 reports "Val. acc. (%)" for the pMNIST task, indicating a validation set was used. However, the main text does not explicitly state the percentages or sample counts for the training, validation, and test splits needed to reproduce the data partitioning. |
| Hardware Specification | No | Section 4.3 mentions that the HiPPO-LegS operator can perform updates "on a single CPU core," but it does not specify the model or type of CPU, nor does it provide details about other hardware components, such as the GPUs used for training. |
| Software Dependencies | No | The paper states, "We implement the fast update in C++ with Pytorch binding." However, it does not provide specific version numbers for PyTorch, C++, or any other software libraries or dependencies used in the experiments. |
| Experiment Setup | No | The paper describes its model architecture by stating that "All methods have the same hidden size in our experiments" and that "HiPPO variants tie the memory size N to the hidden state dimension d." It also mentions a "classification head trained with cross-entropy." However, specific numerical values for hyperparameters such as the hidden size, learning rate, batch size, or optimizer settings are not given in the main text; the reader is referred to Appendix F.1 for the full architecture. |
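
To illustrate the "simple recurrences" quoted in the Research Type row, below is a minimal NumPy sketch of the HiPPO-LegS update, assuming a plain forward-Euler discretization of the continuous-time dynamics. The function names are illustrative only; the paper's released code instead uses a generalized bilinear transform and a fast C++ kernel with PyTorch bindings.

```python
import numpy as np

def hippo_legs_matrices(N):
    """Build the HiPPO-LegS state matrices (A, B) as defined in the paper:
    A[n, k] = sqrt(2n+1)*sqrt(2k+1) if n > k, n+1 if n == k, 0 otherwise;
    B[n] = sqrt(2n+1)."""
    q = np.sqrt(2 * np.arange(N) + 1)
    A = np.zeros((N, N))
    for n in range(N):
        for k in range(N):
            if n > k:
                A[n, k] = q[n] * q[k]
            elif n == k:
                A[n, k] = n + 1
    B = q.copy()
    return A, B

def legs_online_update(f, N=32):
    """Run the discretized LegS recurrence
        c_{k+1} = (I - A/(k+1)) c_k + (1/(k+1)) B f_k
    over a 1-D input sequence f (forward-Euler sketch, not the paper's exact
    discretization). Returns the final Legendre projection coefficients."""
    A, B = hippo_legs_matrices(N)
    I = np.eye(N)
    c = np.zeros(N)
    for k, f_k in enumerate(f, start=1):
        c = (I - A / k) @ c + (B / k) * f_k
    return c

# Example: project a random length-1000 signal onto N=32 Legendre coefficients.
coeffs = legs_online_update(np.random.randn(1000), N=32)
```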