Predictive State Recurrent Neural Networks

Authors: Carlton Downey, Ahmed Hefny, Byron Boots, Geoffrey J. Gordon, Boyue Li

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We apply PSRNNs to 4 datasets, and show that we outperform several popular alternative approaches to modeling dynamical systems in all cases."
Researcher Affiliation | Academia | Carlton Downey, Carnegie Mellon University, Pittsburgh, PA 15213 (cmdowney@cs.cmu.edu); Ahmed Hefny, Carnegie Mellon University, Pittsburgh, PA 15213 (ahefny@cs.cmu.edu); Boyue Li, Carnegie Mellon University, Pittsburgh, PA 15213 (boyue@cs.cmu.edu); Byron Boots, Georgia Tech, Atlanta, GA 30332 (bboots@cc.gatech.edu); Geoff Gordon, Carnegie Mellon University, Pittsburgh, PA 15213 (ggordon@cs.cmu.edu)
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper mentions "e.g., a PyTorch implementation of this architecture for text prediction can be found at https://github.com/pytorch/examples/tree/master/word_language_model", but this refers to a general PyTorch example, not the authors' own open-source code for the PSRNN methodology.
Open Datasets | Yes | Penn Tree Bank (PTB): a standard benchmark in the NLP community [36]. Handwriting: a digit database available on the UCI repository [37, 38], created using a pressure-sensitive tablet and a cordless stylus. Swimmer: the 3-link simulated swimmer robot from the open-source package OpenAI Gym (a hedged data-collection sketch for Swimmer appears after this table).
Dataset Splits | No | The paper specifies a train/test split for all datasets (e.g., "Penn Tree Bank (PTB) ... train/test split of 120780/124774 characters") but does not explicitly mention a validation split or its size.
Hardware Specification | No | The paper mentions "Due to hardware limitations" but does not provide specific details about the hardware used (e.g., GPU models, CPU types, or memory specifications).
Software Dependencies | No | The paper mentions "PyTorch or TensorFlow" as neural network libraries but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | "In two-stage regression we use a ridge parameter of 10^(-2)·n where n is the number of training examples... We use a horizon of 1 in the PTB experiments, and a horizon of 10 in all continuous experiments. We use 2000 RFFs from a Gaussian kernel... We use 20 hidden states, and a fixed learning rate of 1 in all experiments. We use a BPTT horizon of 35 in the PTB experiments, and an infinite BPTT horizon in all other experiments." (Hedged sketches of the RFF/ridge setup and the truncated-BPTT loop appear after this table.)
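For the Swimmer dataset, the paper does not publish a data-collection script. The following is a minimal sketch of gathering observation trajectories from the 3-link swimmer; it uses gymnasium (the maintained fork of the original OpenAI Gym), and the environment id Swimmer-v4, the random policy, and the rollout length are all illustrative assumptions, not the paper's settings.

```python
import gymnasium as gym  # maintained fork of the original OpenAI Gym
import numpy as np

# Collect one observation trajectory from the 3-link swimmer.
# Environment id, policy, and horizon are illustrative choices,
# not the paper's (unpublished) data-collection settings.
env = gym.make("Swimmer-v4")
obs, _ = env.reset(seed=0)

observations = [obs]
for _ in range(1000):
    action = env.action_space.sample()  # random exploration policy
    obs, reward, terminated, truncated, _ = env.step(action)
    observations.append(obs)
    if terminated or truncated:
        obs, _ = env.reset()
env.close()

data = np.stack(observations)  # (T+1, obs_dim) array for model training
```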
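The setup row quotes two concrete numerical choices: 2000 random Fourier features (RFFs) from a Gaussian kernel, and a ridge parameter of 10^(-2)·n in the two-stage regression. A minimal NumPy sketch of just those two pieces follows; the kernel bandwidth, random seed, and data shapes are assumptions, and this is not the paper's full two-stage regression pipeline.

```python
import numpy as np

def rff_features(X, n_features=2000, bandwidth=1.0, seed=0):
    """Map inputs through random Fourier features approximating a Gaussian
    (RBF) kernel; the paper uses 2000 RFFs, the bandwidth is assumed."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=1.0 / bandwidth, size=(X.shape[1], n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

def ridge_solve(Phi, Y):
    """Ridge regression with the paper's stated ridge parameter 10^-2 * n,
    where n is the number of training examples."""
    n, d = Phi.shape
    lam = 1e-2 * n
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(d), Phi.T @ Y)

# Illustrative usage on synthetic data (shapes are assumptions):
X = np.random.randn(500, 10)   # 500 training examples, 10 raw dims
Y = np.random.randn(500, 3)    # regression targets
Phi = rff_features(X)
W = ridge_solve(Phi, Y)        # (2000, 3) weight matrix
```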
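The BPTT horizon of 35 is typically realized by detaching the recurrent state every 35 steps, so gradients stop at the chunk boundary. The sketch below uses a GRU as a stand-in for the PSRNN cell and plain SGD with the quoted learning rate of 1; the cell, optimizer, loss, and dimensions are assumptions for illustration.

```python
import torch
import torch.nn as nn

bptt, hidden_dim, input_dim = 35, 20, 50
model = nn.GRU(input_dim, hidden_dim, batch_first=True)  # stand-in for the PSRNN cell
readout = nn.Linear(hidden_dim, input_dim)
params = list(model.parameters()) + list(readout.parameters())
opt = torch.optim.SGD(params, lr=1.0)  # "fixed learning rate of 1"; optimizer choice assumed

seq = torch.randn(1, 701, input_dim)  # one long synthetic sequence
h = torch.zeros(1, 1, hidden_dim)
for t in range(0, seq.size(1) - 1, bptt):
    x = seq[:, t:t + bptt]
    target = seq[:, t + 1:t + bptt + 1]  # next-step prediction targets
    out, h = model(x, h.detach())  # detach truncates the gradient at the horizon
    loss = nn.functional.mse_loss(readout(out), target)
    opt.zero_grad()
    loss.backward()
    opt.step()
```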