Input Switched Affine Networks: An RNN Architecture Designed for Interpretability

Authors: Jakob N. Foerster, Justin Gilmer, Jascha Sohl-Dickstein, Jan Chorowski, David Sussillo

ICML 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We trained RNNs on the Text8 Wikipedia dataset and the billion word benchmark (BWB), for one-step-ahead character prediction. The results on Text8 are shown in Table 1.
Researcher Affiliation | Collaboration | (1) This work was performed as an intern at Google Brain. (2) Work done as a member of the Google Brain Residency program (g.co/brainresidency). (3) Google Brain, Mountain View, CA, USA. (4) Work performed when author was a visiting faculty at Google Brain. Correspondence to: Jakob N. Foerster <jakob.foerster@cs.ox.ac.uk>, David Sussillo <sussillo@google.com>.
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any concrete access to source code for the methodology described.
Open Datasets | Yes | We trained RNNs on the Text8 Wikipedia dataset... The Text8 dataset consists only of the 27 characters a-z and _ (space)... (Mahoney, 2011). We trained RNNs on... the billion word benchmark (BWB)... (Chelba et al., 2013).
Dataset Splits | Yes | For the Text8 dataset, we split the data into 90%, 5%, and 5% for train, validation, and test respectively, in line with (Mikolov et al., 2012). (See the data-preparation sketch after this table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used to run its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers).
Experiment Setup | No | The paper states 'The network was trained with the same hyperparameter tuning infrastructure as in (Collins et al., 2016)' and 'Due to long experiment running times, we manually tuned the hyperparameters,' but it does not explicitly list the specific hyperparameter values or other concrete experimental setup details.
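
The Open Datasets and Dataset Splits rows above quote a 27-character Text8 vocabulary (a-z plus space, written as _) and a 90%/5%/5% train/validation/test split used for one-step-ahead character prediction. The Python sketch below is a minimal, hypothetical illustration of that data handling, not the authors' pipeline; the file path, helper names, and sequence length are assumptions.

```python
# Minimal sketch (not the authors' code): preparing Text8 for one-step-ahead
# character prediction with the 90% / 5% / 5% train/validation/test split
# quoted in the table above. Path, function names, and seq_len are illustrative.

import numpy as np

VOCAB = "abcdefghijklmnopqrstuvwxyz_"          # the 27 Text8 characters: a-z plus space ('_')
CHAR_TO_ID = {c: i for i, c in enumerate(VOCAB)}


def load_text8(path="text8"):
    """Read the raw Text8 file and map each character to an integer id.

    Text8 stores spaces as ' '; they are mapped to '_' as in the paper's description.
    """
    with open(path, "r") as f:
        text = f.read()
    return np.array([CHAR_TO_ID[c if c != " " else "_"] for c in text], dtype=np.int32)


def split_90_5_5(ids):
    """Split the id sequence contiguously into 90% train, 5% validation, 5% test."""
    n = len(ids)
    n_train = int(0.90 * n)
    n_valid = int(0.05 * n)
    return ids[:n_train], ids[n_train:n_train + n_valid], ids[n_train + n_valid:]


def one_step_ahead_pairs(ids, seq_len=128):
    """Yield (input, target) windows where targets are the inputs shifted by one character."""
    for start in range(0, len(ids) - seq_len - 1, seq_len):
        yield ids[start:start + seq_len], ids[start + 1:start + seq_len + 1]


if __name__ == "__main__":
    ids = load_text8()
    train_ids, valid_ids, test_ids = split_90_5_5(ids)
    x, y = next(one_step_ahead_pairs(train_ids))
    print(len(train_ids), len(valid_ids), len(test_ids), x.shape, y.shape)
```

The contiguous split mirrors the convention cited from (Mikolov et al., 2012); whether the authors used this exact windowing scheme is not specified in the paper.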