Input Switched Affine Networks: An RNN Architecture Designed for Interpretability
Authors: Jakob N. Foerster, Justin Gilmer, Jascha Sohl-Dickstein, Jan Chorowski, David Sussillo
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We trained RNNs on the Text8 Wikipedia dataset and the billion word benchmark (BWB) for one-step-ahead character prediction. The results on Text8 are shown in Table 1. |
| Researcher Affiliation | Collaboration | (1) Work performed while an intern at Google Brain. (2) Work done as a member of the Google Brain Residency program (g.co/brainresidency). (3) Google Brain, Mountain View, CA, USA. (4) Work performed while the author was visiting faculty at Google Brain. Correspondence to: Jakob N. Foerster <jakob.foerster@cs.ox.ac.uk>, David Sussillo <sussillo@google.com>. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access to source code for the methodology described. |
| Open Datasets | Yes | We trained RNNs on the Text8 Wikipedia dataset... The Text8 dataset consists only of the 27 characters a-z and _ (space)... (Mahoney, 2011). We trained RNNs on... the billion word benchmark (BWB)... (Chelba et al., 2013). |
| Dataset Splits | Yes | For the Text8 dataset, we split the data into 90%, 5%, and 5% for train, validation, and test respectively, in line with (Mikolov et al., 2012). (A minimal reproduction sketch of this split is given after the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers). |
| Experiment Setup | No | The paper states 'The network was trained with the same hyperparameter tuning infrastructure as in (Collins et al., 2016)' and 'Due to long experiment running times, we manually tuned the hyperparameters,' but it does not explicitly list the specific hyperparameter values or other concrete experimental setup details. |
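The dataset split and prediction task quoted above are concrete enough to sketch. Below is a minimal, illustrative Python reproduction of the 90/5/5 Text8 split and the one-step-ahead character targets; since the paper releases no code, the file path, function names, and vocabulary handling here are assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): reproduce the 90/5/5 Text8 split
# described in the paper and build one-step-ahead character-prediction pairs.
# Assumes the raw `text8` file (Mahoney, 2011) is available locally.
import numpy as np

def load_text8(path="text8"):
    # Text8 is a single long string over 27 characters: a-z plus space
    # (rendered as "_" in the paper's description).
    with open(path, "r") as f:
        return f.read()

def split_90_5_5(text):
    # 90% train, 5% validation, 5% test, taken in document order,
    # matching the split the paper attributes to (Mikolov et al., 2012).
    n = len(text)
    train_end = int(0.90 * n)
    valid_end = int(0.95 * n)
    return text[:train_end], text[train_end:valid_end], text[valid_end:]

def one_step_ahead_pairs(text):
    # For one-step-ahead character prediction, the target at position t
    # is simply the character at position t + 1.
    vocab = sorted(set(text))
    char_to_id = {c: i for i, c in enumerate(vocab)}
    ids = np.array([char_to_id[c] for c in text], dtype=np.int32)
    return ids[:-1], ids[1:]  # (inputs, targets)

if __name__ == "__main__":
    train, valid, test = split_90_5_5(load_text8())
    x, y = one_step_ahead_pairs(train)
    print(len(train), len(valid), len(test), x.shape, y.shape)
```

Note that the split is contiguous rather than shuffled, which is the standard convention for character-level language modeling on Text8; the paper does not state otherwise.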