Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations
Authors: David Krueger, Tegan Maharaj, Janos Kramar, Mohammad Pezeshki, Nicolas Ballas, Nan Rosemary Ke, Anirudh Goyal, Yoshua Bengio, Aaron Courville, Christopher Pal
ICLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform an empirical investigation of various RNN regularizers, and find that zoneout gives significant performance improvements across tasks. We achieve competitive results with relatively simple models in character- and word-level language modelling on the Penn Treebank and Text8 datasets, and combining with recurrent batch normalization (Cooijmans et al., 2016) yields state-of-the-art results on permuted sequential MNIST. |
| Researcher Affiliation | Academia | 1 MILA, Université de Montréal, EMAIL. 2 École Polytechnique de Montréal, EMAIL. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code for replicating all experiments can be found at: http://github.com/teganmaharaj/zoneout |
| Open Datasets | Yes | We evaluate zoneout's performance on the following tasks: (1) Character-level language modelling on the Penn Treebank corpus (Marcus et al., 1993); (3) Character-level language modelling on the Text8 corpus (Mahoney, 2011); (4) Classification of hand-written digits on permuted sequential MNIST (pMNIST) (Le et al., 2015). |
| Dataset Splits | No | The paper mentions "Validation BPC" and "Validation error rates" and uses a validation set for metrics, but does not explicitly provide the specific size, percentage, or methodology for the train/validation split. |
| Hardware Specification | No | The paper mentions "computing resources provided by Compute Canada and Calcul Quebec" but does not specify exact hardware details such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions "Theano (Theano Development Team, 2016), Fuel, and Blocks (van Merriënboer et al., 2015)" but does not specify exact version numbers for these software dependencies. |
| Experiment Setup | Yes | "For the character-level task, we train networks with one layer of 1000 hidden units. We train LSTMs with a learning rate of 0.002 on overlapping sequences of 100 in batches of 32, optimize using Adam, and clip gradients with threshold 1. ... For the word-level task... 2 layers of 1500 units, with weights initialized uniformly [-0.04, +0.04]. The model is trained for 14 epochs with learning rate 1, after which the learning rate is reduced by a factor of 1.15 after each epoch. Gradient norms are clipped at 10." and "All models have a single layer of 100 units, and are trained for 150 epochs using RMSProp (Tieleman & Hinton, 2012) with a decay rate of 0.5 for the moving average of gradient norms. The learning rate is set to 0.001 and the gradients are clipped to a maximum norm of 1 (Pascanu et al., 2012)." |
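
The word-level learning-rate schedule quoted in the Experiment Setup row can be read in more than one way, so a small sketch may help. The Python function below assumes that epochs 1 through 14 use a learning rate of 1 and that every later epoch divides the previous rate by 1.15. The function name, its parameters, and the 1-indexed epoch convention are illustrative assumptions, not taken from the paper or its released code.

```python
def word_level_learning_rate(epoch, base_lr=1.0, constant_epochs=14, decay_factor=1.15):
    """Return the learning rate for a 1-indexed epoch.

    Assumed reading of the quoted setup: the rate stays at base_lr for the
    first `constant_epochs` epochs, then is divided by `decay_factor` once
    per additional epoch. This is a hypothetical helper for illustration.
    """
    if epoch < 1:
        raise ValueError("epoch is 1-indexed and must be >= 1")
    # Number of decay steps applied so far (zero during the constant phase).
    decay_steps = max(0, epoch - constant_epochs)
    return base_lr / (decay_factor ** decay_steps)


if __name__ == "__main__":
    # Print the rate around the point where decay is assumed to begin.
    for epoch in (1, 14, 15, 16, 20):
        print(epoch, round(word_level_learning_rate(epoch), 4))
```

Under this reading, epoch 15 is the first to train with a reduced rate. If the paper's code instead applies the first reduction one epoch earlier or later, only `constant_epochs` would need to change.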