Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Variational Smoothing in Recurrent Neural Network Language Models
Authors: Lingpeng Kong, Gabor Melis, Wang Ling, Lei Yu, Dani Yogatama
ICLR 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically verify our analysis on two benchmark language modeling datasets and demonstrate performance improvements over existing data noising methods. |
| Researcher Affiliation | Industry | Lingpeng Kong, Gabor Melis, Wang Ling, Lei Yu, Dani Yogatama DeepMind EMAIL |
| Pseudocode | No | The paper does not include any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide any link or statement about making its source code publicly available. |
| Open Datasets | Yes | We evaluate our approaches on two standard language modeling datasets: Penn Treebank (Marcus et al., 1994) and Wikitext-2 (Merity et al., 2017). |
| Dataset Splits | No | The paper mentions using a 'development set' and 'test set' but does not specify the exact percentages or sample counts for training, validation, and test splits, nor does it cite predefined splits with specific details. |
| Hardware Specification | No | The paper does not specify the exact hardware (e.g., GPU/CPU models, memory, or specific computing infrastructure) used to run the experiments. |
| Software Dependencies | No | The paper mentions software components like LSTM and RMSprop but does not provide specific version numbers for these or other dependencies required for reproducibility. |
| Experiment Setup | Yes | We tune the RMSprop learning rate and ℓ2 regularization hyperparameter λ for all models on a development set by a grid search on {0.002, 0.003, 0.004} and $\{10^{-4}, 10^{-3}\}$ respectively, and use perplexity on the development set to choose the best model. We also tune γ from {0.1, 0.2, 0.3, 0.4}. We use recurrent dropout (Semeniuta et al., 2016) for R and set it to 0.2, and apply (element-wise) input and output embedding dropouts for E and O and set it to 0.5 when $E, O \in \mathbb{R}^{\lvert V \rvert \times 512}$ and 0.7 when $E, O \in \mathbb{R}^{\lvert V \rvert \times 1024}$ based on preliminary experiments. We tie the input and output embedding matrices in all our experiments (i.e., E = O), except for the vanilla LSTM model, where we report results for both tied and untied. |
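
The search described in the Experiment Setup row can be sketched as a simple grid enumeration. This is a minimal illustration, not the authors' code (the paper releases none). It assumes the ℓ2 values are 10^-4 and 10^-3, and that γ is searched jointly with the learning rate and λ, which the quoted text does not state explicitly. The names `build_grid`, `select_best`, and `dev_perplexity` are hypothetical.

```python
from itertools import product

# Search spaces quoted in the Experiment Setup row.
LEARNING_RATES = [0.002, 0.003, 0.004]  # RMSprop learning rate
L2_LAMBDAS = [1e-4, 1e-3]               # assumed to be 10^-4 and 10^-3
GAMMAS = [0.1, 0.2, 0.3, 0.4]           # gamma values listed in the row

# Fixed settings quoted in the same row.
RECURRENT_DROPOUT = 0.2
EMBEDDING_DROPOUT = {512: 0.5, 1024: 0.7}  # keyed by embedding dimension


def build_grid(embedding_dim):
    """Return every hyperparameter configuration for one embedding size.

    Assumption: gamma is crossed with the learning rate and lambda;
    the source only says gamma is "also" tuned.
    """
    return [
        {
            "learning_rate": lr,
            "l2_lambda": lam,
            "gamma": gamma,
            "recurrent_dropout": RECURRENT_DROPOUT,
            "embedding_dropout": EMBEDDING_DROPOUT[embedding_dim],
            "embedding_dim": embedding_dim,
            "tie_embeddings": True,  # E = O in all but the vanilla LSTM runs
        }
        for lr, lam, gamma in product(LEARNING_RATES, L2_LAMBDAS, GAMMAS)
    ]


def select_best(configs, dev_perplexity):
    """Pick the configuration with the lowest development-set perplexity.

    dev_perplexity is a hypothetical callable (config -> float) standing
    in for a full train-and-evaluate run.
    """
    return min(configs, key=dev_perplexity)
```

Under these assumptions, `build_grid(512)` yields 3 × 2 × 4 = 24 configurations per embedding size, and `select_best` applies the selection rule the row describes: lowest perplexity on the development set.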