Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Middle-Out Decoding
Authors: Shikib Mehri, Leonid Sigal
NeurIPS 2018 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the aforementioned models on two sequence generation tasks. First, we evaluate middle-out decoders on the synthetic problem of de-noising a symmetric sequence. Next, we explore the problem of video captioning on the MSVD dataset (Chen and Dolan, 2011), evaluating our models for quality, diversity, and control. |
| Researcher Affiliation | Academia | Shikib Mehri Department of Computer Science University of British Columbia EMAIL Leonid Sigal Department of Computer Science University of British Columbia EMAIL |
| Pseudocode | No | The paper describes the model architecture and training procedures in text and with diagrams, but it does not include formal pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing the source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | We utilize frame-level features provided by Pasunuru and Bansal (2017). The videos were sampled at 3fps and passed through an Inception-v4 model (Szegedy et al., 2017), pretrained on ImageNet (Deng et al., 2009), to obtain 1536-dim feature vector for each frame. For this task, we utilize the MSVD (Youtube2Text) dataset (Chen and Dolan, 2011) |
| Dataset Splits | Yes | We use the standard splits provided by Venugopalan et al. (2015a) with 1200 training videos, 100 for validation and 670 for testing. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used to run its experiments. |
| Software Dependencies | No | The paper mentions software components like "LSTMs", "Adam optimizer", "word2vec", and "Inception-v4 model" but does not specify their version numbers, which is required for reproducible software dependencies. |
| Experiment Setup | Yes | We utilize 100-dimensional LSTMs and the Adam optimizer (Kingma and Ba, 2015) with a learning rate of 1e-4. We train the models for 20,000 steps with a batch size of 32. For all of our models, we use 1024-dimensional LSTMs, 512-dimensional embeddings... and the Adam optimizer with a learning rate of 1e-4. We utilize a batch size of 32 and train for 15 epochs. We employ a scheduled sampling training strategy (Bengio et al., 2015), which has greatly improved results in image captioning. We begin with a sampling rate of 0 and increase the sampling rate every epoch by 0.05, with a maximum sampling rate of 0.25. |
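
The scheduled sampling schedule quoted under Experiment Setup (start at 0, add 0.05 per epoch, cap at 0.25) can be sketched as below. This is a minimal illustration and not the authors' code: the function name `sampling_rate`, the zero-indexed epoch convention, and the rounding step are assumptions made here for clarity.

```python
def sampling_rate(epoch: int, step: float = 0.05, max_rate: float = 0.25) -> float:
    """Scheduled-sampling rate for a given epoch.

    Assumption: epochs are zero-indexed, so the first epoch uses a rate of 0.
    The defaults (0.05 increment, 0.25 cap) are the values quoted in the table.
    """
    # Round to suppress floating-point noise such as 3 * 0.05 = 0.15000000000000002.
    return min(max_rate, round(epoch * step, 10))


# Rates over the 15 training epochs reported for the video captioning models.
schedule = [sampling_rate(epoch) for epoch in range(15)]
print(schedule)
```

Under these assumptions the rate reaches its cap at the sixth epoch (index 5) and stays there for the rest of training.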