Middle-Out Decoding

Authors: Shikib Mehri, Leonid Sigal

NeurIPS 2018

Reproducibility Assessment
Each entry lists the reproducibility variable, its result, and the supporting LLM response.

Research Type: Experimental
"We evaluate the aforementioned models on two sequence generation tasks. First, we evaluate middle-out decoders on the synthetic problem of de-noising a symmetric sequence. Next, we explore the problem of video captioning on the MSVD dataset (Chen and Dolan, 2011), evaluating our models for quality, diversity, and control."

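For intuition about the synthetic task, here is a minimal sketch of how a (noisy input, clean target) pair might be constructed. The half-length, vocabulary size, and token-replacement corruption model are assumptions for illustration, not the paper's exact setup.

```python
import random

def make_symmetric_denoising_pair(half_len=5, vocab_size=20, noise_p=0.2, seed=None):
    """Build one (noisy input, clean target) pair for the synthetic task.

    The clean target is a palindrome: a random half mirrored onto itself.
    The input corrupts it by independently replacing each token with
    probability `noise_p` (this corruption model is an assumption).
    """
    rng = random.Random(seed)
    half = [rng.randrange(vocab_size) for _ in range(half_len)]
    clean = half + half[::-1]  # symmetric sequence of length 2 * half_len
    noisy = [rng.randrange(vocab_size) if rng.random() < noise_p else tok
             for tok in clean]
    return noisy, clean

print(make_symmetric_denoising_pair(seed=0))
```
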
Researcher Affiliation: Academia
"Shikib Mehri, Department of Computer Science, University of British Columbia, amehri@cs.cmu.edu; Leonid Sigal, Department of Computer Science, University of British Columbia, lsigal@cs.ubc.ca"

Pseudocode: No
The paper describes the model architecture and training procedures in text and with diagrams, but it does not include formal pseudocode or algorithm blocks.

Open Source Code: No
The paper does not contain an explicit statement about releasing source code or a link to a code repository for the described methodology.

Open Datasets: Yes
"We utilize frame-level features provided by Pasunuru and Bansal (2017). The videos were sampled at 3fps and passed through an Inception-v4 model (Szegedy et al., 2017), pretrained on ImageNet (Deng et al., 2009), to obtain a 1536-dim feature vector for each frame." "For this task, we utilize the MSVD (YouTube2Text) dataset (Chen and Dolan, 2011)"

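The paper reuses features released by Pasunuru and Bansal (2017) rather than recomputing them, but the quoted pipeline is straightforward to sketch. Below is a rough illustration using the timm library's ImageNet-pretrained inception_v4; the library choice and preprocessing are assumptions, not the original extraction code.

```python
import torch
import timm
from timm.data import resolve_data_config, create_transform

# ImageNet-pretrained Inception-v4; num_classes=0 returns the global-pooled
# 1536-dim feature vector instead of classification logits.
model = timm.create_model("inception_v4", pretrained=True, num_classes=0).eval()
transform = create_transform(**resolve_data_config({}, model=model))

@torch.no_grad()
def frame_features(frames):
    """frames: list of PIL.Image objects sampled from a video at 3 fps.

    Returns a (num_frames, 1536) tensor, one feature vector per frame.
    """
    batch = torch.stack([transform(f) for f in frames])
    return model(batch)
```
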
Dataset Splits: Yes
"We use the standard splits provided by Venugopalan et al. (2015a) with 1200 training videos, 100 for validation and 670 for testing."

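The split is fully determined once the clips are taken in the standard order; a minimal sketch, assuming MSVD's 1,970 clips are indexed 0–1969 in that order:

```python
video_ids = list(range(1970))  # MSVD has 1,970 clips in total
train, val, test = video_ids[:1200], video_ids[1200:1300], video_ids[1300:]
assert (len(train), len(val), len(test)) == (1200, 100, 670)
```
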
Hardware Specification: No
The paper does not explicitly describe the hardware (e.g., GPU/CPU models, memory) used to run its experiments.

Software Dependencies: No
The paper mentions software components such as LSTMs, the Adam optimizer, word2vec, and an Inception-v4 model, but does not specify version numbers, which are needed for reproducible software dependencies.

Experiment Setup: Yes
"We utilize 100-dimensional LSTMs and the Adam optimizer (Kingma and Ba, 2015) with a learning rate of 1e-4. We train the models for 20,000 steps with a batch size of 32." "For all of our models, we use 1024-dimensional LSTMs, 512-dimensional embeddings... and the Adam optimizer with a learning rate of 1e-4. We utilize a batch size of 32 and train for 15 epochs. We employ a scheduled sampling training strategy (Bengio et al., 2015), which has greatly improved results in image captioning. We begin with a sampling rate of 0 and increase the sampling rate every epoch by 0.05, with a maximum sampling rate of 0.25."

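The scheduled-sampling schedule quoted above is fully specified, so it can be sketched directly. The helper name and loop structure below are illustrative, not the authors' code; the rate starts at 0, grows by 0.05 per epoch, and saturates at 0.25.

```python
def sampling_rate(epoch, step=0.05, cap=0.25):
    """Scheduled-sampling rate for a 0-indexed epoch: starts at 0,
    grows by `step` each epoch, and is capped at `cap`."""
    return min(epoch * step, cap)

for epoch in range(15):  # the captioning models train for 15 epochs
    p = sampling_rate(epoch)  # probability of feeding back the model's own sample
    # the quoted setup would train here with Adam (lr=1e-4) and batch size 32
    print(f"epoch {epoch:2d}: sampling rate = {p:.2f}")
```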