Review Networks for Caption Generation

Authors: Zhilin Yang, Ye Yuan, Yuexin Wu, William W. Cohen, Russ R. Salakhutdinov

NeurIPS 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, we show that our framework improves over state-of-the-art encoder-decoder systems on the tasks of image captioning and source code captioning.
Researcher Affiliation | Academia | Zhilin Yang, Ye Yuan, Yuexin Wu, Ruslan Salakhutdinov, William W. Cohen; School of Computer Science, Carnegie Mellon University; {zhiliny,yey1,yuexinw,rsalakhu,wcohen}@cs.cmu.edu
Pseudocode | No | The paper describes the model architecture and mathematical formulations through text and equations (e.g., Eq. 1, 2, 3) and provides architectural diagrams (Figure 1, Figure 2), but it does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code and data available at https://github.com/kimiyoung/review_net.
Open Datasets | Yes | We evaluate our model on the MSCOCO benchmark dataset [2] for image captioning. The dataset contains 123,000 images with at least 5 captions for each image. For offline evaluation, we use the same data split as in [7, 20, 21]... We experiment with a benchmark dataset for source code captioning, Habeas Corpus [11]. Habeas Corpus collects nine popular open-source Java code repositories...
Dataset Splits | Yes | For offline evaluation, we use the same data split as in [7, 20, 21], where we reserve 5,000 images for development and test respectively and use the rest for training. ... We randomly sample 10% of the files as the test set, 10% as the development set, and use the rest for training.
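The random 10%/10%/80% file split described for Habeas Corpus can be sketched as follows. This is a minimal illustration of the stated splitting procedure; the function name, seed, and fraction parameters are assumptions, not taken from the authors' released code:

```python
import random

def split_dataset(items, dev_frac=0.1, test_frac=0.1, seed=0):
    """Randomly split items into train/dev/test sets.

    Mirrors the paper's description: 10% of files as the test set,
    10% as the development set, and the rest for training.
    The fixed seed is an illustrative choice for reproducibility.
    """
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_test = int(n * test_frac)
    n_dev = int(n * dev_frac)
    test = shuffled[:n_test]
    dev = shuffled[n_test:n_test + n_dev]
    train = shuffled[n_test + n_dev:]
    return train, dev, test
```

For the MSCOCO offline evaluation, the paper instead reuses a fixed published split (5,000 development and 5,000 test images) rather than resampling.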
Hardware Specification | Yes | Unlike these methods, our approach with the review network is a generic end-to-end encoder-decoder model and can be trained within six hours on a Titan X GPU.
Software Dependencies | No | The paper mentions specific CNN architectures like VGGNet [13] and Inception-v3 [16], and uses LSTM units. However, it does not provide specific version numbers for general software dependencies or libraries (e.g., 'TensorFlow 1.x' or 'Python 3.x').
Experiment Setup | Yes | We set the number of review steps T_r = 8, the weighting factor λ = 10.0, the dimension of word embeddings to be 100, the learning rate to be 1e-2, and the dimension of LSTM hidden states to be 1,024. These hyperparameters are tuned on the development set. ... For source code captioning, we set the number of review steps T_r = 8, the dimension of word embeddings to be 50, and the dimension of the LSTM hidden states to be 256.
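The reported hyperparameters for the two tasks can be collected into configuration dictionaries like the ones below. The dictionary and key names are illustrative assumptions; only the numeric values come from the paper (and the image-captioning learning rate is reconstructed as 1e-2 from a garbled extraction):

```python
# Image captioning on MSCOCO, as reported in the paper.
IMAGE_CAPTIONING = {
    "review_steps": 8,       # T_r, number of review steps
    "lambda_weight": 10.0,   # weighting factor λ
    "embedding_dim": 100,    # word embedding dimension
    "learning_rate": 1e-2,   # reconstructed from the garbled "1e 2"
    "lstm_hidden_dim": 1024, # LSTM hidden state dimension
}

# Source code captioning on Habeas Corpus, as reported in the paper.
CODE_CAPTIONING = {
    "review_steps": 8,       # T_r
    "embedding_dim": 50,
    "lstm_hidden_dim": 256,
}
```

Per the review row above, these values were tuned on the development set, so a reproduction attempt should treat them as starting points rather than fixed constants.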