Reference Based LSTM for Image Captioning

Authors: Minghai Chen, Guiguang Ding, Sicheng Zhao, Hui Chen, Qiang Liu, Jungong Han

AAAI 2017

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We evaluate the proposed method on the benchmark dataset MS COCO and the results demonstrate the significant superiority over the state-of-the-art approaches." |
| Researcher Affiliation | Academia | "School of Software, Tsinghua University, Beijing 100084, China" and "Northumbria University, Newcastle, NE1 8ST, UK" |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper provides a link (https://github.com/karpathy/neuraltalk), but it is cited for a publicly available split of the MS COCO dataset used "as in previous works", not as the authors' own source code for their method. |
| Open Datasets | Yes | "we carry out experiments on the popular MS COCO dataset, which contains 123,287 images labeled with at least 5 captions by different AMT workers. Since there is no standardized split on MS COCO, we use the public available split as in previous works (Karpathy and Li 2015; Xu et al. 2015; You et al. 2016, etc.)." |
| Dataset Splits | Yes | "Since there is no standardized split on MS COCO, we use the public available split as in previous works (Karpathy and Li 2015; Xu et al. 2015; You et al. 2016, etc.)." A sketch of loading this split follows the table. |
| Hardware Specification | No | The paper does not mention any specific hardware, such as GPU or CPU models, used for the experiments. |
| Software Dependencies | No | The paper mentions using the VGG-16 model but does not name any software libraries or frameworks with version numbers (e.g., PyTorch 1.9, TensorFlow 2.x). A hedged feature-extraction sketch follows the table. |
| Experiment Setup | Yes | "the beam size K used in the beam search is set to 10." ... "We report the results when α₁ = 0.2, α₂ = 0.4 in the following experiments unless otherwise specified." A minimal beam-search sketch follows the table. |
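
The Dataset Splits row refers to the publicly available "Karpathy split" of MS COCO. The sketch below shows one way to read that split; the file name `dataset_coco.json` and its JSON layout are assumptions based on the split file distributed with Karpathy's neuraltalk work, not anything specified in the paper.

```python
import json
from collections import defaultdict

# Load the publicly available "Karpathy split" of MS COCO.
# File name and JSON layout are assumptions based on the split file
# released alongside neuraltalk (dataset_coco.json).
with open("dataset_coco.json") as f:
    data = json.load(f)

splits = defaultdict(list)
for img in data["images"]:
    # Each image entry carries a 'split' tag: train / restval / val / test.
    splits[img["split"]].append(img["filename"])

for name, files in splits.items():
    print(f"{name}: {len(files)} images")
```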
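
The Software Dependencies row notes that the paper names VGG-16 without specifying a framework. Below is a minimal sketch of extracting the 4096-d fc7 feature that captioning models of this era typically consumed, assuming torchvision purely for illustration; the authors' actual toolchain is unknown.

```python
import torch
from PIL import Image
from torchvision import models, transforms

# The paper names VGG-16 but no framework; torchvision here is an
# assumption for illustration, not the authors' actual toolchain.
vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
vgg.eval()

# Keep everything up to the fc7 linear layer (4096-d output).
fc7 = torch.nn.Sequential(
    vgg.features, vgg.avgpool, torch.nn.Flatten(),
    *list(vgg.classifier.children())[:-3],
)

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img = preprocess(Image.open("example.jpg").convert("RGB")).unsqueeze(0)
with torch.no_grad():
    feat = fc7(img)  # shape: (1, 4096)
```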
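
The Experiment Setup row quotes a beam size of K = 10. The following is a minimal, framework-agnostic beam-search sketch of the kind of decoding that quote describes; the `step` function, token ids, and length cap are hypothetical stand-ins, not the paper's R-LSTM decoder.

```python
import math

def beam_search(step, bos_id, eos_id, beam_size=10, max_len=20):
    """Minimal beam search over a token-level decoder.

    `step(prefix)` is a hypothetical stand-in for the paper's decoder:
    it takes a list of token ids and returns (token_id, probability)
    pairs for the next position.
    """
    beams = [([bos_id], 0.0)]  # (prefix, log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            for tok, prob in step(prefix):
                candidates.append((prefix + [tok], score + math.log(prob)))
        # Keep only the K highest-scoring partial captions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for prefix, score in candidates[:beam_size]:
            if prefix[-1] == eos_id:
                finished.append((prefix, score))
            else:
                beams.append((prefix, score))
        if not beams:
            break
    finished.extend(beams)  # keep captions truncated at max_len
    return max(finished, key=lambda c: c[1])[0]
```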