Show and Tell More: Topic-Oriented Multi-Sentence Image Captioning

Authors: Yuzhao Mao, Chang Zhou, Xiaojie Wang, Ruifan Li

IJCAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on both sentence and paragraph datasets demonstrate the effectiveness of our TOMS in terms of topical consistency and descriptive completeness.
Researcher Affiliation | Academia | Yuzhao Mao, Chang Zhou, Xiaojie Wang, Ruifan Li, Center for Intelligence Science and Technology, School of Computer Science, Beijing University of Posts and Telecommunications, {maoyuzhao,elani,xjwang,rfli}@bupt.edu.cn
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks.
Open Source Code | No | The paper cites external tools and frameworks (coco-caption, PyTorch, neuraltalk2) but does not provide a link or availability statement for the source code of the proposed TOMS model.
Open Datasets | Yes | First are standard datasets, including Flickr8k [Hodosh et al., 2013], Flickr30k [Young et al., 2014], and COCO [Lin et al., 2014], for sentence-level MS captioning; second is a paragraph dataset collected by [Krause et al., 2017] for paragraph-level MS captioning.
Dataset Splits | Yes | The same preprocessing and data splits as previous works [Karpathy and Fei-Fei, 2015; Krause et al., 2017] are used in our experiments.
Hardware Specification | No | The paper does not explicitly describe the hardware used for experiments.
Software Dependencies | No | The paper mentions software like PyTorch, ResNet-152, coco-caption, and the Stanford natural language parser, but does not provide specific version numbers for these dependencies.
Experiment Setup | Yes | We implement a two-layer LSTM with a hidden dimension of 512 in each layer. Both topic and word embedding sizes are set to 256. In the FGU, topic and image representations are 512-dimensional vectors, and context representations are 1024-dimensional. Dropout is adopted in both the input and output layers.
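For readers checking whether the reported configuration is internally consistent, below is a minimal PyTorch sketch that wires together the hyperparameters quoted in the Experiment Setup row: a two-layer LSTM with 512 hidden units, 256-dimensional topic and word embeddings, 512-dimensional topic and image vectors fused into a 1024-dimensional context, and dropout on the input and output layers. This is not the authors' released code; the class name TopicOrientedDecoder, the 2048-dimensional ResNet-152 image feature, the simple sigmoid-gated fusion standing in for the paper's FGU, and the vocabulary/topic counts in the usage example are all assumptions made for illustration.

```python
# Minimal sketch (not the authors' code): a two-layer LSTM decoder built from
# the hyperparameters quoted in the Experiment Setup row. Names such as
# TopicOrientedDecoder and the gated fusion below are hypothetical; the
# paper's exact FGU formulation is only approximated.
import torch
import torch.nn as nn


class TopicOrientedDecoder(nn.Module):
    def __init__(self, vocab_size, num_topics,
                 embed_size=256, hidden_size=512, context_size=1024,
                 dropout=0.5):
        super().__init__()
        # Topic and word embeddings, both 256-dimensional as reported.
        self.word_embed = nn.Embedding(vocab_size, embed_size)
        self.topic_embed = nn.Embedding(num_topics, embed_size)
        # Project topic and image features to 512-d, matching the FGU description.
        self.topic_proj = nn.Linear(embed_size, hidden_size)
        self.image_proj = nn.Linear(2048, hidden_size)  # assumed ResNet-152 feature size
        # Gate that mixes topic and image into a 1024-d context vector
        # (a plain sigmoid-gated sum; the paper's FGU may differ).
        self.gate = nn.Linear(2 * hidden_size, 2 * hidden_size)
        assert 2 * hidden_size == context_size
        # Two-layer LSTM with hidden dimension 512; dropout on input and output.
        self.in_drop = nn.Dropout(dropout)
        self.lstm = nn.LSTM(embed_size + context_size, hidden_size,
                            num_layers=2, batch_first=True)
        self.out_drop = nn.Dropout(dropout)
        self.fc = nn.Linear(hidden_size, vocab_size)

    def forward(self, words, topic_ids, image_feats):
        # words: (B, T) token ids, topic_ids: (B,), image_feats: (B, 2048)
        t = self.topic_proj(self.topic_embed(topic_ids))      # (B, 512)
        v = self.image_proj(image_feats)                      # (B, 512)
        concat = torch.cat([t, v], dim=-1)                    # (B, 1024)
        context = torch.sigmoid(self.gate(concat)) * concat   # gated fusion
        w = self.in_drop(self.word_embed(words))              # (B, T, 256)
        ctx = context.unsqueeze(1).expand(-1, w.size(1), -1)  # (B, T, 1024)
        out, _ = self.lstm(torch.cat([w, ctx], dim=-1))       # (B, T, 512)
        return self.fc(self.out_drop(out))                    # (B, T, vocab)


if __name__ == "__main__":
    # Hypothetical vocabulary and topic counts, chosen only for the smoke test.
    model = TopicOrientedDecoder(vocab_size=10000, num_topics=80)
    logits = model(torch.randint(0, 10000, (2, 12)),
                   torch.randint(0, 80, (2,)),
                   torch.randn(2, 2048))
    print(logits.shape)  # torch.Size([2, 12, 10000])
```

Running the script prints torch.Size([2, 12, 10000]), which at least confirms that the dimensions quoted in the setup (256, 512, 1024) fit together in a single decoder without contradiction.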