Diverse Image Captioning with Context-Object Split Latent Spaces

Authors: Shweta Mahajan, Stefan Roth

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response (evidence quoted from the paper)
Research Type | Experimental | "We evaluate our COS-CVAE approach on the standard COCO dataset and on the held-out COCO dataset consisting of images with novel objects, showing significant gains in accuracy and diversity." (Sec. 4, Experiments): "To show the advantages of our method for diverse and accurate image captioning, we perform experiments on the COCO dataset [29]"
Researcher Affiliation | Academia | "Shweta Mahajan, Stefan Roth, Dept. of Computer Science, TU Darmstadt, {mahajan@aiphes, stefan.roth@visinf}.tu-darmstadt.de"
Pseudocode | No | The paper presents architectural diagrams (Fig. 2) and descriptive text for the approach but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | "Code available at https://github.com/visinf/cos-cvae"
Open Datasets | Yes | "We evaluate our COS-CVAE approach on the standard COCO dataset and on the held-out COCO dataset consisting of images with novel objects, showing significant gains in accuracy and diversity. [We] perform experiments on the COCO dataset [29], consisting of 82 783 training and 40 504 validation images, each with five captions. We additionally perform experiments on the held-out COCO dataset [17]"
Dataset Splits | Yes | "Consistent with [6, 14, 44], we use 118 287 train, 4000 validation, and 1000 test images. We additionally perform experiments on the held-out COCO dataset [17] to show that our COS-CVAE framework can be extended to training on images with novel objects. This dataset is a subset of the COCO dataset and excludes all the image-text pairs containing at least one of the eight specific objects (in any one of the human annotations) in COCO: bottle, bus, couch, microwave, pizza, racket, suitcase, and zebra. The training set consists of 70 000 images. For this setting, COCO validation [29] is split into two equal halves for validation and test data." A minimal sketch of this held-out filtering appears after the table.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments.
Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions, or library versions).
Experiment Setup | No | The paper states "We consider 20 and 100 samples of z, consistent with prior work." and "More details can be found in the supplemental material." but does not provide specific hyperparameters (e.g., learning rate, batch size, number of epochs) or system-level training settings in the main text. A sketch of this multi-sample evaluation protocol appears after the table.
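
To make the "Dataset Splits" row concrete, here is a minimal sketch of the held-out filtering it quotes, assuming pycocotools and the COCO 2014 caption annotations. The substring match over captions is our simplification of "containing at least one of the eight specific objects (in any one of the human annotations)", not the authors' exact pipeline, and the annotation file path is a placeholder.

```python
# Hedged sketch: rebuilding the held-out COCO split from the "Dataset Splits" row.
# Assumes pycocotools is installed and the COCO 2014 caption annotations are local.
from pycocotools.coco import COCO

HELD_OUT = ["bottle", "bus", "couch", "microwave",
            "pizza", "racket", "suitcase", "zebra"]

coco = COCO("annotations/captions_train2014.json")  # placeholder path

def mentions_held_out(img_id):
    """True if any human caption for this image names a held-out object."""
    anns = coco.loadAnns(coco.getAnnIds(imgIds=img_id))
    return any(obj in ann["caption"].lower() for ann in anns for obj in HELD_OUT)

# Keep only image-text pairs that never mention a held-out object.
train_ids = [i for i in coco.getImgIds() if not mentions_held_out(i)]
print(f"{len(train_ids)} training images remain after filtering")
```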
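
The "Experiment Setup" row quotes drawing 20 and 100 samples of z per image at evaluation time. The sketch below shows that protocol for a generic conditional-VAE captioner; `prior_net` and `decoder` are hypothetical stand-ins, since the paper's context-object split latent space and its hyperparameters are not specified in the main text.

```python
# Hedged sketch of multi-sample caption generation: one decoded caption per
# latent sample z ~ p(z | image). Module names and shapes are hypothetical.
import torch

def sample_captions(image_features, prior_net, decoder, num_samples=20):
    """Decode num_samples captions, one per sample from the conditional prior."""
    mu, logvar = prior_net(image_features)           # conditional prior parameters
    std = torch.exp(0.5 * logvar)
    captions = []
    for _ in range(num_samples):
        z = mu + std * torch.randn_like(std)         # reparameterized sample
        captions.append(decoder(image_features, z))  # one caption per z
    return captions
```

Accuracy and diversity metrics would then be computed over the 20 or 100 decoded captions per image, consistent with the quoted evaluation setting.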