Feature Deformation Meta-Networks in Image Captioning of Novel Objects

Authors: Tingjia Cao, Ke Han, Xiaomei Wang, Lin Ma, Yanwei Fu, Yu-Gang Jiang, Xiangyang Xue (pp. 10494-10501)

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments are conducted on the widely used novel object captioning dataset, and the results show the effectiveness of our FDM-net. Ablation study and qualitative visualization further give insights of our model.
Researcher Affiliation | Collaboration | ¹Shanghai Key Lab of Intelligent Information Processing, School of Computer Science, Fudan University; ²School of Data Science, and MOE Frontiers Center for Brain Science, Fudan University; ³Tencent AI Lab
Pseudocode | No | The paper describes the steps of the method in paragraph text but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository.
Open Datasets | Yes | We follow the novel object captioning split (NOC split) introduced by (Anne Hendricks et al. 2016) to evaluate our proposed method. It comes from the standard split of MSCOCO 2014 (Chen et al. 2015) that contains 120K images, and each image is labelled with five human-annotated sentences. ... To evaluate the expandability of our method, we also conduct experiments on the Open Image dataset, a large-scale dataset.
Dataset Splits | Yes | In the standard validation dataset, half of the pairs are randomly selected into the new validation set, and the others are selected into the test set.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers.
Experiment Setup | Yes | In our model, we use the pre-trained bottom-up attention model to extract visual features. To make a fair comparison, we use traditional cross-entropy loss during training. The RoI features of novel objects come from Open Image. Specifically, we use the mis-labelled probability strategy (MLS) to select top similar seen objects for each unseen object. As shown in Tab. 1, the top three similar seen objects are considered to conduct the replacing work with their corresponding novel objects. That means we set k = 3. Besides, the constrained beam search (CBS) algorithm (Koehn 2016) is also applied in the test and validation stage. For ensuring the diversity of our augmented dataset, we extract 100 novel object features as resources for the following replacement.
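The Experiment Setup row describes selecting, for each novel object, the k = 3 seen objects with the highest mis-labelled probability (MLS) as candidates for feature replacement. A minimal sketch of that selection step, assuming a precomputed (novel x seen) mis-labelling probability matrix; the function name, inputs, and shapes are illustrative, not the authors' code:

```python
def topk_similar_seen(mislabel_prob, k=3):
    """For each novel object (a row of probabilities over seen classes),
    return the indices of the k seen objects it is most often mis-labelled
    as. Illustrative sketch of the MLS selection with the paper's k = 3.
    `mislabel_prob` is a list of rows, one row per novel object."""
    return [
        sorted(range(len(row)), key=lambda j: -row[j])[:k]
        for row in mislabel_prob
    ]

# Toy example: 2 hypothetical novel objects vs. 4 seen classes.
p = [[0.1, 0.6, 0.2, 0.1],
     [0.4, 0.1, 0.1, 0.3]]
print(topk_similar_seen(p, k=2))  # → [[1, 2], [0, 3]]
```

The selected seen-object indices would then drive the replacement step the row describes: swapping seen-object RoI features with novel-object features drawn from the pool of 100 extracted features.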