Show, Interpret and Tell: Entity-Aware Contextualised Image Captioning in Wikipedia

Authors: Khanh Nguyen, Ali Furkan Biten, Andres Mafla, Lluis Gomez, Dimosthenis Karatzas

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments: In our experimentation for Wikipedia Captioning, we utilize the WIT (Srinivasan et al. 2021) dataset. All the details regarding the dataset statistics, pre-processing and implementation, as well as an in-depth explanation of our baselines, can be found in the Appendix. ... Wikipedia Captioning: Table 1 showcases the results for Wikipedia Captioning in BLEU-4 (Papineni et al. 2002), METEOR (Banerjee and Lavie 2005), ROUGE-L (Lin 2004), CIDEr (Vedantam, Lawrence Zitnick, and Parikh 2015), and SPICE (Anderson et al. 2016), as well as the precision and recall of NE insertion as defined by Biten et al. (2019)." (A sketch of the NE-insertion metric appears after the table.)
Researcher Affiliation | Academia | Computer Vision Center, Universitat Autònoma de Barcelona, Barcelona, Spain
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | "The code, models and data splits are publicly available at https://github.com/khanhnguyen21006/wikipedia_captioning"
Open Datasets | Yes | "In our experimentation for Wikipedia Captioning, we utilize the WIT (Srinivasan et al. 2021) dataset." (A sketch of loading a WIT shard appears after the table.)
Dataset Splits | No | "All the details regarding the dataset statistics, pre-processing and implementation, as well as an in-depth explanation of our baselines, can be found in the Appendix."
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU or CPU models) used to run the experiments.
Software Dependencies | No | The paper mentions various models and frameworks (e.g., ResNet-152, RoBERTa, GPT-2, T5, BERT, CLIP) but does not give version numbers for software dependencies or libraries.
Experiment Setup | No | The paper states that all details regarding dataset statistics, pre-processing, implementation and baselines can be found in the appendix, indicating that specific experimental-setup details such as hyperparameter values are deferred to the appendix rather than reported in the main text.
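
The NE-insertion precision and recall cited in the Research Type row compare the named entities appearing in a generated caption against those in the reference caption. The snippet below is a minimal, hypothetical sketch of that comparison using spaCy for entity extraction; the exact entity matching and aggregation rules of Biten et al. (2019) may differ from this illustration.

```python
# Hypothetical sketch of the NE-insertion metric, in the spirit of
# Biten et al. (2019): entities are extracted from each caption and
# compared as sets. Matching and aggregation details are assumptions.
import spacy

nlp = spacy.load("en_core_web_sm")  # any English NER model works here


def caption_entities(text):
    """Return the set of lower-cased named-entity surface forms in a caption."""
    return {ent.text.lower() for ent in nlp(text).ents}


def ne_precision_recall(generated, references):
    """Corpus-level precision/recall of named entities in generated captions."""
    tp = pred_total = ref_total = 0
    for gen, ref in zip(generated, references):
        gen_ents = caption_entities(gen)
        ref_ents = caption_entities(ref)
        tp += len(gen_ents & ref_ents)   # entities correctly inserted
        pred_total += len(gen_ents)      # entities the model produced
        ref_total += len(ref_ents)       # entities in the reference caption
    precision = tp / pred_total if pred_total else 0.0
    recall = tp / ref_total if ref_total else 0.0
    return precision, recall
```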
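
For the Open Datasets row: WIT is publicly released as gzipped TSV shards, while the splits used in the paper are provided in the authors' repository. The sketch below only illustrates inspecting one shard with pandas; the shard file name is a placeholder, and the column names follow the public WIT release schema (treat them as assumptions if your copy differs).

```python
# Minimal sketch of inspecting a WIT shard with pandas. The file name is a
# placeholder, and the column names follow the public WIT release schema;
# the splits used in the paper come from the authors' repository.
import pandas as pd

shard = "wit_v1.train.example-shard.tsv.gz"  # hypothetical local file name

df = pd.read_csv(shard, sep="\t", compression="gzip")

# English rows and the fields typically used for contextualised captioning:
# the image URL, the page-level context, and the reference caption.
en = df[df["language"] == "en"]
print(en[["image_url", "context_page_description",
          "caption_reference_description"]].head())
```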