EViLBERT: Learning Task-Agnostic Multimodal Sense Embeddings
Authors: Agostina Calabrese, Michele Bevilacqua, Roberto Navigli
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments are organised in two main blocks. The first focuses on the evaluation of our proposed approach for the automatic verification of concept-image associations in both the concrete and non-concrete domains (Section 4.1). The second set of experiments, instead, assesses the effectiveness of our multimodal concept embeddings by evaluating them in the Word Sense Disambiguation task (Section 4.2). |
| Researcher Affiliation | Academia | Agostina Calabrese, Michele Bevilacqua and Roberto Navigli, Sapienza NLP Group, Department of Computer Science, Sapienza University of Rome {calabrese.a, bevilacqua, navigli}@di.uniroma1.it |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | We release code, dataset and embeddings at http://babelpic.org. |
| Open Datasets | Yes | To address this issue we start from BabelPic [Calabrese et al., 2020], which includes manually annotated concept-image pairs. ... Our gold dataset includes 2,733 synsets and 14,931 images. ... Our silver dataset includes 42,579 synsets and 257,499 images. ... Additionally, the paper references standard datasets like ImageNet [Deng et al., 2009], COCO [Lin et al., 2014], Flickr30k Entities [Plummer et al., 2015], Open Images [Kuznetsova et al., 2020], the VQA 2.0 dataset [Goyal et al., 2017], Conceptual Captions (CC), Visual Genome [Krishna et al., 2017; Anderson et al., 2018], the SemCor corpus, and SemEval-2015 [Moro and Navigli, 2015]. |
| Dataset Splits | Yes | We perform the splitting of the dataset according to the 80%/10%/10% rule, hence defining training, validation and test sets. ... Validation 10.18 1.98 37.84 (Table 1) |
| Hardware Specification | No | The paper does not provide specific details on the hardware used for the experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions software components like BERT, VLP, and Faster R-CNN, but does not specify their version numbers or other software dependencies with versions. |
| Experiment Setup | Yes | When training the VLP architecture on our gold dataset, we keep the same setting as in the original paper. That is, we set the number of both hidden layers and attention heads of the BERT encoder to 12. We train the model for 20 epochs with a learning rate of 2×10⁻⁵ and a dropout rate of 0.1, selecting the weights of the best epoch, i.e. the one achieving the highest F1 score on the validation set. ... We train the system on the SemCor corpus for a maximum of 10 epochs, with the Adam optimizer and a learning rate of 10⁻⁴, feeding the input in batches of 250 instances. |
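The 80%/10%/10% train/validation/test rule reported in the Dataset Splits row can be sketched as follows. This is a minimal illustration only: the paper does not specify a random seed or whether the split is performed per synset or per image, so the seeded shuffle and the item granularity here are assumptions.

```python
import random

def split_dataset(items, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle items deterministically and split 80/10/10 (assumed seed)."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n = len(items)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test

# Example with 1,000 placeholder item IDs (not the paper's actual data).
train_set, val_set, test_set = split_dataset(range(1000))
print(len(train_set), len(val_set), len(test_set))  # 800 100 100
```

With 1,000 items this yields 800/100/100; with real data the remainder simply falls into the test portion, keeping the three sets disjoint.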