MuMuQA: Multimedia Multi-Hop News Question Answering via Cross-Media Knowledge Extraction and Grounding
Authors: Revanth Gangi Reddy, Xilin Rui, Manling Li, Xudong Lin, Haoyang Wen, Jaemin Cho, Lifu Huang, Mohit Bansal, Avirup Sil, Shih-Fu Chang, Alexander Schwing, Heng Ji
AAAI 2022, pages 11200-11208 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate both pipeline-based and end-to-end pretraining-based multimedia QA models on our benchmark, and show that they achieve promising performance, while considerably lagging behind human performance, hence leaving large room for future work on this challenging new task. |
| Researcher Affiliation | Collaboration | Revanth Gangi Reddy (1), Xilin Rui (2), Manling Li (1), Xudong Lin (3), Haoyang Wen (1), Jaemin Cho (4), Lifu Huang (5), Mohit Bansal (4), Avirup Sil (6), Shih-Fu Chang (3), Alexander Schwing (1), Heng Ji (1); affiliations: (1) University of Illinois Urbana-Champaign, (2) Tsinghua University, (3) Columbia University, (4) University of North Carolina at Chapel Hill, (5) Virginia Tech, (6) IBM Research AI |
| Pseudocode | No | The paper describes algorithms and pipelines in text and diagrams but does not provide formal pseudocode blocks. |
| Open Source Code | Yes | All the datasets, programs and tools will be made publicly available here: https://github.com/uiucnlp/MuMuQA |
| Open Datasets | Yes | All the datasets, programs and tools will be made publicly available here: https://github.com/uiucnlp/MuMuQA |
| Dataset Splits | Yes | The evaluation benchmark contains 1384 human-annotated instances, with 263 instances in the development set and the remaining in the test set. |
| Hardware Specification | No | The paper does not specify any hardware details like GPU models, CPU types, or cloud instance specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions several software components like BART, OSCAR, BERT, Faster R-CNN, and fastText, but it does not provide specific version numbers for these dependencies. |
| Experiment Setup | No | The paper mentions training models on certain datasets and using pre-trained models (e.g., 'We start with the pre-trained model provided by Li et al. (2020c)'), but it does not specify concrete hyperparameters like learning rates, batch sizes, or the number of epochs for training the models. |