Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Inferential Knowledge-Enhanced Integrated Reasoning for Video Question Answering
Authors: Jianguo Mao, Wenbin Jiang, Hong Liu, Xiangdong Wang, Yajuan Lyu
AAAI 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that our method achieves significant improvement on two mainstream datasets. The ablation study further demonstrates the effectiveness of each component of our approach. |
| Researcher Affiliation | Collaboration | 1 Beijing Key Laboratory of Mobile Computing and Pervasive Device, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China 2 University of Chinese Academy of Sciences, Beijing, China 3 Baidu Inc., Beijing, China |
| Pseudocode | No | The paper does not contain pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link indicating the availability of open-source code for the described methodology. |
| Open Datasets | Yes | We evaluate our method on two video question answering datasets that contains character dialogues. We use standard Train/Val/Test-public splits and accuracy to measure the performance. Datasets: TVQA: TVQA is a widely used multi-choice video question answering dataset. KnowIT VQA: KnowIT VQA is another popular multi-choice video question answering dataset. |
| Dataset Splits | Yes | We use standard Train/Val/Test-public splits and accuracy to measure the performance. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments. |
| Software Dependencies | No | The paper mentions software components like 'ResNet-101', 'BERT', 'AdamW optimizer', and 'GPT' but does not specify their version numbers for reproducibility. |
| Experiment Setup | Yes | We set batch size as 16 and use AdamW optimizer with an initial learning rate of 0.00005. About the Inferential Knowledge Reasoner, the model parameters are initialized with the pre-trained parameters from the GPT. We set batch size as 128, and use AdamW optimizer with an initial learning rate of 0.00005, and set beam search size as 5, and set the maximum decoding step as 35. |
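
The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration sketch, which may help when attempting a reimplementation. This is only an illustration: the class and variable names (`TrainConfig`, `DecodeConfig`, `QA_MODEL_TRAIN`, and so on) are hypothetical, and attributing the first set of values to the main question answering model is an assumption, since the quoted text does not name the component they apply to. Only the numeric values and the optimizer name come from the row above.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class TrainConfig:
    """Training settings quoted in the Experiment Setup row."""
    batch_size: int
    learning_rate: float
    optimizer: str = "AdamW"


@dataclass(frozen=True)
class DecodeConfig:
    """Beam search settings quoted for the Inferential Knowledge Reasoner."""
    beam_size: int
    max_decode_steps: int


# Assumption: the first quoted settings belong to the main QA model;
# the source text does not say which component they train.
QA_MODEL_TRAIN = TrainConfig(batch_size=16, learning_rate=5e-5)

# Settings the source attributes to the Inferential Knowledge Reasoner.
REASONER_TRAIN = TrainConfig(batch_size=128, learning_rate=5e-5)
REASONER_DECODE = DecodeConfig(beam_size=5, max_decode_steps=35)


def describe(name: str, cfg) -> str:
    """Render one config as a single 'name: key=value, ...' line."""
    fields = ", ".join(f"{key}={value}" for key, value in asdict(cfg).items())
    return f"{name}: {fields}"


if __name__ == "__main__":
    print(describe("qa_model_train", QA_MODEL_TRAIN))
    print(describe("reasoner_train", REASONER_TRAIN))
    print(describe("reasoner_decode", REASONER_DECODE))
```

Settings the table marks as unreported (hardware, library versions) are deliberately left out of the sketch rather than guessed.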