DramaQA: Character-Centered Video Story Understanding with Hierarchical QA

Authors: Seongho Choi, Kyoung-Woon On, Yu-Jung Heo, Ahjeong Seo, Youwon Jang, Minsu Lee, Byoung-Tak Zhang (pp. 1166-1174)

AAAI 2021

Reproducibility Variable Result LLM Response
Research Type Experimental Here, we discuss an ablation study to analyze the model's characteristics profoundly. Table 2 shows the quantitative results of the ablation study for our model, and we described our experimental settings and implementation details in the Appendix C. QA Similarity is a simple baseline model designed to choose the highest score on the cosine similarity between the average of question's word embeddings and the average of candidate answer's word embeddings. The overall test accuracy of Our(Full) was 71.14% but the performance of each difficulty level varies.
Researcher Affiliation Academia Seongho Choi [1], Kyoung-Woon On [1], Yu-Jung Heo [1], Ahjeong Seo [1], Youwon Jang [1], Minsu Lee [1], Byoung-Tak Zhang [1, 2]; [1] Seoul National University; [2] AI Institute (AIIS); {shchoi,kwon,yjheo,ajseo,ywjang,mslee,btzhang}@bi.snu.ac.kr
Pseudocode No The paper describes the model architecture and mathematical formulations (e.g., equations 1-4) but does not include a distinct block labeled "Pseudocode" or "Algorithm" with structured steps.
Open Source Code Yes We release our dataset and model publicly for research purposes [footnote 2], and we expect our work to provide a new perspective on video story understanding research. [footnote 2] https://dramaqa.snu.ac.kr
Open Datasets Yes Our dataset is built upon the TV drama Another Miss Oh [footnote 1] and it contains 17,983 QA pairs from 23,928 various length video clips... We provide 217,308 annotated images with rich character-centered annotations... We release our dataset and model publicly for research purposes [footnote 2], and we expect our work to provide a new perspective on video story understanding research. [footnote 2] https://dramaqa.snu.ac.kr
Dataset Splits No The paper mentions "Table 2 shows the quantitative results of the ablation study for our model, and we described our experimental settings and implementation details in the Appendix C." While a test split is clearly indicated, the main text does not specify details about training and validation splits (e.g., percentages or counts), referring instead to an appendix not provided.
Hardware Specification No The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used to conduct the experiments. It refers to "experimental settings and implementation details in the Appendix C" but Appendix C is not available in the provided text.
Software Dependencies No The paper does not provide specific version numbers for any software dependencies (e.g., programming languages, libraries, frameworks) used in the experiments. It refers to "experimental settings and implementation details in the Appendix C" but Appendix C is not available in the provided text.
Experiment Setup No The paper states, "we described our experimental settings and implementation details in the Appendix C." However, these details, such as hyperparameter values, model initialization, or specific training configurations, are not present in the main text of the paper.
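
Illustrative sketch: the QA Similarity baseline quoted in the Research Type row picks the candidate answer whose averaged word embedding has the highest cosine similarity with the averaged word embedding of the question. The snippet below is a minimal sketch of that idea, not the authors' implementation. The function names, the whitespace tokenization, the handling of unknown words, and the toy 2-dimensional embedding table are all assumptions made for illustration; the quoted text does not say which word embeddings or preprocessing the paper used.

```python
import math
from typing import Dict, List, Sequence

Vector = List[float]


def average_embedding(tokens: Sequence[str],
                      embeddings: Dict[str, Vector],
                      dim: int) -> Vector:
    """Average the embeddings of the known tokens (zero vector if none are known)."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity, defined as 0.0 when either vector has zero norm."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (norm_a * norm_b)


def qa_similarity_choice(question: str,
                         candidates: Sequence[str],
                         embeddings: Dict[str, Vector],
                         dim: int) -> int:
    """Return the index of the candidate answer most similar to the question."""
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    question_vec = average_embedding(question.lower().split(), embeddings, dim)
    scores = [
        cosine_similarity(
            question_vec,
            average_embedding(candidate.lower().split(), embeddings, dim),
        )
        for candidate in candidates
    ]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical toy embeddings, for illustration only.
TOY_EMBEDDINGS: Dict[str, Vector] = {
    "who": [1.0, 0.0],
    "smiles": [0.8, 0.2],
    "woman": [0.9, 0.1],
    "car": [0.0, 1.0],
    "road": [0.1, 0.9],
}

chosen = qa_similarity_choice("who smiles", ["woman", "car road"], TOY_EMBEDDINGS, dim=2)
print(chosen)
```

Because this baseline never looks at the video, its accuracy indicates how much of the task can be answered from question and answer wording alone, which is presumably why it serves as a reference point in the ablation study.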