MULTISCRIPT: Multimodal Script Learning for Supporting Open Domain Everyday Tasks

Authors: Jingyuan Qi, Minqian Liu, Ying Shen, Zhiyang Xu, Lifu Huang

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results show that our proposed approaches significantly improve over the competitive baselines." and "Table 2: Automatic evaluation results on multimodal script generation and subsequent step prediction tasks."
Researcher Affiliation | Academia | "Department of Computer Science, Virginia Tech"
Pseudocode | No | The paper describes its methods in prose and with diagrams (Figure 2), but it does not include any pseudocode or algorithm blocks.
Open Source Code | Yes | "The codes, model checkpoints, and datasets are publicly available at https://github.com/VT-NLP/MultiScript."
Open Datasets | Yes | "Built from WikiHow, MULTISCRIPT covers multimodal scripts in videos and text descriptions for over 6,655 human everyday tasks across 19 diverse domains." and "The codes, model checkpoints, and datasets are publicly available at https://github.com/VT-NLP/MultiScript."
Dataset Splits | Yes | "We split the instances created for each task into training, development, and test sets. For each task, to ensure the coverage of various domains in each set, we randomly sample 80%, 5%, and 15% articles from each domain, and use the instances created from them as the training, development, and test sets." (A hedged sketch of such a per-domain split appears below the table.)
Hardware Specification | No | The paper does not provide specific hardware details, such as the GPU or CPU models used to run the experiments.
Software Dependencies | No | The paper mentions software such as UniVL, the OFA model, Katna, Vicuna, and DeBERTa, and names some model checkpoints (OFA-Sys/ofa-base, nli-deberta-v3-base), but it does not give version numbers for the underlying libraries or tools (e.g., PyTorch 1.9, TensorFlow 2.x). (A hedged checkpoint-loading sketch appears below the table.)
Experiment Setup | No | The paper describes the overall framework and the models used (e.g., Katna, OFA, UniVL, DeBERTa, Vicuna) and how they interact, but it does not specify concrete experimental setup details such as learning rates, batch sizes, number of epochs, or optimizer configurations.
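
The Dataset Splits row quotes a per-domain 80% / 5% / 15% sampling of articles into training, development, and test sets. The snippet below is a minimal sketch of such a stratified split, not the authors' released code; the list-of-dicts schema, the `domain` field name, and the fixed seed are assumptions made for illustration.

```python
# Minimal sketch (not the authors' code) of the split described in the paper:
# within each domain, articles are shuffled and divided 80% / 5% / 15% into
# train / dev / test, so every split covers all domains.
import random
from collections import defaultdict

def split_by_domain(articles, seed=42, ratios=(0.80, 0.05, 0.15)):
    """articles: list of dicts, each with a 'domain' key (assumed schema)."""
    rng = random.Random(seed)
    by_domain = defaultdict(list)
    for article in articles:
        by_domain[article["domain"]].append(article)

    train, dev, test = [], [], []
    for domain_articles in by_domain.values():
        rng.shuffle(domain_articles)
        n = len(domain_articles)
        n_train = int(ratios[0] * n)
        n_dev = int(ratios[1] * n)
        train += domain_articles[:n_train]
        dev += domain_articles[n_train:n_train + n_dev]
        test += domain_articles[n_train + n_dev:]
    return train, dev, test
```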
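
The Software Dependencies row notes that checkpoints such as nli-deberta-v3-base are named without pinned library versions. As a hedged illustration only, the snippet below loads that checkpoint as a cross-encoder; the use of the sentence-transformers library and the "cross-encoder/" hub prefix are assumptions, not details reported in the paper.

```python
# Hedged illustration, not the authors' code: load the NLI checkpoint named in
# the paper via sentence-transformers (library choice and "cross-encoder/"
# hub prefix are assumptions).
from sentence_transformers import CrossEncoder

nli_model = CrossEncoder("cross-encoder/nli-deberta-v3-base")

# Score a premise/hypothesis pair of step descriptions
# (logits over contradiction / entailment / neutral).
scores = nli_model.predict([("Cut the apple into thin slices.",
                             "The apple is sliced.")])
print(scores)
```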