DDCoT: Duty-Distinct Chain-of-Thought Prompting for Multimodal Reasoning in Language Models

Authors: Ge Zheng, Bin Yang, Jiajin Tang, Hong-Yu Zhou, Sibei Yang

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "The rationales generated by DDCoT not only improve the reasoning abilities of both large and small language models in zero-shot prompting and fine-tuning learning, significantly outperforming state-of-the-art methods, but also exhibit impressive generalizability and explainability."
Researcher Affiliation | Academia | ShanghaiTech University; The University of Hong Kong
Pseudocode | No | The paper describes the steps of DDCoT prompting but does not provide a formal pseudocode or algorithm block (a hedged sketch of the described pipeline follows the table).
Open Source Code | No | The paper mentions a project page (https://toneyaya.github.io/ddcot/) but does not explicitly state that the source code for the methodology is available there, nor does it provide a direct link to a code repository.
Open Datasets | Yes | "ScienceQA benchmark [31] is the first multimodal science question-answer dataset comprising 21,000 questions with multiple choices and images."
Dataset Splits | Yes | "Following previous works [71, 31], we divide ScienceQA into training, validation, and test sets, which contain 12,726, 4,241, and 4,241 examples, respectively."
Hardware Specification | Yes | "All experiments are implemented by PyTorch [39] and Hugging Face [61] and conducted on NVIDIA Tesla A40 GPUs."
Software Dependencies | No | The paper states that "all experiments are implemented by PyTorch [39] and Hugging Face [61]" but provides no version numbers for either dependency.
Experiment Setup | Yes | "We train our model for 30 epochs with a learning rate of 1e-4 and batch size of 16." (See the fine-tuning sketch below.)
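Since the paper provides no formal algorithm block, the following is a rough, non-authoritative sketch of the duty-distinct prompting loop as the Pseudocode row summarizes it: an LLM decomposes the question into sub-questions, marks visually dependent ones as uncertain, a VQA model fills those in, and the LLM integrates everything into a rationale. The function names (`llm`, `vqa`), prompt wording, and control flow here are all assumptions for illustration, not the authors' released code.

```python
# Hypothetical reconstruction of a duty-distinct chain-of-thought loop.
# `llm` and `vqa` are placeholder callables, not APIs from the paper.

def llm(prompt: str) -> str:
    """Stand-in for a large language model call (e.g., via an API)."""
    raise NotImplementedError

def vqa(image, question: str) -> str:
    """Stand-in for a visual question answering model."""
    raise NotImplementedError

def ddcot_answer(question: str, choices: list[str], image) -> str:
    # Step 1: ask the LLM to decompose the question into sub-questions,
    # answering each one and emitting "Uncertain" where visual input is needed.
    decomposition = llm(
        f"Question: {question}\nChoices: {choices}\n"
        "Break the question into sub-questions and answer each. "
        "If a sub-question needs information from the image, answer 'Uncertain'."
    )

    # Step 2: route every uncertain sub-question to the visual model.
    sub_answers = []
    for line in decomposition.splitlines():
        if "Uncertain" in line and "?" in line:
            sub_q = line.split("?")[0] + "?"
            sub_answers.append(f"{sub_q} {vqa(image, sub_q)}")
        else:
            sub_answers.append(line)

    # Step 3: let the LLM integrate sub-answers into a rationale and final answer.
    return llm(
        f"Question: {question}\nChoices: {choices}\n"
        "Sub-question findings:\n" + "\n".join(sub_answers) +
        "\nUse these findings to reason step by step and pick one choice."
    )
```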
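For the fine-tuning rows, the only configuration the report quotes is 30 epochs, a learning rate of 1e-4, a batch size of 16, and the 12,726/4,241/4,241 ScienceQA splits, implemented with PyTorch and Hugging Face. Below is a minimal training skeleton consistent with those numbers; the model, optimizer, and dataset loading are assumptions, since the quoted excerpts do not name them.

```python
# Minimal PyTorch fine-tuning skeleton matching the quoted hyperparameters.
# The model and dataset objects are placeholders; only the epoch count,
# learning rate, batch size, and split sizes come from the paper's quotes.
import torch
from torch.utils.data import DataLoader, Dataset

class ScienceQADataset(Dataset):
    """Placeholder: the paper's splits hold 12,726 / 4,241 / 4,241 examples."""
    def __init__(self, split: str):
        self.examples = []  # populate with (input, target) pairs for the split

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, idx):
        return self.examples[idx]

model = torch.nn.Linear(512, 4)  # stand-in for the actual fine-tuned model
train_loader = DataLoader(ScienceQADataset("train"), batch_size=16, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)  # optimizer is assumed

for epoch in range(30):  # "30 epochs" per the quoted setup
    for inputs, targets in train_loader:
        optimizer.zero_grad()
        loss = torch.nn.functional.cross_entropy(model(inputs), targets)
        loss.backward()
        optimizer.step()
```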