Keep Skills in Mind: Understanding and Implementing Skills in Commonsense Question Answering
Authors: Meikai Bao, Qi Liu, Kai Zhang, Ye Liu, Linan Yue, Longfei Li, Jun Zhou
IJCAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on two publicly available CQA datasets show the effectiveness of our proposed model and the considerable impact of introducing skills. |
| Researcher Affiliation | Collaboration | Meikai Bao (1,2), Qi Liu (1,2), Kai Zhang (1,2), Ye Liu (1,2), Linan Yue (1,2), Longfei Li (3), Jun Zhou (3). Affiliations: (1) Anhui Province Key Laboratory of Big Data Analysis and Application, University of Science and Technology of China; (2) State Key Laboratory of Cognitive Intelligence; (3) Ant Financial Services Group |
| Pseudocode | No | The paper describes the model's architecture and processes through textual descriptions and diagrams (Figure 2), but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is at https://github.com/BAOOOOOM/DSCQA. |
| Open Datasets | Yes | We use two widely-used commonsense datasets, i.e., Commonsense QA (CSQA [Talmor et al., 2019]) and Commonsense QA 2.0 (CSQA2 [Talmor et al., 2021]), as benchmarks. |
| Dataset Splits | Yes | Table 2 lists skills and their frequency per split (an example may involve more than one skill), reported for the CSQA2 train/dev/test splits and the CSQA train/dev splits; for example, the causality skill appears in 5.71% / 5.63% / 6.47% of CSQA2 train/dev/test examples and in 4.39% / 3.85% of CSQA train/dev examples... |
| Hardware Specification | No | The paper mentions using 'T5-large' as the backbone model but does not specify the hardware (e.g., specific GPU or CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions using AdamW as the optimizer, a fine-tuned Sentence-T5 as the context encoder, and OpenPrompt as a framework, but does not provide specific version numbers for these software dependencies or for underlying libraries such as Python, PyTorch, or TensorFlow. |
| Experiment Setup | Yes | We use AdamW [Loshchilov and Hutter, 2019] as the optimizer and set the learning rate to 1e-5. We set the maximum length of the model input to 64. For general prefixes, the prefix length is set to 100, and its dropout rate is set to 0.5. The number of attention heads is set to 12 for question-skill attention and 8 for skill-question attention. |
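
To make the reported setup easier to reuse, below is a minimal configuration sketch that collects the hyperparameters quoted in the Experiment Setup row (AdamW with learning rate 1e-5, maximum input length 64, prefix length 100, prefix dropout 0.5, and 12/8 attention heads). The class and field names (`DSCQAConfig`, `build_optimizer`, etc.) are our own labels, not identifiers from the authors' repository.

```python
from dataclasses import dataclass

import torch
from torch.optim import AdamW


@dataclass
class DSCQAConfig:
    """Hyperparameters reported in the paper; field names are illustrative."""

    backbone: str = "t5-large"       # backbone model named in the paper
    learning_rate: float = 1e-5      # AdamW learning rate
    max_input_length: int = 64       # maximum model input length
    prefix_length: int = 100         # general prefix length
    prefix_dropout: float = 0.5      # dropout rate on the general prefix
    question_skill_heads: int = 12   # heads for question-skill attention
    skill_question_heads: int = 8    # heads for skill-question attention


def build_optimizer(model: torch.nn.Module, cfg: DSCQAConfig) -> AdamW:
    """Attach AdamW with the reported learning rate to a model's parameters."""
    return AdamW(model.parameters(), lr=cfg.learning_rate)


if __name__ == "__main__":
    cfg = DSCQAConfig()
    # Placeholder module standing in for the prompt-tuned T5-large backbone.
    dummy_model = torch.nn.Linear(8, 8)
    optimizer = build_optimizer(dummy_model, cfg)
    print(cfg, optimizer.defaults["lr"])
```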
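The paper also specifies separate head counts for question-skill attention (12) and skill-question attention (8). The sketch below shows one plausible reading of that bidirectional cross-attention using standard `nn.MultiheadAttention`; the hidden size of 768, the module name, and the overall wiring are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn


class QuestionSkillInteraction(nn.Module):
    """Bidirectional cross-attention between question and skill representations.

    Head counts follow the paper (12 for question-skill, 8 for skill-question);
    the hidden size and wiring are illustrative assumptions.
    """

    def __init__(self, hidden_size: int = 768) -> None:
        super().__init__()
        # Question tokens attend to skill representations with 12 heads.
        self.question_skill_attn = nn.MultiheadAttention(
            hidden_size, num_heads=12, batch_first=True
        )
        # Skill representations attend to the question with 8 heads.
        self.skill_question_attn = nn.MultiheadAttention(
            hidden_size, num_heads=8, batch_first=True
        )

    def forward(
        self, question: torch.Tensor, skills: torch.Tensor
    ) -> tuple[torch.Tensor, torch.Tensor]:
        q_enriched, _ = self.question_skill_attn(question, skills, skills)
        s_enriched, _ = self.skill_question_attn(skills, question, question)
        return q_enriched, s_enriched


if __name__ == "__main__":
    module = QuestionSkillInteraction()
    question = torch.randn(2, 64, 768)  # (batch, question tokens, hidden)
    skills = torch.randn(2, 7, 768)     # (batch, candidate skills, hidden)
    q_out, s_out = module(question, skills)
    print(q_out.shape, s_out.shape)
```

Note that the questions are padded or truncated to the 64-token maximum input length quoted above, which is why the toy question tensor uses 64 positions.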