Fact-Driven Logical Reasoning for Machine Reading Comprehension

Authors: Siru Ouyang, Zhuosheng Zhang, Hai Zhao

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results on logical reasoning benchmarks and dialogue modeling datasets show that our approach improves the baselines substantially, and it is general across backbone models." "We conducted the experiments on three datasets."
Researcher Affiliation | Academia | "1 Department of Computer Science, University of Illinois Urbana-Champaign; 2 School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University; 3 Department of Computer Science and Engineering, Shanghai Jiao Tong University; 4 Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai Jiao Tong University"
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | "Code is available at https://github.com/ozyyshr/FocalReasoner."
Open Datasets | Yes | "We conducted the experiments on three datasets. Two for specialized logical reasoning ability testing: ReClor (Yu et al. 2020) and LogiQA (Liu et al. 2020), and one for logical reasoning in dialogues: MuTual (Cui et al. 2020)."
Dataset Splits | Yes | Tables 2 and 3 report results per split: the ReClor Dev and Test sets (with Test-E and Test-H subsets), the LogiQA Dev and Test sets, and the MuTual Dev and Test sets (R4@1, R4@2, MRR).
Hardware Specification | Yes | "The model is trained for 10 epochs with a total batch size of 16 and an overall dropout rate of 0.1 on 4 NVIDIA Tesla V100 GPUs."
Software Dependencies | No | The paper mentions "DGL, an open-source lib of python" and "spaCy (Honnibal and Montani 2017)" but does not provide specific version numbers for these software dependencies.
Experiment Setup | Yes | "The model is end-to-end trained and updated by the Adam (Kingma and Ba 2015) optimizer with an overall learning rate of 8e-6 for ReClor and LogiQA, and 4e-6 for MuTual. The weight decay is 0.01. We set the warm-up proportion during training to 0.1. Graph encoders are implemented using DGL, an open-source lib of python. The layer number of the graph encoder is 2 for ReClor and 3 for LogiQA. The maximum sequence length is 256 for LogiQA and MuTual, and 384 for ReClor. The model is trained for 10 epochs with a total batch size of 16 and an overall dropout rate of 0.1 on 4 NVIDIA Tesla V100 GPUs."
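
The row above lists the reported hyperparameters as prose. Below is a minimal PyTorch sketch of how that optimizer setup could be wired together for reproduction purposes. It is not the authors' released code: the function name, the `steps_per_epoch` placeholder, and the linear warmup/decay schedule (the paper states only a 0.1 warm-up proportion) are assumptions; the numeric values come from the quoted setup.

```python
# Sketch of the reported training configuration (assumed structure, reported values).
import torch
from torch.optim import Adam
from torch.optim.lr_scheduler import LambdaLR

# Hyperparameters reported in the paper's experiment setup.
EPOCHS = 10
BATCH_SIZE = 16                      # total batch size across 4 V100 GPUs
DROPOUT = 0.1
WEIGHT_DECAY = 0.01
WARMUP_PROPORTION = 0.1
LR = {"reclor": 8e-6, "logiqa": 8e-6, "mutual": 4e-6}
MAX_SEQ_LEN = {"reclor": 384, "logiqa": 256, "mutual": 256}
GRAPH_ENCODER_LAYERS = {"reclor": 2, "logiqa": 3}

def build_optimizer_and_scheduler(model, dataset="reclor", steps_per_epoch=1000):
    """Adam with weight decay plus an assumed linear warmup/decay schedule."""
    total_steps = EPOCHS * steps_per_epoch
    warmup_steps = int(WARMUP_PROPORTION * total_steps)
    optimizer = Adam(model.parameters(), lr=LR[dataset], weight_decay=WEIGHT_DECAY)

    def lr_lambda(step):
        # Linear warmup for the first 10% of steps, then linear decay to zero
        # (the decay shape is an assumption; the paper specifies only the warm-up).
        if step < warmup_steps:
            return step / max(1, warmup_steps)
        return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

    scheduler = LambdaLR(optimizer, lr_lambda)
    return optimizer, scheduler
```

Usage would look like `optimizer, scheduler = build_optimizer_and_scheduler(model, "reclor", steps_per_epoch=len(train_loader))`, stepping the scheduler once per optimizer update.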