ReClor: A Reading Comprehension Dataset Requiring Logical Reasoning

Authors: Weihao Yu, Zihang Jiang, Yanfei Dong, Jiashi Feng

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical results show that state-of-the-art models have an outstanding ability to capture biases contained in the dataset, reaching high accuracy on the EASY set. However, they struggle on the HARD set, performing near random guessing, indicating that more research is needed to substantially strengthen the logical reasoning ability of current models.
Researcher Affiliation | Academia | Weihao Yu, Zihang Jiang, Yanfei Dong & Jiashi Feng, National University of Singapore. weihaoyu6@gmail.com, {jzihang, dyanfei}@u.nus.edu, elefjia@nus.edu.sg
Pseudocode | No | The paper describes algorithms and models, but it does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper provides a 'Project page' link (http://whyu.me/reclor/) but does not explicitly state that the source code for the described methodology is available there, nor does it provide a direct repository link.
Open Datasets | Yes | ReClor is available for non-commercial research purposes only. We are also hosting a public evaluation server on EvalAI (Yadav et al., 2019) to benchmark progress on ReClor.
Dataset Splits | Yes | They are divided into training, validation, and test sets with 4,638, 500, and 1,000 data points, respectively. (A loading sketch follows the table.)
Hardware Specification | No | The paper mentions 'computational resources' supported by a program, but it does not provide specific hardware details such as GPU or CPU models or memory specifications.
Software Dependencies | No | The paper mentions software such as the fastText Python library, Bi-LSTM, GloVe word embeddings, Hugging Face Transformers, and Adam, but it does not provide version numbers for these components. (A version-pinning sketch follows the table.)
Experiment Setup | Yes | We use a batch size of 24 and fine-tune for 10 epochs. The maximum input sequence length for all models is 256. The detailed hyperparameters are shown in Table 9. (A fine-tuning sketch follows the table.)
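
Companion to the Dataset Splits row: a minimal Python sketch for loading the ReClor JSON splits and checking the reported counts. The file names and the example schema are assumptions about the release format, not details stated above.

import json

# Assumed split files and the sizes reported in the paper.
SPLIT_SIZES = {"train.json": 4638, "val.json": 500, "test.json": 1000}

for filename, expected in SPLIT_SIZES.items():
    with open(filename) as f:
        examples = json.load(f)
    # Verify the split matches the reported number of data points.
    assert len(examples) == expected, f"{filename}: got {len(examples)}"
    print(f"{filename}: {len(examples)} examples")

# Peek at one training example; the field names are hypothetical.
with open("train.json") as f:
    sample = json.load(f)[0]
print(sample.get("question"), sample.get("answers"))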
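
Companion to the Software Dependencies row: since the paper pins no versions, a reproduction has to record its own. A small sketch that logs installed versions; mapping the named tools to the PyPI packages torch, transformers, and fasttext is an assumption.

import importlib.metadata as md

# Record exact versions of the (assumed) packages behind the tools the
# paper names, so a rerun can pin the same environment.
for pkg in ("torch", "transformers", "fasttext"):
    try:
        print(f"{pkg}=={md.version(pkg)}")
    except md.PackageNotFoundError:
        print(f"{pkg}: not installed")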
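
Companion to the Experiment Setup row: a hedged Hugging Face Transformers sketch that wires the stated values (batch size 24, 10 epochs, maximum sequence length 256) into a four-way multiple-choice fine-tuning configuration. The checkpoint name, learning rate, and (context + question, option) input pairing are illustrative assumptions; the paper's full hyperparameters are in its Table 9.

from transformers import (AutoModelForMultipleChoice, AutoTokenizer,
                          TrainingArguments)

MODEL_NAME = "roberta-base"  # assumption: the paper benchmarks several LMs
MAX_LEN = 256                # stated maximum input sequence length

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMultipleChoice.from_pretrained(MODEL_NAME)

def encode(context, question, options):
    """Encode one example as four (passage, option) pairs for the model."""
    first = [f"{context} {question}"] * len(options)
    enc = tokenizer(first, options, truncation=True, max_length=MAX_LEN,
                    padding="max_length", return_tensors="pt")
    # Reshape each field to (1, num_choices, seq_len), the layout
    # AutoModelForMultipleChoice expects.
    return {k: v.unsqueeze(0) for k, v in enc.items()}

args = TrainingArguments(
    output_dir="reclor-finetune",
    per_device_train_batch_size=24,  # stated in the paper
    num_train_epochs=10,             # stated in the paper
    learning_rate=1e-5,              # assumption; see the paper's Table 9
)

# A Trainer(model=model, args=args, ...) call over the encoded dataset
# would then drive the fine-tuning run.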