ReClor: A Reading Comprehension Dataset Requiring Logical Reasoning
Authors: Weihao Yu, Zihang Jiang, Yanfei Dong, Jiashi Feng
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results show that state-of-the-art models have an outstanding ability to capture biases contained in the dataset, achieving high accuracy on the EASY set. However, they struggle on the HARD set, with performance close to that of random guessing, indicating that more research is needed to substantially enhance the logical reasoning ability of current models. |
| Researcher Affiliation | Academia | Weihao Yu, Zihang Jiang, Yanfei Dong & Jiashi Feng, National University of Singapore. weihaoyu6@gmail.com, {jzihang, dyanfei}@u.nus.edu, elefjia@nus.edu.sg |
| Pseudocode | No | The paper describes algorithms and models, but it does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper provides a 'Project page' link (http://whyu.me/reclor/) but does not explicitly state that the source code for the described methodology is available there, nor does it provide a direct repository link. |
| Open Datasets | Yes | ReClor is available for non-commercial research purposes only. We are also hosting a public evaluation server on EvalAI (Yadav et al., 2019) to benchmark progress on ReClor. |
| Dataset Splits | Yes | They are divided into training set, validation set and testing set with 4,638, 500 and 1,000 data points respectively. (A loading sketch follows the table.) |
| Hardware Specification | No | The paper mentions 'computational resources' supported by a program, but it does not provide specific hardware details such as GPU or CPU models, or memory specifications. |
| Software Dependencies | No | The paper mentions software such as the fastText Python library, Bi-LSTM, GloVe word embeddings, Hugging Face's Transformers library, and the Adam optimizer, but it does not provide version numbers for these components. |
| Experiment Setup | Yes | We use a batch size of 24 and fine-tune for 10 epochs. The maximum input sequence length for all models is 256. The detailed hyperparameters are shown in Table 9. (A fine-tuning sketch follows the table.) |
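
To make the reported split sizes concrete, below is a minimal sketch that loads the three splits and checks their sizes. The file names (train.json, val.json, test.json) and the assumption that each file is a JSON list of question records are not stated in the quoted text and are illustrative only.

```python
# Minimal sketch (assumed file names and format, not confirmed by the paper):
# load the ReClor splits and verify the sizes quoted in the table above.
import json

expected_sizes = {"train.json": 4638, "val.json": 500, "test.json": 1000}
for file_name, expected in expected_sizes.items():
    with open(file_name, encoding="utf-8") as f:
        records = json.load(f)  # assumed: each file is a list of question records
    assert len(records) == expected, f"{file_name}: got {len(records)}, expected {expected}"
    print(f"{file_name}: {len(records)} examples")
```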
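
The experiment-setup row can likewise be illustrated. The sketch below shows how the quoted hyperparameters (maximum sequence length 256; batch size 24 and 10 epochs for the omitted training loop) would be applied to a multiple-choice model with Hugging Face Transformers. The choice of roberta-base, the way each option is paired with the context and question, and the example fields are assumptions for illustration, not the authors' released code.

```python
# Minimal sketch, not the authors' implementation: applying the reported
# hyperparameters to a multiple-choice transformer with Hugging Face Transformers.
import torch
from transformers import RobertaForMultipleChoice, RobertaTokenizer

MAX_SEQ_LEN = 256   # maximum input sequence length reported in the paper
BATCH_SIZE = 24     # reported batch size (would drive the omitted training loop)
NUM_EPOCHS = 10     # reported number of fine-tuning epochs (omitted training loop)

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")  # model choice is an assumption
model = RobertaForMultipleChoice.from_pretrained("roberta-base")

# One hypothetical ReClor-style example: a context, a question and four options.
context = "All of the managers attended the meeting. Some of the attendees left early."
question = "Which one of the following can be properly inferred from the statements above?"
options = [
    "Some of the managers left early.",
    "None of the managers left early.",
    "It is possible that some managers left early.",
    "All of the attendees were managers.",
]
label = torch.tensor([2])  # index of the assumed correct option

# Encode each (context, question + option) pair, truncated/padded to 256 tokens.
encoded = tokenizer(
    [context] * len(options),
    [question + " " + option for option in options],
    max_length=MAX_SEQ_LEN,
    padding="max_length",
    truncation=True,
    return_tensors="pt",
)
# Multiple-choice models expect inputs of shape (batch, num_choices, seq_len).
inputs = {name: tensor.unsqueeze(0) for name, tensor in encoded.items()}

outputs = model(**inputs, labels=label)
print(outputs.loss.item(), outputs.logits.shape)  # logits: (1, 4), one score per option
```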