Question Difficulty Prediction for READING Problems in Standard Tests

Authors: Zhenya Huang, Qi Liu, Enhong Chen, Hongke Zhao, Mingyong Gao, Si Wei, Yu Su, Guoping Hu

AAAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on a real-world dataset not only show the effectiveness of TACNN, but also give interpretable insights to track the attention information for questions.
Researcher Affiliation | Collaboration | School of Computer Science and Technology, University of Science and Technology of China, {huangzhy, zhhk}@mail.ustc.edu.cn, {qiliuql, cheneh}@ustc.edu.cn; iFLYTEK Research, {mygao2, siwei, gphu}@iflytek.com; School of Computer Science and Technology, Anhui University, yusu@iflytek.com
Pseudocode | No | The paper describes the model architecture and mathematical formulations, but it does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any explicit statements about providing open-source code for the described methodology, nor does it include links to a code repository.
Open Datasets | No | The experimental dataset supplied by IFLYTEK is collected from real-world standard tests for READING problems, which contains nearly 3 million test logs of thousands of Chinese senior high schools from the year 2014 to 2016.
Dataset Splits | No | To observe how the models behave at different data sparsity, we randomly select 60%, 40%, 20%, 10% of standard tests as testing sets, and the rests as training sets, respectively. (No explicit mention of validation set splits.)
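The quoted split protocol holds out whole standard tests rather than individual questions. A minimal sketch of that procedure, assuming a hypothetical list of test identifiers and a fixed seed for repeatability (the paper does not specify either):

```python
# Sketch of the described protocol: randomly hold out 60%, 40%, 20%,
# and 10% of standard tests as testing sets, training on the rest.
# The test identifiers and the seed are hypothetical placeholders.
import random

tests = [f"test_{i}" for i in range(100)]  # hypothetical standard-test IDs

for ratio in (0.6, 0.4, 0.2, 0.1):
    rng = random.Random(0)  # fixed seed so the split is repeatable
    shuffled = tests[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * ratio)
    testing, training = shuffled[:cut], shuffled[cut:]
    print(f"held out {len(testing)} tests, training on {len(training)}")
```

Because entire tests are held out, every question in a test lands on the same side of the split, matching the stated goal of probing behavior at different data sparsities.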
Hardware Specification | Yes | Both TACNN and baselines are all implemented by Theano (Bergstra et al. 2010) and all experiments are run on a Tesla K20m GPU.
Software Dependencies | No | The paper mentions software like Theano and word2vec but does not provide specific version numbers for these or other software dependencies required for replication.
Experiment Setup | Yes | In TACNN, we set the maximum length M (N) of sentences (words) in documents (sentences) as 25 (40) (zero padded when necessary) according to our observation in Figure 5, i.e., 95% documents (sentences) contains less than 25 (40) sentences (words). Four layers of convolution (three wide convolutions, one narrow convolution) and max-pooling are employed for the Sentence CNN Layer to accommodate the sentence length N, where the numbers of the feature maps for four convolutions are (200, 400, 600, 600) respectively. Also, we set the kernel size k as 3 for all convolution layers and the pooling window p as (3, 3, 2, 1) for each max pooling, respectively. ... we set mini batches as 32 for training and we also use dropout (with probability 0.2) in order to prevent overfitting.
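The quoted setup can be checked arithmetically: with kernel size k = 3, a wide convolution lengthens a sequence to n + k - 1 and a narrow one shortens it to n - k + 1, so the stated conv/pool stack collapses a 40-word sentence to one value per feature map. A sketch under the assumption of non-overlapping max-pooling with floor division (the paper does not spell out the pooling semantics):

```python
# Trace how the Sentence CNN Layer's four conv + max-pool stages
# (three wide convolutions, one narrow, kernel size 3, pooling
# windows 3, 3, 2, 1) reduce the padded sentence length N = 40.
# Non-overlapping floor-division pooling is our assumption.

def conv_len(n, k, wide):
    # wide convolution: output length n + k - 1; narrow: n - k + 1
    return n + k - 1 if wide else n - k + 1

def pool_len(n, p):
    return n // p  # non-overlapping max-pooling (assumed)

N, k = 40, 3                       # max words per sentence, kernel size
wides = [True, True, True, False]  # three wide convs, then one narrow
pools = [3, 3, 2, 1]               # pooling windows from the paper

length = N
for wide, p in zip(wides, pools):
    length = pool_len(conv_len(length, k, wide), p)
    print(length)
# → 14, 5, 3, 1
```

Under these assumptions the final length is 1, i.e. each of the 600 feature maps in the last layer yields a single value, which is consistent with the paper's claim that the stack "accommodates the sentence length N".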