Learning to Predict Readability Using Eye-Movement Data From Natives and Learners
Authors: Ana González-Garduño, Anders Søgaard
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our models on both the Simple vs. Normal Wikipedia corpus (Coster and Kauchak 2011) and the One Stop English corpus (Vajjala and Meurers 2014). The results for the task of readability prediction for all systems are shown in Table 2. For all datasets, multi-task systems using the GECO dataset gave the best results. Improvements over single-task baselines were in the range of 0.67% to 3.35%. |
| Researcher Affiliation | Academia | Ana V. González-Garduño, Anders Søgaard, Department of Computer Science, University of Copenhagen {ana, soegaard}@di.ku.dk |
| Pseudocode | No | No pseudocode or clearly labeled algorithm block was found in the paper. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code for the described methodology, nor does it include a link to a code repository. |
| Open Datasets | Yes | The first is a sentence aligned corpus of 137,000 simple versus normal English sentences (Coster and Kauchak 2011), which was made in order to assess the performance of simplification systems. The second corpus is the sentence-level One Stop English Corpus (Vajjala and Meurers 2014) which consists of sentences at three different levels: Elementary, Intermediate and Advanced. For the auxiliary task of predicting eye movements, we use the Dundee Corpus (Kennedy and Pynte 2003), which has been used in readability studies and studies of syntactic complexity (Singh et al. 2016; Demberg and Keller 2008). In addition, we use the GECO corpus (Cop et al. 2017), which consists of data from 14 monolingual native English speakers, and 19 native speakers of Dutch with English as their second language. |
| Dataset Splits | Yes | For all experiments, we use 60% of the data for training, 20% for development and 20% for testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running the experiments. |
| Software Dependencies | No | The paper mentions using the 'Adam optimizer', 'ReLU activation', and a 'probabilistic top-down parser (Roark 2001)', but does not provide specific software names with version numbers (e.g., 'Python 3.8', 'PyTorch 1.9') for full reproducibility. |
| Experiment Setup | Yes | For the single-task system, we use a three-layer perceptron with sigmoid activation at the output layer for readability prediction and linear activation for eye movement prediction models. We use ReLU activation in the hidden layers, which contain 100 neurons. All models use the Adam optimizer and a dropout rate of 0.5. |
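The architecture quoted in the Experiment Setup row can be sketched as a minimal forward pass. This is not the authors' code: it assumes a single hidden layer of 100 units (the paper's "three-layer perceptron" wording leaves the exact depth ambiguous), and the weight initialization, input dimension, and batch size are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_params(n_in, hidden=100, n_out=1):
    """Illustrative random init for an input -> hidden -> output stack."""
    sizes = [n_in, hidden, n_out]
    return [(rng.normal(0.0, 0.1, size=(a, b)), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def forward(x, params, output_activation):
    """ReLU hidden layer(s), then a task-specific output activation.
    Dropout (rate 0.5) would be applied to hidden units at training
    time; it is omitted here for a deterministic forward pass."""
    h = x
    for W, b in params[:-1]:
        h = relu(h @ W + b)
    W, b = params[-1]
    return output_activation(h @ W + b)

params = init_params(n_in=20)
x = rng.normal(size=(4, 20))                # batch of 4 feature vectors

readability = forward(x, params, sigmoid)   # sigmoid head: readability
gaze = forward(x, params, lambda z: z)      # linear head: eye movements
```

The two task heads share the same hidden representation here only for brevity; in the paper's multi-task setting the readability and eye-movement objectives are trained jointly, with the sigmoid head producing a bounded readability score and the linear head an unbounded regression target.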