reproducibilityindex.ai

Automatic Assessment of Absolute Sentence Complexity

Authors: Sanja Štajner, Simone Paolo Ponzetto, Heiner Stuckenschmidt

IJCAI 2017

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We perform three sets of experiments (Sections 5.1–5.3) to evaluate our approach.
Researcher Affiliation	Academia	Sanja Štajner, Simone Paolo Ponzetto and Heiner Stuckenschmidt, Data and Web Science Group, University of Mannheim, Germany {sanja,simone,heiner}@informatik.uni-mannheim.de
Pseudocode	No	The paper does not contain any pseudocode or algorithm blocks.
Open Source Code	Yes	Data and code available at: http://web.informatik.uni-mannheim.de/sstajner/publications.
Open Datasets	Yes	We use the English part of the Newsela corpora [6] to learn lexical properties on different text complexity levels (the lists of unigrams, bigrams and trigrams occurring at each level, and their relative frequencies). [Footnote 6: https://newsela.com/data/]. The gold standard dataset is covered by the statement "Data and code available at: http://web.informatik.uni-mannheim.de/sstajner/publications."
Dataset Splits	Yes	We used five different classifiers: Logistic [le Cessie and van Houwelingen, 1992]; SMO, the Weka implementation of SVM [Platt, 1998], with feature standardisation; JRip rule learner [Cohen, 1995]; J48, the Weka implementation of the C4.5 decision tree [Quinlan, 1993]; and Random Forest [Breiman, 2001], in a 10-fold cross-validation setup with 10 repetitions in Weka Experimenter [Hall et al., 2009].
Hardware Specification	No	The paper does not provide any specific details about the hardware used for running the experiments.
Software Dependencies	No	The paper mentions 'Weka Experimenter [Hall et al., 2009]' and various classifiers, but does not provide specific version numbers for these software components.
Experiment Setup	No	The paper mentions using Random Forest and a 10-fold cross-validation setup, but does not provide specific hyperparameter values or detailed training configurations for the models.
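
The Dataset Splits row describes a 10-fold cross-validation setup with 10 repetitions. The following is a minimal sketch of that protocol only, written in plain Python rather than Weka, so it is not the authors' implementation. The function names (`repeated_kfold`, `majority_baseline`, `evaluate`), the seed, the majority-class placeholder classifier, and the synthetic labels are all assumptions made for illustration. The folds here are not stratified by class, and none of the paper's classifiers or data are reproduced.

```python
import random
from collections import Counter
from statistics import mean


def repeated_kfold(n_items, k=10, repetitions=10, seed=0):
    """Yield (repetition, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for rep in range(repetitions):
        order = list(range(n_items))
        rng.shuffle(order)  # a fresh shuffle for every repetition
        # Fold f takes every k-th item of the shuffled order, starting at f.
        folds = [order[f::k] for f in range(k)]
        for f, test_idx in enumerate(folds):
            train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
            yield rep, f, train_idx, test_idx


def majority_baseline(train_labels):
    """Hypothetical placeholder classifier: always predicts the most frequent training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda _features: most_common


def evaluate(labels, k=10, repetitions=10, seed=0):
    """Mean accuracy of the placeholder baseline over all repetitions and folds."""
    scores = []
    for _rep, _fold, train_idx, test_idx in repeated_kfold(len(labels), k, repetitions, seed):
        predict = majority_baseline([labels[i] for i in train_idx])
        correct = sum(predict(None) == labels[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return mean(scores)


if __name__ == "__main__":
    # Synthetic labels only; these are not the paper's data.
    toy_labels = [0] * 70 + [1] * 30
    print(f"mean accuracy: {evaluate(toy_labels):.3f}")
```

With the default arguments the loop produces 10 x 10 = 100 train/test splits, and the reported score is the average over all of them. Swapping `majority_baseline` for a real learner would be the next step, but the hyperparameters to use are not given in the paper (see the Experiment Setup row).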