Argument Mining Driven Analysis of Peer-Reviews

Authors: Michael Fromm, Evgeniy Faerman, Max Berrendorf, Siddharth Bhargava, Ruoxia Qi, Yao Zhang, Lukas Dennert, Sophia Selle, Yang Mao, Thomas Seidl

AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In our extensive empirical evaluation, we show that Argument Mining can be used to efficiently extract the most relevant parts from reviews, which are paramount for the publication decision.
Researcher Affiliation | Academia | Michael Fromm (1), Evgeniy Faerman (1), Max Berrendorf (1), Siddharth Bhargava (2), Ruoxia Qi (2), Yao Zhang (2), Lukas Dennert (2), Sophia Selle (2), Yang Mao (2) and Thomas Seidl (1). (1) Database Systems and Data Mining, LMU Munich, Germany; (2) LMU Munich, Germany.
Pseudocode | No | The paper describes models and training procedures but does not include any pseudocode or algorithm blocks.
Open Source Code | Yes | The annotated dataset [5] and the code [6] are available. Footnote 6: https://github.com/fromm-m/aaai2021-am-peer-reviews
Open Datasets | Yes | The annotated dataset [5] and the code [6] are available. Footnote 5: https://zenodo.org/record/4314390
Dataset Splits | Yes | We split our dataset sentence-wise 7:1:2 into training, validation and test sets stratified by class, i.e. keeping the same ratio among classes in all three subsets. (A split sketch follows the table.)
Hardware Specification | No | The paper states 'The infrastructure for the course was provided by the Leibniz Rechenzentrum,' but does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for the experiments.
Software Dependencies | No | The paper mentions software components such as the 'AdamW optimizer', 'bert-base-cased', 'bert-large-cased', and the 'Punkt Sentence Tokenizer from NLTK', but does not provide explicit version numbers for these software dependencies. (Typical Punkt usage is sketched below the table.)
Experiment Setup | Yes | The models are trained using either bert-base-cased or bert-large-cased, with a training batch size of 100 for bert-base and 32 for bert-large. We use the AdamW optimizer with a learning rate of 10^-5 for all models and early stopping with a patience of 3. (A training-configuration sketch follows the table.)
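
The 7:1:2 stratified split described in the Dataset Splits row can be reproduced with a standard two-step stratified split. The sketch below is an illustration, not the authors' code: the sentence and label lists are toy placeholders, and scikit-learn's `train_test_split` is an assumed tool rather than the library used in the paper.

```python
# Minimal sketch of a sentence-wise 7:1:2 split stratified by class,
# mirroring the "Dataset Splits" description. Data below are toy placeholders.
from sklearn.model_selection import train_test_split

sentences = [f"sentence {i}" for i in range(100)]                        # placeholder sentences
labels = ["argument" if i % 3 else "non-argument" for i in range(100)]   # placeholder labels

# Split off the 20% test portion first; stratify keeps the class ratio.
train_val_x, test_x, train_val_y, test_y = train_test_split(
    sentences, labels, test_size=0.2, stratify=labels, random_state=42
)
# Carve 1/8 of the remaining 80% for validation -> 70/10/20 overall.
train_x, val_x, train_y, val_y = train_test_split(
    train_val_x, train_val_y, test_size=0.125, stratify=train_val_y, random_state=42
)

print(len(train_x), len(val_x), len(test_x))  # 70 10 20
```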
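For the Punkt Sentence Tokenizer named in the Software Dependencies row, typical NLTK usage looks as follows. Since the paper gives no version numbers, this only illustrates the standard API; the sample review text is invented.

```python
# Typical use of NLTK's Punkt sentence tokenizer, as referenced in the paper.
# Library versions are not specified in the paper; the review text is a toy example.
import nltk

nltk.download("punkt")  # fetch the pretrained Punkt model once
from nltk.tokenize import sent_tokenize  # uses Punkt under the hood

review = ("The paper is well written. However, the evaluation is limited "
          "to a single dataset.")
print(sent_tokenize(review))
# ['The paper is well written.', 'However, the evaluation is limited to a single dataset.']
```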
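The hyper-parameters in the Experiment Setup row map onto a fairly standard BERT fine-tuning loop. The sketch below is a hedged approximation, not the authors' implementation: the use of Hugging Face `transformers` and `torch.optim.AdamW`, the toy data, the binary label set, and the accuracy-based validation metric are assumptions; only the model names, batch sizes, learning rate, and patience come from the paper.

```python
# Sketch of the reported setup: BERT encoder, AdamW, lr 1e-5, early stopping (patience 3).
import torch
from torch.optim import AdamW
from torch.utils.data import DataLoader, TensorDataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "bert-base-cased"                      # or "bert-large-cased"
batch_size = 100 if "base" in model_name else 32    # per-model batch size from the paper

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)
optimizer = AdamW(model.parameters(), lr=1e-5)      # learning rate 10^-5

# Toy placeholder data; the real input is review sentences with argument labels.
texts = ["The method is novel.", "The paper has ten pages."] * 8
labels = torch.tensor([1, 0] * 8)
enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
dataset = TensorDataset(enc["input_ids"], enc["attention_mask"], labels)
train_loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
val_loader = DataLoader(dataset, batch_size=batch_size)

def evaluate(m, loader):
    """Validation accuracy as a stand-in for the paper's actual metric."""
    m.eval()
    correct = total = 0
    with torch.no_grad():
        for input_ids, attention_mask, y in loader:
            logits = m(input_ids=input_ids, attention_mask=attention_mask).logits
            correct += (logits.argmax(dim=-1) == y).sum().item()
            total += y.numel()
    return correct / total

best_score, bad_epochs, patience = None, 0, 3
for epoch in range(50):
    model.train()
    for input_ids, attention_mask, y in train_loader:
        optimizer.zero_grad()
        out = model(input_ids=input_ids, attention_mask=attention_mask, labels=y)
        out.loss.backward()
        optimizer.step()

    score = evaluate(model, val_loader)
    if best_score is None or score > best_score:
        best_score, bad_epochs = score, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:   # early stopping with patience 3
            break
```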