Challenges in the Automatic Analysis of Students’ Diagnostic Reasoning
Authors: Claudia Schulz, Christian M. Meyer, Iryna Gurevych (pp. 6974-6981)
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We create the first corpus for this task, comprising diagnostic reasoning self-explanations of students from two domains annotated with epistemic activities. Based on insights from the corpus creation and the task's characteristics, we discuss three challenges for the automatic identification of epistemic activities using AI methods: the correct identification of epistemic activity spans, the reliable distinction of similar epistemic activities, and the detection of overlapping epistemic activities. We propose a separate performance metric for each challenge and thus provide an evaluation framework for future research. Indeed, our evaluation of various state-of-the-art recurrent neural network architectures reveals that current techniques fail to address some of these challenges. |
| Researcher Affiliation | Collaboration | Claudia Schulz,* Christian M. Meyer, Iryna Gurevych, Ubiquitous Knowledge Processing (UKP) Lab, Technische Universität Darmstadt, Germany ... *New affiliation: Babylon Health, London, UK |
| Pseudocode | No | The paper describes the methods textually but does not include any figures, blocks, or sections explicitly labeled 'Pseudocode' or 'Algorithm'. |
| Open Source Code | Yes | Code and data available at https://github.com/UKPLab/aaai19-diagnostic-reasoning |
| Open Datasets | Yes | We create the first corpus for this task, comprising diagnostic reasoning self-explanations of students from two domains annotated with epistemic activities. ... Code and data available at https://github.com/UKPLab/aaai19-diagnostic-reasoning ... We thus make our corpus and the corresponding analysis software publicly available. |
| Dataset Splits | Yes | We split our data into 60% train, 20% dev, and 20% test sets, using the same proportion of case scenarios in all splits. |
| Hardware Specification | No | The paper describes the experimental setup but does not specify any particular GPU models, CPU models, or other hardware components used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'Reimers and Gurevych's (2017) implementation of a bidirectional long short-term memory (BiLSTM) network with a conditional random field (CRF) output layer' and 'German fastText word embeddings (Grave et al. 2018)' but does not provide specific version numbers for these software components or the programming language used. |
| Experiment Setup | Yes | We perform ten runs for each architecture, applying the following parameters for all of them: one hidden layer of 100 units, variational dropout rates for input and hidden layer of 0.25, and the nadam optimizer. |
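The split reported above (60% train, 20% dev, 20% test, with the same proportion of case scenarios in every split) can be sketched as a per-scenario shuffle-and-slice. This is a minimal illustration, not the paper's actual code; the function and variable names are hypothetical, and the released repository should be consulted for the real implementation.

```python
import random
from collections import defaultdict

def split_by_scenario(docs, seed=42, ratios=(0.6, 0.2, 0.2)):
    """Split documents into train/dev/test sets while keeping the
    proportion of each case scenario identical across the splits.

    `docs` is a list of (doc_id, scenario) pairs (hypothetical format).
    """
    # Group document ids by their case scenario.
    by_scenario = defaultdict(list)
    for doc_id, scenario in docs:
        by_scenario[scenario].append(doc_id)

    rng = random.Random(seed)
    train, dev, test = [], [], []
    # Slice each scenario's documents 60/20/20, so every split
    # contains the same share of each scenario.
    for scenario in sorted(by_scenario):
        ids = by_scenario[scenario]
        rng.shuffle(ids)
        n_train = round(len(ids) * ratios[0])
        n_dev = round(len(ids) * ratios[1])
        train += ids[:n_train]
        dev += ids[n_train:n_train + n_dev]
        test += ids[n_train + n_dev:]
    return train, dev, test
```

Slicing within each scenario (rather than shuffling the whole corpus at once) is what guarantees the per-scenario proportions the paper describes.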