Hierarchical Multi-task Learning for Organization Evaluation of Argumentative Student Essays

Authors: Wei Song, Ziyao Song, Lizhen Liu, Ruiji Fu

IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results show that the multi-task learning based organization evaluation can achieve significant improvements compared with existing work and pipeline baselines. (A joint-loss sketch of this multi-task setup appears after the table.)
Researcher Affiliation | Collaboration | 1) College of Information Engineering and Academy for Multidisciplinary Studies, Capital Normal University, Beijing, China; 2) State Key Laboratory of Cognitive Intelligence, iFLYTEK Research, China; 3) iFLYTEK AI Research (Hebei), Langfang, China
Pseudocode | No | The paper describes the model architecture and mathematical equations but does not include a dedicated pseudocode or algorithm block.
Open Source Code | No | The paper does not provide any statement or link regarding the availability of open-source code for the described methodology.
Open Datasets | No | The paper states, "We built a dataset of more than 1,200 argumentative student essays with sentence functions, paragraph functions and organization grades annotated," but does not provide concrete access information (e.g., URL, DOI, or a specific citation for public access) for this newly built dataset.
Dataset Splits | Yes | "We split our dataset into five folds, which have similar distributions over organization grades." Cross-validation was conducted and the average performance is reported. During training, 10% of the training data was randomly selected as the validation set to find the optimal hyper-parameters. (A sketch of this split protocol appears after the table.)
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU or GPU models) used to run the experiments.
Software Dependencies | No | The paper mentions general components such as an RNN encoder, LSTM, and stochastic gradient descent (SGD), and refers to the Tencent pre-trained word embeddings and the transformer model, but it does not specify software dependencies with version numbers (e.g., Python, TensorFlow, or PyTorch versions). (A sketch of loading the Tencent embeddings appears after the table.)
Experiment Setup | Yes | The maximum number of words in a sentence is set to 40. The maximum numbers of sentences, paragraphs, and sentences in any paragraph (i.e., n, m and n_p) are set to 50, 20 and 20 empirically. ... The dimension of all BiLSTM hidden layers is set to 256. We use the Tencent pre-trained word embeddings for initialization, and the dimension is 200 [Song et al., 2018]. The optimizer is stochastic gradient descent (SGD). The two 2D CNN blocks that receive the organization grid have 64 and 32 filters respectively. The kernel size is 5 × 5. (A configuration sketch appears below.)
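
The Research Type row rests on jointly training three tasks: sentence function, paragraph function, and organization grade. As a minimal sketch, assuming PyTorch (the paper names no framework) and a simple weighted sum of per-task cross-entropies (the paper does not state how the losses are combined), the joint objective could look like:

```python
import torch
import torch.nn.functional as F

def joint_loss(sent_logits, sent_labels,
               para_logits, para_labels,
               grade_logits, grade_labels,
               weights=(1.0, 1.0, 1.0)):
    """Combine the three task losses: sentence function, paragraph
    function, and organization grade. The unweighted default sum is
    an assumption; the paper does not state the combination scheme."""
    loss_sent = F.cross_entropy(sent_logits, sent_labels)     # per-sentence function labels
    loss_para = F.cross_entropy(para_logits, para_labels)     # per-paragraph function labels
    loss_grade = F.cross_entropy(grade_logits, grade_labels)  # essay-level organization grade
    w_s, w_p, w_g = weights
    return w_s * loss_sent + w_p * loss_para + w_g * loss_grade
```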
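
The Dataset Splits row describes five folds with similar grade distributions plus a random 10% validation carve-out from each training set. A minimal sketch with scikit-learn; the tooling, the number of grade levels, and the grade values below are assumptions (toy stand-ins), not details from the paper:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, train_test_split

essay_ids = np.arange(1200)                  # the dataset has 1,200+ essays
grades = np.random.randint(0, 4, size=1200)  # assumed 4 grade levels (not stated)

# Five folds with similar distributions over organization grades.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, test_idx) in enumerate(skf.split(essay_ids, grades)):
    # Randomly hold out 10% of the training data for hyper-parameter tuning.
    tr_idx, val_idx = train_test_split(train_idx, test_size=0.1, random_state=0)
    print(f"fold {fold}: train={len(tr_idx)} val={len(val_idx)} test={len(test_idx)}")
```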
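
On the Software Dependencies row: the Tencent embeddings cited as [Song et al., 2018] are distributed in word2vec text format, so gensim can load them. The file path, the OOV fallback, and the choice to fine-tune (freeze=False) are assumptions in this sketch:

```python
import numpy as np
import torch
from gensim.models import KeyedVectors

# Hypothetical local path to the public Tencent embedding release.
vectors = KeyedVectors.load_word2vec_format(
    "Tencent_AILab_ChineseEmbedding.txt", binary=False)
assert vectors.vector_size == 200  # matches the 200-dim setting in the paper

def build_embedding(vocab: dict) -> torch.nn.Embedding:
    """Initialize an embedding layer from the pre-trained vectors;
    out-of-vocabulary words fall back to small random vectors."""
    weight = np.random.normal(scale=0.1, size=(len(vocab), 200)).astype("float32")
    for word, idx in vocab.items():
        if word in vectors:
            weight[idx] = vectors[word]
    return torch.nn.Embedding.from_pretrained(torch.from_numpy(weight), freeze=False)
```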
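
The Experiment Setup row pins down most of the reported constants. A configuration sketch in PyTorch (an assumed framework) covering the two 2D CNN blocks over the organization grid and the SGD optimizer; input channels, padding, pooling, learning rate, and momentum are not stated in the paper and are assumptions here:

```python
import torch
import torch.nn as nn

HIDDEN = 256       # BiLSTM hidden dimension from the paper
EMB_DIM = 200      # Tencent embedding dimension
MAX_WORDS, MAX_SENTS, MAX_PARAS, MAX_SENTS_PER_PARA = 40, 50, 20, 20

class OrganizationGridCNN(nn.Module):
    """Two 2D CNN blocks with 64 and 32 filters and 5x5 kernels over the
    organization grid, per the experiment setup. Channel count, padding,
    and pooling are assumptions; the paper does not specify them."""
    def __init__(self, in_channels=1):
        super().__init__()
        self.block1 = nn.Sequential(
            nn.Conv2d(in_channels, 64, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(2))
        self.block2 = nn.Sequential(
            nn.Conv2d(64, 32, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(2))

    def forward(self, grid):  # grid: (batch, in_channels, MAX_PARAS, MAX_SENTS_PER_PARA)
        return self.block2(self.block1(grid))

model = OrganizationGridCNN()
# The paper specifies SGD; the learning rate and momentum are assumptions.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
```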