Hierarchical Multi-task Learning for Organization Evaluation of Argumentative Student Essays

Authors: Wei Song, Ziyao Song, Lizhen Liu, Ruiji Fu

IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results show that the multi-task learning based organization evaluation can achieve significant improvements compared with existing work and pipeline baselines. (A joint-loss sketch of this multi-task setup appears after the table.)
Researcher Affiliation | Collaboration | 1) College of Information Engineering and Academy for Multidisciplinary Studies, Capital Normal University, Beijing, China; 2) State Key Laboratory of Cognitive Intelligence, iFLYTEK Research, China; 3) iFLYTEK AI Research (Hebei), Langfang, China
Pseudocode | No | The paper describes the model architecture and mathematical equations but does not include a dedicated pseudocode or algorithm block.
Open Source Code | No | The paper does not provide any statement or link regarding the availability of open-source code for the described methodology.
Open Datasets | No | The paper states, "We built a dataset of more than 1,200 argumentative student essays with sentence functions, paragraph functions and organization grades annotated," but does not provide concrete access information (e.g., URL, DOI, or a specific citation for public access) for this newly built dataset.
Dataset Splits | Yes | "We split our dataset into five folds, which have similar distributions over organization grades." Cross-validation was conducted and the average performance is reported. During training, 10% of the training data was randomly selected as the validation set to find the optimal hyper-parameters. (A sketch of this split protocol appears after the table.)
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU or GPU models) used to run the experiments.
Software Dependencies | No | The paper mentions general components such as an RNN encoder, LSTM, and stochastic gradient descent (SGD), and refers to the Tencent pre-trained word embeddings and the transformer model, but it does not specify software dependencies with version numbers (e.g., Python, TensorFlow, or PyTorch versions). (A sketch of loading the Tencent embeddings appears after the table.)
Experiment Setup | Yes | The maximum number of words in a sentence is set to 40. The maximum numbers of sentences, paragraphs, and sentences in any paragraph (i.e., n, m and n_p) are set to 50, 20 and 20 empirically. ... The dimension of all BiLSTM hidden layers is set to 256. We use the Tencent pre-trained word embeddings for initialization, and the dimension is 200 [Song et al., 2018]. The optimizer is stochastic gradient descent (SGD). The two 2D CNN blocks that receive the organization grid have 64 and 32 filters respectively. The kernel size is 5 × 5. (A configuration sketch appears below.)
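
The Research Type row rests on jointly training three tasks: sentence function, paragraph function, and organization grade. As a minimal sketch, assuming PyTorch (the paper names no framework) and a simple weighted sum of per-task cross-entropies (the paper does not state how the losses are combined), the joint objective could look like:

```python
import torch
import torch.nn.functional as F

def joint_loss(sent_logits, sent_labels,
               para_logits, para_labels,
               grade_logits, grade_labels,
               weights=(1.0, 1.0, 1.0)):
    """Combine the three task losses: sentence function, paragraph
    function, and organization grade. The unweighted default sum is
    an assumption; the paper does not state the combination scheme."""
    loss_sent = F.cross_entropy(sent_logits, sent_labels)     # per-sentence function labels
    loss_para = F.cross_entropy(para_logits, para_labels)     # per-paragraph function labels
    loss_grade = F.cross_entropy(grade_logits, grade_labels)  # essay-level organization grade
    w_s, w_p, w_g = weights
    return w_s * loss_sent + w_p * loss_para + w_g * loss_grade
```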
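
The Dataset Splits row describes five folds with similar grade distributions plus a random 10% validation carve-out from each training set. A minimal sketch with scikit-learn; the tooling, the number of grade levels, and the grade values below are assumptions (toy stand-ins), not details from the paper:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, train_test_split

essay_ids = np.arange(1200)                  # the dataset has 1,200+ essays
grades = np.random.randint(0, 4, size=1200)  # assumed 4 grade levels (not stated)

# Five folds with similar distributions over organization grades.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, test_idx) in enumerate(skf.split(essay_ids, grades)):
    # Randomly hold out 10% of the training data for hyper-parameter tuning.
    tr_idx, val_idx = train_test_split(train_idx, test_size=0.1, random_state=0)
    print(f"fold {fold}: train={len(tr_idx)} val={len(val_idx)} test={len(test_idx)}")
```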
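
On the Software Dependencies row: the Tencent embeddings cited as [Song et al., 2018] are distributed in word2vec text format, so gensim can load them. The file path, the OOV fallback, and the choice to fine-tune (freeze=False) are assumptions in this sketch:

```python
import numpy as np
import torch
from gensim.models import KeyedVectors

# Hypothetical local path to the public Tencent embedding release.
vectors = KeyedVectors.load_word2vec_format(
    "Tencent_AILab_ChineseEmbedding.txt", binary=False)
assert vectors.vector_size == 200  # matches the 200-dim setting in the paper

def build_embedding(vocab: dict) -> torch.nn.Embedding:
    """Initialize an embedding layer from the pre-trained vectors;
    out-of-vocabulary words fall back to small random vectors."""
    weight = np.random.normal(scale=0.1, size=(len(vocab), 200)).astype("float32")
    for word, idx in vocab.items():
        if word in vectors:
            weight[idx] = vectors[word]
    return torch.nn.Embedding.from_pretrained(torch.from_numpy(weight), freeze=False)
```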
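
The Experiment Setup row pins down most of the reported constants. A configuration sketch in PyTorch (an assumed framework) covering the two 2D CNN blocks over the organization grid and the SGD optimizer; input channels, padding, pooling, learning rate, and momentum are not stated in the paper and are assumptions here:

```python
import torch
import torch.nn as nn

HIDDEN = 256       # BiLSTM hidden dimension from the paper
EMB_DIM = 200      # Tencent embedding dimension
MAX_WORDS, MAX_SENTS, MAX_PARAS, MAX_SENTS_PER_PARA = 40, 50, 20, 20

class OrganizationGridCNN(nn.Module):
    """Two 2D CNN blocks with 64 and 32 filters and 5x5 kernels over the
    organization grid, per the experiment setup. Channel count, padding,
    and pooling are assumptions; the paper does not specify them."""
    def __init__(self, in_channels=1):
        super().__init__()
        self.block1 = nn.Sequential(
            nn.Conv2d(in_channels, 64, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(2))
        self.block2 = nn.Sequential(
            nn.Conv2d(64, 32, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(2))

    def forward(self, grid):  # grid: (batch, in_channels, MAX_PARAS, MAX_SENTS_PER_PARA)
        return self.block2(self.block1(grid))

model = OrganizationGridCNN()
# The paper specifies SGD; the learning rate and momentum are assumptions.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
```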