Guided Attention Network for Concept Extraction
Authors: Songtao Fang, Zhenya Huang, Ming He, Shiwei Tong, Xiaoqing Huang, Ye Liu, Jie Huang, Qi Liu
IJCAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 5 (Experiments): "In this section, we first describe the datasets and discuss how to collect clue words, then extensive experiments are conducted to verify the effectiveness of our proposed model." (...) Table 3: Overall performance. (...) Figure 4: The experimental results on MTB |
| Researcher Affiliation | Collaboration | (1) Anhui Province Key Laboratory of Big Data Analysis and Application, School of Computer Science and Technology, University of Science and Technology of China; (2) Department of Electronic Engineering, Shanghai Jiao Tong University; (3) Didi Chuxing, Beijing, China |
| Pseudocode | No | The paper describes the model architecture and components in text and diagrams, but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not include an explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We use three datasets to evaluate our model. The details are described as follows: CSEN [Pan and Wang, 2017]: this dataset contains 690 video captions in Massive Open Online Courses (MOOCs) for Computer Science courses. KP-20K [Chen et al., 2018]: KP20K consists of 567,830 high-quality scientific publications from various computer science domains. MTB [Huang et al., 2019]: this dataset consists of mathematics textbooks for elementary, middle, and high schools. |
| Dataset Splits | Yes | On all datasets, we use 70% as a training set, 10% as a validation set, and 20% as a testing set. |
| Hardware Specification | Yes | In our experiments, we run all experiments on one Tesla V100 GPU and 16 Intel CPUs. |
| Software Dependencies | No | The paper mentions software like word2vec and an LDA implementation in a topic modeling toolkit, but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | In our experiments... The hidden state dimensions of the Bi-LSTM encoder are set to 200. All weight matrices are randomly initialized by a uniform distribution U(-0.1, 0.1). ... The topic numbers are set to 50, 100, and 50 in the CSEN, KP-20K, and MTB respectively. The aggregation parameter λ in the attention layer is set to 0.5. The model is optimized by Adam with batch size 10 and dropout rate 0.1. Soft Matching Module Hyperparameters: the threshold θ is set to 0.75 and the maximum window size is set to 3. |
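The "Dataset Splits" row reports a 70% / 10% / 20% train/validation/test partition on all three datasets. The paper does not describe how the split was performed; the sketch below is a minimal, assumed implementation (function name, random seed, and shuffling strategy are our own) that reproduces those proportions.

```python
import random

def split_dataset(samples, seed=42):
    """Shuffle and split samples into 70% train / 10% validation / 20% test,
    matching the proportions reported in the paper.
    The shuffling, seed, and function name are assumptions, not from the paper."""
    samples = list(samples)
    random.Random(seed).shuffle(samples)
    n = len(samples)
    n_train = int(0.7 * n)
    n_val = int(0.1 * n)
    train = samples[:n_train]
    val = samples[n_train:n_train + n_val]
    test = samples[n_train + n_val:]
    return train, val, test
```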
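For convenience, the hyperparameters quoted in the "Experiment Setup" row can be gathered into a single configuration object. This is only a sketch: the dictionary keys and structure are our own, and only the numeric values come from the paper.

```python
# Assumed configuration layout; values taken from the paper's experiment setup.
config = {
    "encoder": {
        "type": "Bi-LSTM",
        "hidden_dim": 200,               # hidden state dimension of the Bi-LSTM encoder
        "weight_init": ("uniform", -0.1, 0.1),  # weight matrices drawn from U(-0.1, 0.1)
    },
    "topic_numbers": {                   # LDA topic numbers per dataset
        "CSEN": 50,
        "KP-20K": 100,
        "MTB": 50,
    },
    "attention_lambda": 0.5,             # aggregation parameter in the attention layer
    "optimizer": "Adam",
    "batch_size": 10,
    "dropout": 0.1,
    "soft_matching": {
        "threshold": 0.75,               # soft matching threshold theta
        "max_window_size": 3,
    },
}
```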