Guided Attention Network for Concept Extraction

Authors: Songtao Fang, Zhenya Huang, Ming He, Shiwei Tong, Xiaoqing Huang, Ye Liu, Jie Huang, Qi Liu

IJCAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental (5 experiments) | In this section, we first describe the datasets and discuss how to collect clue words, then extensive experiments are conducted to verify the effectiveness of our proposed model. (...) Table 3: Overall performance. (...) Figure 4: The experimental results on MTB
Researcher Affiliation | Collaboration | (1) Anhui Province Key Laboratory of Big Data Analysis and Application, School of Computer Science and Technology, University of Science and Technology of China; (2) Department of Electronic Engineering, Shanghai Jiao Tong University; (3) Didi Chuxing, Beijing, China
Pseudocode | No | The paper describes the model architecture and components in text and diagrams, but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not include an explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | We use three datasets to evaluate our model. The details are described as follows: CSEN [Pan and Wang, 2017]: this dataset contains 690 video captions in Massive Open Online Courses (MOOCs) for Computer Science courses. KP-20K [Chen et al., 2018]: KP-20K consists of 567,830 high-quality scientific publications from various computer science domains. MTB [Huang et al., 2019]: this dataset consists of mathematics textbooks for elementary, middle, and high schools.
Dataset Splits | Yes | On all datasets, we use 70% as a training set, 10% as a validation set, and 20% as a testing set. (A hypothetical split sketch follows the table.)
Hardware Specification | Yes | In our experiments, we run all experiments on one Tesla V100 GPU and 16 Intel CPUs.
Software Dependencies | No | The paper mentions software such as word2vec and an LDA implementation in a topic modeling toolkit, but does not provide specific version numbers for these software dependencies.
Experiment Setup | Yes | In our experiments... The hidden state dimensions of the Bi-LSTM encoder are set to 200. All weight matrices are randomly initialized by a uniform distribution U(-0.1, 0.1). ... The topic numbers are set to 50, 100, and 50 in CSEN, KP-20K, and MTB, respectively. The aggregation parameter λ in the attention layer is set to 0.5. The model is optimized by Adam with batch size 10 and dropout rate 0.1. Soft matching module hyperparameters: the threshold θ is set to 0.75 and the maximum window size is set to 3. (A hedged configuration sketch follows the table.)
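
The 70/10/20 split quoted in the Dataset Splits row could be reproduced along the lines of the sketch below. This is a minimal illustration assuming scikit-learn and a fixed random seed; the paper does not say how the split was implemented, so the function name, library choice, and seed are all hypothetical.

```python
# Hypothetical reconstruction of the paper's 70/10/20 split.
# scikit-learn and the fixed seed are assumptions, not details
# confirmed by the authors.
from sklearn.model_selection import train_test_split

def split_dataset(samples, seed=42):
    """Split samples into 70% train, 10% validation, 20% test."""
    train, rest = train_test_split(samples, test_size=0.30, random_state=seed)
    # The remaining 30% is divided so that 10% of the full set becomes
    # validation (1/3 of rest) and 20% becomes test (2/3 of rest).
    val, test = train_test_split(rest, test_size=2 / 3, random_state=seed)
    return train, val, test
```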
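
The hyperparameters quoted in the Experiment Setup row can likewise be gathered into a single configuration sketch. The version below assumes PyTorch; only the numeric values (hidden size 200, U(-0.1, 0.1) initialization, λ = 0.5, batch size 10, dropout 0.1, θ = 0.75, maximum window size 3) come from the paper, while the module layout and the embedding size are guesses at how they might be wired up, not the authors' implementation.

```python
# Minimal configuration sketch, assuming PyTorch. Only the numeric
# values are taken from the paper; the module layout is hypothetical.
import torch
import torch.nn as nn

HIDDEN_DIM = 200   # Bi-LSTM hidden state dimension (from the paper)
BATCH_SIZE = 10    # Adam batch size (from the paper)
DROPOUT = 0.1      # dropout rate (from the paper)
LAMBDA_AGG = 0.5   # aggregation parameter lambda in the attention layer
THETA = 0.75       # soft matching threshold
MAX_WINDOW = 3     # soft matching maximum window size
EMBED_DIM = 100    # assumption: the embedding size is not quoted above

encoder = nn.LSTM(
    input_size=EMBED_DIM,
    hidden_size=HIDDEN_DIM,
    bidirectional=True,
    batch_first=True,
)

# All weight matrices are initialized from U(-0.1, 0.1), as reported.
for param in encoder.parameters():
    nn.init.uniform_(param, -0.1, 0.1)

optimizer = torch.optim.Adam(encoder.parameters())
dropout = nn.Dropout(p=DROPOUT)
```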