Guided Attention Network for Concept Extraction
Authors: Songtao Fang, Zhenya Huang, Ming He, Shiwei Tong, Xiaoqing Huang, Ye Liu, Jie Huang, Qi Liu
IJCAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 5 (Experiments): "In this section, we first describe the datasets and discuss how to collect clue words, then extensive experiments are conducted to verify the effectiveness of our proposed model." (...) Table 3: Overall performance. (...) Figure 4: The experimental results on MTB |
| Researcher Affiliation | Collaboration | (1) Anhui Province Key Laboratory of Big Data Analysis and Application, School of Computer Science and Technology, University of Science and Technology of China; (2) Department of Electronic Engineering, Shanghai Jiao Tong University; (3) Didi Chuxing, Beijing, China |
| Pseudocode | No | The paper describes the model architecture and components in text and diagrams, but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not include an explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We use three datasets to evaluate our model. The details are described as follows: CSEN [Pan and Wang, 2017]: this dataset contains 690 video captions in Massive Open Online Courses (MOOCs) for Computer Science courses. KP-20K [Chen et al., 2018]: KP20K consists of 567,830 high-quality scientific publications from various computer science domains. MTB [Huang et al., 2019]: this dataset consists of mathematics textbooks for elementary, middle, and high schools. |
| Dataset Splits | Yes | On all datasets, we use 70% as a training set, 10% as a validation set, and 20% as a testing set. |
| Hardware Specification | Yes | In our experiments, we run all experiments on one Tesla V100 GPU and 16 Intel CPUs. |
| Software Dependencies | No | The paper mentions software like word2vec and an LDA implementation in a topic modeling toolkit, but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | In our experiments... The hidden state dimensions of the Bi-LSTM encoder are set to 200. All weight matrices are randomly initialized by a uniform distribution U(-0.1, 0.1). ... The topic numbers are set to 50, 100, and 50 in the CSEN, KP-20K, and MTB respectively. The aggregation parameter λ in the attention layer is set to 0.5. The model is optimized by Adam with batch size 10 and dropout rate 0.1. Soft Matching Module Hyperparameters: the threshold θ is set to 0.75 and the maximum window size is set to 3. |
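The "Dataset Splits" row reports a 70% / 10% / 20% train/validation/test partition on all three datasets. The paper does not describe how the split was performed; the sketch below is a minimal, assumed implementation (function name, random seed, and shuffling strategy are our own) that reproduces those proportions.

```python
import random

def split_dataset(samples, seed=42):
    """Shuffle and split samples into 70% train / 10% validation / 20% test,
    matching the proportions reported in the paper.
    The shuffling, seed, and function name are assumptions, not from the paper."""
    samples = list(samples)
    random.Random(seed).shuffle(samples)
    n = len(samples)
    n_train = int(0.7 * n)
    n_val = int(0.1 * n)
    train = samples[:n_train]
    val = samples[n_train:n_train + n_val]
    test = samples[n_train + n_val:]
    return train, val, test
```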
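For convenience, the hyperparameters quoted in the "Experiment Setup" row can be gathered into a single configuration object. This is only a sketch: the dictionary keys and structure are our own, and only the numeric values come from the paper.

```python
# Assumed configuration layout; values taken from the paper's experiment setup.
config = {
    "encoder": {
        "type": "Bi-LSTM",
        "hidden_dim": 200,               # hidden state dimension of the Bi-LSTM encoder
        "weight_init": ("uniform", -0.1, 0.1),  # weight matrices drawn from U(-0.1, 0.1)
    },
    "topic_numbers": {                   # LDA topic numbers per dataset
        "CSEN": 50,
        "KP-20K": 100,
        "MTB": 50,
    },
    "attention_lambda": 0.5,             # aggregation parameter in the attention layer
    "optimizer": "Adam",
    "batch_size": 10,
    "dropout": 0.1,
    "soft_matching": {
        "threshold": 0.75,               # soft matching threshold theta
        "max_window_size": 3,
    },
}
```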