Knowledge-Aware Bayesian Deep Topic Model
Authors: Dongsheng Wang, Yishi Xu, Miaoge Li, Zhibin Duan, Chaojie Wang, Bo Chen, Mingyuan Zhou
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that our proposed model efficiently integrates the prior knowledge and improves both hierarchical topic discovery and document representation. In this section, we conduct extensive experiments on several benchmark text datasets to evaluate the performance of the proposed models against other knowledge-based TMs, in terms of topic interpretability and document representations. |
| Researcher Affiliation | Academia | Dongsheng Wang, Yishi Xu, Miaoge Li, Zhibin Duan, Chaojie Wang, Bo Chen — National Laboratory of Radar Signal Processing, Xidian University, Xi'an, Shaanxi 710071, China ({wds,xuyishi,limiaoge,zhibinduan}@stu.xidian.edu.cn, xd_silly@163.com, bchen@mail.xidian.edu.cn); Mingyuan Zhou — McCombs School of Business, The University of Texas at Austin, Austin, TX 78712, USA (mingyuan.zhou@mccombs.utexas.edu) |
| Pseudocode | No | The paper describes algorithms and models, but does not contain a clearly labeled pseudocode block or algorithm. |
| Open Source Code | Yes | The code is available at https://github.com/wds2014/TopicKG. |
| Open Datasets | Yes | Our experiments are conducted on four widely used benchmark text datasets, varying in scale. The datasets include 20 Newsgroups (20NG) [24], Reuters extracted from the Reuters-21578 dataset, R8, and Reuters Corpus Volume 2 (RCV2) [25]. |
| Dataset Splits | No | The paper mentions datasets used but does not explicitly provide the training, validation, and test split percentages or sample counts in the main text. |
| Hardware Specification | Yes | All experiments are performed on an Nvidia RTX 3090-Ti GPU and our proposed models are implemented with PyTorch. |
| Software Dependencies | No | The paper mentions that models are 'implemented with PyTorch' but does not specify a version number for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | For all experiments, we set the embedding dimension as d = 50, the knowledge confidence hyperparameter as β = 50.0, the threshold as s = 0.4. We initialize the node embedding from the Gaussian distribution N(0, 0.02). We set the batch size as 200 and use the Adam W [26] optimizer with learning rate 0.01. |
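The reported experiment setup can be collected into a single configuration sketch. This is a minimal illustration, not the authors' code: the `num_nodes` value is hypothetical, the paper's actual implementation uses PyTorch rather than numpy, and `N(0, 0.02)` is interpreted here as a standard deviation of 0.02 (the paper does not say whether 0.02 is the standard deviation or the variance).

```python
import numpy as np

# Hyperparameters as reported in the paper's experiment setup section.
CONFIG = {
    "embedding_dim": 50,          # d = 50
    "knowledge_confidence": 50.0, # beta = 50.0
    "threshold": 0.4,             # s = 0.4
    "batch_size": 200,
    "optimizer": "AdamW",
    "learning_rate": 0.01,
    "init_std": 0.02,             # node embeddings drawn from N(0, 0.02)
}

def init_node_embeddings(num_nodes, rng=None):
    """Initialize node embeddings from a zero-mean Gaussian,
    interpreting the reported 0.02 as the standard deviation."""
    rng = rng if rng is not None else np.random.default_rng(0)
    return rng.normal(
        loc=0.0,
        scale=CONFIG["init_std"],
        size=(num_nodes, CONFIG["embedding_dim"]),
    )

# num_nodes = 1000 is an arbitrary placeholder for illustration.
emb = init_node_embeddings(1000)
print(emb.shape)  # (1000, 50)
```

The drawn matrix has one row per knowledge-graph node and `d = 50` columns, matching the embedding dimension above.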