Knowledge-Aware Bayesian Deep Topic Model

Authors: Dongsheng Wang, Yishi Xu, Miaoge Li, Zhibin Duan, Chaojie Wang, Bo Chen, Mingyuan Zhou

Venue: NeurIPS 2022

Reproducibility Assessment

Research Type: Experimental
LLM Response: "Extensive experiments show that our proposed model efficiently integrates the prior knowledge and improves both hierarchical topic discovery and document representation." "In this section, we conduct extensive experiments on several benchmark text datasets to evaluate the performance of the proposed models against other knowledge-based TMs, in terms of topic interpretability and document representations."

Researcher Affiliation: Academia
LLM Response: "Dongsheng Wang, Yishi Xu, Miaoge Li, Zhibin Duan, Chaojie Wang, Bo Chen; National Laboratory of Radar Signal Processing, Xidian University, Xi'an, Shanxi 710071, China; {wds,xuyishi,limiaoge,zhibinduan}@stu.xidian.edu.cn, xd_silly@163.com, bchen@mail.xidian.edu.cn. Mingyuan Zhou; McCombs School of Business, The University of Texas at Austin, Austin, TX 78712, USA; mingyuan.zhou@mccombs.utexas.edu"

Pseudocode: No
LLM Response: The paper describes algorithms and models, but does not contain a clearly labeled pseudocode block or algorithm.

Open Source Code: Yes
LLM Response: "The code is available at https://github.com/wds2014/TopicKG."

Open Datasets: Yes
LLM Response: "Our experiments are conducted on four widely used benchmark text datasets, varying in scale. The datasets include 20 Newsgroups (20NG) [24]; Reuters, extracted from the Reuters-21578 dataset; R8; and Reuters Corpus Volume 2 (RCV2) [25]."

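Of the listed corpora, 20NG is the easiest to obtain programmatically; the sketch below fetches it with scikit-learn and builds bag-of-words counts. This is a generic illustration, not the paper's preprocessing pipeline, and the vocabulary size is an assumed placeholder.

```python
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer

# Download the 20 Newsgroups training split, stripping metadata that
# commonly leaks label information.
train = fetch_20newsgroups(subset="train",
                           remove=("headers", "footers", "quotes"))

# Bag-of-words counts, the usual input for topic models.
# max_features=5000 is an illustrative choice, not from the paper.
vectorizer = CountVectorizer(max_features=5000, stop_words="english")
X = vectorizer.fit_transform(train.data)  # sparse (n_docs, vocab) matrix
print(X.shape)
```
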
Dataset Splits: No
LLM Response: The paper mentions the datasets used but does not explicitly provide the training, validation, and test split percentages or sample counts in the main text.

Hardware Specification: Yes
LLM Response: "All experiments are performed on an Nvidia RTX 3090-Ti GPU and our proposed models are implemented with PyTorch."

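A single-GPU PyTorch run of this kind typically begins with a device check like the following generic snippet (not taken from the paper):

```python
import torch

# Use the GPU when present, mirroring the paper's single-GPU setup.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
if device.type == "cuda":
    print("Running on", torch.cuda.get_device_name(0))
else:
    print("CUDA unavailable; falling back to CPU")
```
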
Software Dependencies: No
LLM Response: The paper mentions that models are "implemented with PyTorch" but does not specify a version number for PyTorch or any other software dependencies.

Experiment Setup: Yes
LLM Response: "For all experiments, we set the embedding dimension as d = 50, the knowledge confidence hyperparameter as β = 50.0, the threshold as s = 0.4. We initialize the node embedding from the Gaussian distribution N(0, 0.02). We set the batch size as 200 and use the AdamW [26] optimizer with learning rate 0.01."

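For concreteness, the reported hyperparameters map onto a PyTorch training setup roughly as follows. This is a minimal sketch, not the authors' released code: the graph size, corpus, and loss are hypothetical placeholders, and N(0, 0.02) is read here as a standard deviation of 0.02.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Hyperparameters as reported in the paper.
EMBED_DIM = 50    # embedding dimension d
BETA = 50.0       # knowledge confidence hyperparameter (enters the loss)
THRESHOLD = 0.4   # threshold s for knowledge-graph edges
BATCH_SIZE = 200
LR = 0.01

# Node-embedding table; num_nodes is a placeholder, while the Gaussian
# initialization follows the paper.
num_nodes = 10_000
node_embeddings = nn.Parameter(torch.empty(num_nodes, EMBED_DIM))
nn.init.normal_(node_embeddings, mean=0.0, std=0.02)

optimizer = torch.optim.AdamW([node_embeddings], lr=LR)

# Random bag-of-words counts standing in for a real corpus.
docs = torch.randint(0, 5, (1_000, num_nodes), dtype=torch.float)
loader = DataLoader(TensorDataset(docs), batch_size=BATCH_SIZE, shuffle=True)

for (batch,) in loader:
    optimizer.zero_grad()
    doc_embed = batch @ node_embeddings  # (BATCH_SIZE, EMBED_DIM)
    # Dummy objective: the real model optimizes a topic-model ELBO plus a
    # BETA-weighted knowledge-graph term (see the paper for details).
    loss = doc_embed.pow(2).mean()
    loss.backward()
    optimizer.step()
    break  # one illustrative step
```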