Knowledge-Aware Bayesian Deep Topic Model

Authors: Dongsheng Wang, Yishi Xu, Miaoge Li, Zhibin Duan, Chaojie Wang, Bo Chen, Mingyuan Zhou

Venue: NeurIPS 2022

Reproducibility Assessment

Research Type: Experimental
LLM Response: "Extensive experiments show that our proposed model efficiently integrates the prior knowledge and improves both hierarchical topic discovery and document representation." "In this section, we conduct extensive experiments on several benchmark text datasets to evaluate the performance of the proposed models against other knowledge-based TMs, in terms of topic interpretability and document representations."

Researcher Affiliation: Academia
LLM Response: "Dongsheng Wang, Yishi Xu, Miaoge Li, Zhibin Duan, Chaojie Wang, Bo Chen; National Laboratory of Radar Signal Processing, Xidian University, Xi'an, Shanxi 710071, China; {wds,xuyishi,limiaoge,zhibinduan}@stu.xidian.edu.cn, xd_silly@163.com, bchen@mail.xidian.edu.cn. Mingyuan Zhou; McCombs School of Business, The University of Texas at Austin, Austin, TX 78712, USA; mingyuan.zhou@mccombs.utexas.edu"

Pseudocode: No
LLM Response: The paper describes algorithms and models, but does not contain a clearly labeled pseudocode block or algorithm.

Open Source Code: Yes
LLM Response: "The code is available at https://github.com/wds2014/TopicKG."

Open Datasets: Yes
LLM Response: "Our experiments are conducted on four widely used benchmark text datasets, varying in scale. The datasets include 20 Newsgroups (20NG) [24]; Reuters, extracted from the Reuters-21578 dataset; R8; and Reuters Corpus Volume 2 (RCV2) [25]."

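Of the listed corpora, 20NG is the easiest to obtain programmatically; the sketch below fetches it with scikit-learn and builds bag-of-words counts. This is a generic illustration, not the paper's preprocessing pipeline, and the vocabulary size is an assumed placeholder.

```python
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer

# Download the 20 Newsgroups training split, stripping metadata that
# commonly leaks label information.
train = fetch_20newsgroups(subset="train",
                           remove=("headers", "footers", "quotes"))

# Bag-of-words counts, the usual input for topic models.
# max_features=5000 is an illustrative choice, not from the paper.
vectorizer = CountVectorizer(max_features=5000, stop_words="english")
X = vectorizer.fit_transform(train.data)  # sparse (n_docs, vocab) matrix
print(X.shape)
```
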
Dataset Splits: No
LLM Response: The paper mentions the datasets used but does not explicitly provide the training, validation, and test split percentages or sample counts in the main text.

Hardware Specification: Yes
LLM Response: "All experiments are performed on an Nvidia RTX 3090-Ti GPU and our proposed models are implemented with PyTorch."

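A single-GPU PyTorch run of this kind typically begins with a device check like the following generic snippet (not taken from the paper):

```python
import torch

# Use the GPU when present, mirroring the paper's single-GPU setup.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
if device.type == "cuda":
    print("Running on", torch.cuda.get_device_name(0))
else:
    print("CUDA unavailable; falling back to CPU")
```
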
Software Dependencies: No
LLM Response: The paper mentions that models are "implemented with PyTorch" but does not specify a version number for PyTorch or any other software dependencies.

Experiment Setup: Yes
LLM Response: "For all experiments, we set the embedding dimension as d = 50, the knowledge confidence hyperparameter as β = 50.0, the threshold as s = 0.4. We initialize the node embedding from the Gaussian distribution N(0, 0.02). We set the batch size as 200 and use the AdamW [26] optimizer with learning rate 0.01."

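For concreteness, the reported hyperparameters map onto a PyTorch training setup roughly as follows. This is a minimal sketch, not the authors' released code: the graph size, corpus, and loss are hypothetical placeholders, and N(0, 0.02) is read here as a standard deviation of 0.02.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Hyperparameters as reported in the paper.
EMBED_DIM = 50    # embedding dimension d
BETA = 50.0       # knowledge confidence hyperparameter (enters the loss)
THRESHOLD = 0.4   # threshold s for knowledge-graph edges
BATCH_SIZE = 200
LR = 0.01

# Node-embedding table; num_nodes is a placeholder, while the Gaussian
# initialization follows the paper.
num_nodes = 10_000
node_embeddings = nn.Parameter(torch.empty(num_nodes, EMBED_DIM))
nn.init.normal_(node_embeddings, mean=0.0, std=0.02)

optimizer = torch.optim.AdamW([node_embeddings], lr=LR)

# Random bag-of-words counts standing in for a real corpus.
docs = torch.randint(0, 5, (1_000, num_nodes), dtype=torch.float)
loader = DataLoader(TensorDataset(docs), batch_size=BATCH_SIZE, shuffle=True)

for (batch,) in loader:
    optimizer.zero_grad()
    doc_embed = batch @ node_embeddings  # (BATCH_SIZE, EMBED_DIM)
    # Dummy objective: the real model optimizes a topic-model ELBO plus a
    # BETA-weighted knowledge-graph term (see the paper for details).
    loss = doc_embed.pow(2).mean()
    loss.backward()
    optimizer.step()
    break  # one illustrative step
```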