Dirichlet belief networks for topic structure learning
Authors: He Zhao, Lan Du, Wray Buntine, Mingyuan Zhou
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on text corpora demonstrate the advantages of the proposed model. |
| Researcher Affiliation | Academia | 1) Faculty of Information Technology, Monash University, Australia; 2) McCombs School of Business, The University of Texas at Austin, USA |
| Pseudocode | Yes | Omitted details of inference as well as the overall algorithm are given in the supplementary materials. |
| Open Source Code | Yes | Code available at https://github.com/ethanhezhao/DirBN |
| Open Datasets | Yes | The experiments were conducted on three real-world datasets, detailed as follows: 1) Web Snippets (WS), containing 12,237 web search snippets labelled with 8 categories... 2) TagMyNews (TMN), consisting of 32,597 RSS news items labelled with 7 categories... 3) Twitter, extracted from the 2011 and 2012 microblog tracks at the Text REtrieval Conference (TREC) [5]. It has 11,109 tweets in total... Footnote 5: http://trec.nist.gov/data/microblog.html |
| Dataset Splits | Yes | To compute perplexity, we randomly selected 80% of the documents in each dataset to train the models and 20% for testing. (See the split and perplexity sketch after this table.) |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU/CPU models, memory specifications). |
| Software Dependencies | No | The paper mentions “Mallet” but does not provide specific version numbers for any software, libraries, or programming languages used for implementation or experimentation. |
| Experiment Setup | Yes | For all the models, we ran 3,000 MCMC iterations with 1,500 burn-in. For DirBN, we set a0 = b0 = g0 = h0 = 1.0 and e0 = f0 = 0.01... For all the models, the number of topics in each layer of DirBN was set to 100, i.e., K_T = ... = K_1 = 100. (A configuration sketch follows this table.) |
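The Dataset Splits row describes a standard held-out evaluation protocol. The sketch below illustrates the reported 80/20 random document split and the usual per-token perplexity definition; the function names, the NumPy dependency, and the exact perplexity estimator are assumptions for illustration, not taken from the authors' released code.

```python
import math
import numpy as np

def split_corpus(docs, train_frac=0.8, seed=0):
    """Randomly hold out documents: 80% for training, 20% for testing
    (assumption: a uniform random split, matching the reported protocol)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(docs))
    n_train = int(train_frac * len(docs))
    return [docs[i] for i in idx[:n_train]], [docs[i] for i in idx[n_train:]]

def perplexity(total_log_likelihood, n_test_tokens):
    """Held-out perplexity as the exponential of the negative average
    per-token log-likelihood (assumption: the paper's estimator may differ)."""
    return math.exp(-total_log_likelihood / n_test_tokens)
```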
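Likewise, the values quoted in the Experiment Setup row can be collected into a single configuration for replication. This is a minimal sketch of the reported settings; the dictionary layout and key names are hypothetical and do not come from the authors' code.

```python
# Hypothetical configuration mirroring the reported setup; key names
# are illustrative, only the values are taken from the paper.
DIRBN_CONFIG = {
    "n_iterations": 3000,     # total MCMC iterations
    "n_burnin": 1500,         # burn-in iterations discarded before collection
    "a0": 1.0, "b0": 1.0,     # hyperparameters set to 1.0
    "g0": 1.0, "h0": 1.0,
    "e0": 0.01, "f0": 0.01,   # hyperparameters set to 0.01
    "topics_per_layer": 100,  # K_T = ... = K_1 = 100 for every layer
}
```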