The Poisson Gamma Belief Network
Authors: Mingyuan Zhou, Yulai Cong, Bo Chen
NeurIPS 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 3, Experimental Results; We apply the PGBNs for topic modeling of text corpora...; We evaluate the PGBNs' performance by examining both how well they unsupervisedly extract low-dimensional features for document classification, and how well they predict heldout word tokens. Example results on text analysis illustrate interesting relationships between the width of the first layer and the inferred network structure, and demonstrate that the PGBN, whose hidden units are imposed with correlated gamma priors, can add more layers to increase its performance gains over Poisson factor analysis, given the same limit on the width of the first layer. |
| Researcher Affiliation | Academia | Mingyuan Zhou, McCombs School of Business, The University of Texas at Austin, Austin, TX 78712, USA; Yulai Cong, National Laboratory of RSP, Xidian University, Xi'an, Shaanxi, China; Bo Chen, National Laboratory of RSP, Xidian University, Xi'an, Shaanxi, China |
| Pseudocode | Yes | Algorithm 1: The PGBN upward-downward Gibbs sampler, which uses a layer-wise training strategy to train a set of networks, each of which adds an additional hidden layer on top of the previously inferred network, retrains all its layers jointly, and prunes inactive factors from the last layer (a sketch of the generative process this sampler targets appears below the table). |
| Open Source Code | Yes | Matlab code will be available in http://mingyuanzhou.github.io/. |
| Open Datasets | Yes | We consider the 20 newsgroups dataset (http://qwone.com/~jason/20Newsgroups/); We consider the NIPS12 corpus (http://www.cs.nyu.edu/~roweis/data.html) |
| Dataset Splits | Yes | It is partitioned into a training set of 11,269 documents and a testing set of 7,505 documents; randomly choose 30% of the word tokens in each document as training, and use the remaining ones to calculate per-heldout-word perplexity (see the token-split sketch below the table); the regularization parameter is five-fold cross-validated on the training set from (2^-10, 2^-9, ..., 2^15). |
| Hardware Specification | Yes | e.g., with K1max = 400, a training iteration on a single core of an Intel Xeon 2.7 GHz CPU on average takes about 5.6, 6.7, and 7.1 seconds for the PGBN with 1, 3, and 5 layers, respectively. |
| Software Dependencies | No | Matlab code will be available in http://mingyuanzhou.github.io/.; We use the L2-regularized logistic regression provided by the LIBLINEAR package [25]. No specific version numbers are provided for Matlab or for LIBLINEAR (a scikit-learn stand-in for the classification step is sketched below the table). |
| Experiment Setup | Yes | We set the hyper-parameters as a_0 = b_0 = 0.01 and e_0 = f_0 = 1. Given the trained network, we apply the upward-downward Gibbs sampler to collect 500 MCMC samples after 500 burn-ins to estimate the posterior mean of the feature-usage proportion vector θ_j^(1) / Σ_k θ_kj^(1) at the first hidden layer, for every document in both the training and testing sets; with B_t = C_t = 1000 and η^(t) = 0.01 for all t; We set C_t = 500 and η^(t) = 0.05 for all t. If K1max ≤ 400, we set B_t = 1000 for all t; otherwise we set B_1 = 1000 and B_t = 500 for t ≥ 2. |
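The Pseudocode row refers to Algorithm 1, the upward-downward Gibbs sampler. The full sampler relies on CRT and multinomial data augmentation and is too long to reproduce here, but the generative process it performs inference for is compact. Below is a minimal Python/NumPy sketch of that generative process, not the authors' Matlab implementation; the function name `sample_pgbn` is ours, and the shared scalar rates `c` are a simplification of the paper's document- and layer-specific rates c_j^(t).

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_pgbn(Phi, r, c, n_docs):
    """Draw synthetic documents from a T-layer PGBN generative model.

    Phi    : list [Phi_1 (V x K1), Phi_2 (K1 x K2), ..., Phi_T (K_{T-1} x K_T)]
             of column-normalized factor loading matrices
    r      : top-layer gamma shape vector of length K_T
    c      : per-layer gamma rate scalars (simplification; the paper uses
             document-specific rates c_j^(t))
    n_docs : number of documents to draw
    """
    T = len(Phi)
    # Top layer: theta^(T)_j ~ Gamma(r, 1/c_T)
    theta = rng.gamma(shape=r, scale=1.0 / c[T - 1], size=(n_docs, len(r)))
    # Downward pass: theta^(t)_j ~ Gamma(Phi^(t+1) theta^(t+1)_j, 1/c_t)
    for t in range(T - 1, 0, -1):
        shape = np.maximum(theta @ Phi[t].T, 1e-12)  # guard against zero shapes
        theta = rng.gamma(shape=shape, scale=1.0 / c[t - 1])
    # Bottom layer: x_j ~ Poisson(Phi^(1) theta^(1)_j)
    return rng.poisson(theta @ Phi[0].T)
```

The downward pass mirrors the model's defining structure, gamma shape parameters at each layer given by the factor loadings applied to the layer above, with Poisson counts emitted only at the bottom.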
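The Dataset Splits row quotes the heldout-word protocol: 30% of each document's word tokens are used for training and the rest for computing per-heldout-word perplexity. Here is a sketch of that token-level split and of the standard perplexity formula exp(-(1/N) Σ log p), assuming the per-document predictive word distributions P have already been estimated (the paper obtains them by averaging Φ^(1)θ_j^(1) over the collected Gibbs samples); both helper names are ours.

```python
import numpy as np

rng = np.random.default_rng(0)

def split_tokens(x, train_frac=0.3):
    """Split one bag-of-words count vector into training / heldout counts,
    keeping train_frac of the word *tokens* (not word types) for training."""
    tokens = np.repeat(np.arange(len(x)), x)  # expand counts into a token list
    rng.shuffle(tokens)
    n_train = int(round(train_frac * len(tokens)))
    x_train = np.bincount(tokens[:n_train], minlength=len(x))
    return x_train, x - x_train

def per_heldout_word_perplexity(P, X_heldout):
    """P: (n_docs, V) predictive word distributions (each row sums to 1);
    X_heldout: matching (n_docs, V) heldout count matrix."""
    n_tokens = X_heldout.sum()
    log_lik = (X_heldout * np.log(np.maximum(P, 1e-300))).sum()
    return np.exp(-log_lik / n_tokens)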
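For the classification experiments, the paper feeds the first-layer feature-usage proportions to L2-regularized logistic regression from LIBLINEAR, five-fold cross-validating the regularization parameter over 2^-10, ..., 2^15. A minimal sketch using scikit-learn's liblinear-backed LogisticRegression as a stand-in for the LIBLINEAR package; the helper name `cv_classify` and the use of accuracy as the score are our assumptions.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

def cv_classify(theta_train, y_train, theta_test, y_test):
    """Five-fold cross-validate C over 2^-10 .. 2^15 on the training set,
    then report the selected C and test accuracy.

    theta_train / theta_test are the posterior-mean feature-usage proportions
    extracted at the PGBN's first hidden layer (computed elsewhere)."""
    grid = GridSearchCV(
        LogisticRegression(penalty="l2", solver="liblinear"),
        param_grid={"C": [2.0 ** k for k in range(-10, 16)]},  # 2^-10 .. 2^15
        cv=5,
    )
    grid.fit(theta_train, y_train)
    return grid.best_params_["C"], grid.score(theta_test, y_test)
```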