On collapsed representation of hierarchical Completely Random Measures
Authors: Gaurav Pandey, Ambedkar Dukkipati
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 5 (Experimental results): 'We use hierarchical CRM-Poisson models for learning topics from the NIPS corpus' and 'The perplexity for the hierarchical CRM-Poisson models as a function of training percentage is plotted in Figure 1.' (A perplexity sketch follows the table.) |
| Researcher Affiliation | Academia | Gaurav Pandey (gp88@csa.iisc.ernet.in), Ambedkar Dukkipati (ad@csa.iisc.ernet.in), Department of Computer Science and Automation, Indian Institute of Science, Bangalore-560012, India |
| Pseudocode | No | The paper describes the steps for Gibbs sampling in a numbered list, but it does not present structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described, nor does it explicitly state that the code is being released. |
| Open Datasets | Yes | 'We use hierarchical CRM-Poisson models for learning topics from the NIPS corpus.' Footnote 1: 'The dataset can be downloaded from http://psiexp.ss.uci.edu/research/programs_data/toolbox.htm' |
| Dataset Splits | No | The paper states: 'For evaluating the different models, we divide each document into a training section and a test section by independently sampling a boolean random variable for each word. The probability of sending the word to the training section is varied from 0.3 to 0.7.' It does not explicitly mention a validation split. (This per-word Bernoulli split is sketched after the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers. |
| Experiment Setup | Yes | 'We run 2000 iterations of Gibbs sampling. The first 500 iterations are discarded, and every 5th sample afterwards is used to update the document-specific distribution on topics and the topic-specific distribution on words.' and 'For the case of GGP, the value of the discount parameter d is chosen from the set {0, .1, .2, .3, .4}. Furthermore, a gamma prior with rate parameter 2 and shape parameter 4 is defined on θ.' and 'For the case of SGGP, we consider m = 5, and d1 = 0, d2 = .1, …, d5 = .4. Furthermore, independent gamma priors with rate parameter 2 and shape parameter 4 are defined for each θq, 1 ≤ q ≤ 5. The posterior of each parameter θq is sampled via uniform sampling.' (A sampling-schedule sketch follows the table.) |
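
The Research Type row cites held-out perplexity as the paper's evaluation metric. A minimal sketch, assuming the standard definition (exponentiated negative mean per-word log-likelihood); the paper does not spell out its exact formula, and `perplexity` is an illustrative name:

```python
import numpy as np

def perplexity(log_probs):
    """Held-out perplexity: exp of the negative mean per-word
    log-likelihood over all test-section words."""
    return float(np.exp(-np.mean(log_probs)))

# Example with hypothetical per-word log-likelihoods
print(perplexity(np.log([0.01, 0.02, 0.005])))
```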
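The Dataset Splits row quotes a per-word Bernoulli split into training and test sections. A minimal sketch, assuming each word is routed independently with probability `p_train` (the paper varies this from 0.3 to 0.7); `split_document` and the toy document are illustrative, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def split_document(words, p_train):
    """Route each word to the training section with probability
    p_train, otherwise to the test section."""
    mask = rng.random(len(words)) < p_train
    train = [w for w, keep in zip(words, mask) if keep]
    test = [w for w, keep in zip(words, mask) if not keep]
    return train, test

# Example: one toy document at the midpoint of the paper's 0.3-0.7 range
train, test = split_document(["topic", "model", "gibbs", "prior", "word"], 0.5)
```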
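The Experiment Setup row fixes the Gibbs schedule (2000 iterations, 500 burn-in, every 5th sample kept) and a Gamma(shape 4, rate 2) prior on θ. A minimal sketch of that schedule only; `resample_assignments` and `resample_theta` are placeholders for the collapsed conditional updates the paper derives, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(0)

N_ITER, BURN_IN, THIN = 2000, 500, 5   # schedule stated in the paper
SHAPE, RATE = 4.0, 2.0                 # gamma prior on theta

def run_gibbs(resample_assignments, resample_theta, state):
    """Run the paper's schedule: discard the first BURN_IN iterations,
    then keep every THIN-th state for the posterior summaries."""
    kept = []
    for it in range(N_ITER):
        state = resample_assignments(state)
        state = resample_theta(state)
        if it >= BURN_IN and (it - BURN_IN) % THIN == 0:
            kept.append(state)
    return kept

# Initial theta drawn from its Gamma prior (NumPy parameterizes the
# gamma distribution by shape and scale, so scale = 1 / rate).
theta0 = rng.gamma(SHAPE, 1.0 / RATE)
```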