The Dynamic Chinese Restaurant Process via Birth and Death Processes
Authors: Rui Huang, Fengyuan Zhu, Pheng-Ann Heng
AAAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also conduct simulation and empirical studies to compare this model with traditional CRP and related models. The results show that this model can provide better results for sequential data, especially for data with heterogeneous lifetime distribution. ... In this section, we perform a simulation study to demonstrate the application of the DCRP mixture model for modeling evolutionary clustered data. Also, we compare the experiment results with the following benchmark models: the traditional CRP (CRP), the recurrent CRP (rCRP) and the distance-dependent CRP (dd-CRP)... This section evaluates the proposed DCRP mixture model on two datasets and compares the results with the traditional CRP, the rCRP, the dd-CRP as well as two state-of-the-art dynamic topic models: the topic over time (TOT) (Wang and McCallum 2006) and the continuous time dynamic topic model (CTDTM) (Wang, Blei, and Heckerman 2012). |
| Researcher Affiliation | Academia | Rui Huang, Department of Statistics, The Chinese University of Hong Kong; Fengyuan Zhu and Pheng-Ann Heng, Department of Computer Science and Engineering, The Chinese University of Hong Kong |
| Pseudocode | No | The paper describes the steps of its Gibbs sampling algorithm in Section 2.4, but in narrative text rather than in a structured pseudocode block or a clearly labeled algorithm environment. |
| Open Source Code | No | The paper does not provide any concrete access to source code for the methodology described, nor does it include explicit statements about code availability or repository links. |
| Open Datasets | No | For the 'Twitter' dataset, the paper states 'We randomly select 10,000 Twitter users...', indicating a custom dataset with no public access information. For the 'NIPS' dataset, it states 'We use dataset of all NIPS papers from 1987 to 2003', which is a known collection of papers, but no specific citation, link, or repository for the curated dataset used in the experiments is provided, failing to meet the criteria for concrete access. |
| Dataset Splits | Yes | For each dataset, we remove the last 20 percent of data for cross validation. |
| Hardware Specification | No | The paper provides no hardware details for its experiments, neither specific ones (exact GPU/CPU models, memory amounts, cloud instance types) nor vague ones such as 'on a GPU'. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., specific library names like 'PyTorch 1.9' or solver versions like 'CPLEX 12.4'). |
| Experiment Setup | Yes | We use the same base distribution H for each model and the concentration parameter α of each model is set to be 2. ... In our Gibbs sampler we ran three chains with different initial values and a sample of size 2000 is collected after 1200 iterations (burn-in time). ... The burn-in time is set to be 1200 iterations and the sample size is 2000. |
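The reported setup (concentration parameter α = 2, three chains, a burn-in of 1200 iterations, 2000 collected samples) corresponds to a standard collapsed Gibbs sampler for a CRP mixture. A minimal sketch of such a sampler for a 1-D Gaussian base distribution is shown below; this is an illustration of the generic CRP Gibbs scheme, not the paper's DCRP algorithm, and the hyperparameters `sigma2`/`tau2` and the function name are assumptions for the example.

```python
import math
import random

ALPHA = 2.0         # concentration parameter, as reported in the paper
BURN_IN = 1200      # burn-in iterations, as reported
SAMPLE_SIZE = 2000  # posterior samples collected after burn-in

def crp_gibbs(data, alpha=ALPHA, iters=100, sigma2=1.0, tau2=5.0, seed=0):
    """Collapsed Gibbs sampling for a CRP mixture of 1-D Gaussians
    (known observation variance sigma2, N(0, tau2) prior on cluster means).
    Returns the final table (cluster) assignment for each data point."""
    rng = random.Random(seed)
    n = len(data)
    z = [0] * n                   # table assignment per customer
    tables = {0: list(range(n))}  # table label -> member indices

    def predictive(x, members):
        # Posterior predictive density of x at a table with the given members
        # (empty members => prior predictive, i.e. a brand-new table).
        m = len(members)
        s = sum(data[j] for j in members)
        post_var = 1.0 / (1.0 / tau2 + m / sigma2)
        post_mean = post_var * s / sigma2
        var = sigma2 + post_var
        return math.exp(-(x - post_mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

    for _ in range(iters):
        for i in range(n):
            # Remove customer i from its table; drop the table if now empty.
            tables[z[i]].remove(i)
            if not tables[z[i]]:
                del tables[z[i]]
            # CRP seating probabilities: existing tables weighted by size,
            # a new table weighted by alpha, each times the predictive density.
            labels, weights = [], []
            for t, members in tables.items():
                labels.append(t)
                weights.append(len(members) * predictive(data[i], members))
            new_t = max(tables, default=-1) + 1
            labels.append(new_t)
            weights.append(alpha * predictive(data[i], []))
            # Sample the new assignment proportionally to the weights.
            r = rng.random() * sum(weights)
            acc = 0.0
            for t, w in zip(labels, weights):
                acc += w
                if r <= acc:
                    z[i] = t
                    break
            tables.setdefault(z[i], []).append(i)
    return z
```

In the reported protocol, this inner loop would run for `BURN_IN + SAMPLE_SIZE` iterations per chain, with the first 1200 states discarded and the remaining 2000 kept as posterior samples, repeated across three chains with different initial values.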