Poisson-Gamma dynamical systems

Authors: Aaron Schein, Hanna Wallach, Mingyuan Zhou

NeurIPS 2016

Reproducibility assessment. Each entry below gives the reproducibility variable, the result, and the LLM response supporting it.
Research Type: Experimental. "Finally, we use the PGDS to analyze a diverse range of real-world data sets, showing that it exhibits excellent predictive performance on smoothing and forecasting tasks and infers interpretable latent structure, an example of which is depicted in figure 1. In this section, we compare the predictive performance of the PGDS to that of the LDS and that of gamma process dynamic Poisson factor analysis (GP-DPFA) [22]."
Researcher Affiliation: Collaboration. Aaron Schein, College of Information and Computer Sciences, University of Massachusetts Amherst, Amherst, MA 01003, aschein@cs.umass.edu; Mingyuan Zhou, McCombs School of Business, The University of Texas at Austin, Austin, TX 78712, mingyuan.zhou@mccombs.utexas.edu; Hanna Wallach, Microsoft Research New York, 641 Avenue of the Americas, New York, NY 10011, hanna@dirichlet.net.
Pseudocode: No. While Figure 2 provides an 'Alternative model specification' with a graphical representation of the generative process, the paper does not present structured pseudocode or a traditional algorithm block with numbered or bulleted steps (a sketch of the generative process is given below).
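For readers who want the generative process in executable form, the following is a minimal NumPy sketch of ancestral sampling from the PGDS: the latent loads theta_t are gamma-distributed with shape chained through a transition matrix Pi, and the counts are Poisson given delta_t * Phi @ theta_t. The dimensions, the Dirichlet concentrations, and the initialization of theta_1 are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not the paper's settings)
V, T, K = 50, 20, 5        # features, time steps, latent components
tau0 = 1.0                 # concentration hyperparameter (the paper sets tau0 = 1)

# Each column of Phi is a distribution over the V features;
# each column of Pi is a distribution over the K components.
Phi = rng.dirichlet(np.full(V, 0.1), size=K).T    # V x K factor loadings
Pi = rng.dirichlet(np.full(K, 0.1), size=K).T     # K x K transition matrix
delta = np.ones(T)                                # per-step scale factors

Theta = np.empty((T, K))
Theta[0] = rng.gamma(tau0, 1.0 / tau0, size=K)    # assumed initialization of theta_1
Y = np.empty((T, V), dtype=np.int64)
for t in range(T):
    if t > 0:
        # theta_t ~ Gamma(shape = tau0 * Pi @ theta_{t-1}, rate = tau0)
        Theta[t] = rng.gamma(tau0 * Pi @ Theta[t - 1], 1.0 / tau0)
    # y_t ~ Poisson(delta_t * Phi @ theta_t)
    Y[t] = rng.poisson(delta[t] * Phi @ Theta[t])
```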
Open Source Code: No. The paper does not provide an explicit statement or a link to open-source code for the methodology described.
Open Datasets: Yes. Global Database of Events, Language, and Tone (GDELT); Integrated Crisis Early Warning System (ICEWS); State-of-the-Union transcripts (SOTU); DBLP conference abstracts (DBLP), for which the authors used the subset of the corpus that Acharya et al. used to evaluate GP-DPFA [22]; and the NIPS corpus (NIPS), which contains the text of every NIPS conference paper from 1987 to 2003.
Dataset Splits: Yes. For each matrix, we created four masks indicating some randomly selected subset of columns to treat as held-out data. For the event count matrices, we held out six (noncontiguous) time steps between t = 2 and t = T - 3 to test the models' smoothing performance, as well as the last two time steps to test their forecasting performance. The other matrices have fewer time steps. For the SOTU matrix, we therefore held out five time steps between t = 2 and t = T - 2, as well as t = T. For the NIPS and DBLP matrices, which contain substantially fewer time steps than the SOTU matrix, we held out three time steps between t = 2 and t = T - 2, as well as t = T. (A sketch of this masking protocol follows below.)
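As a concrete illustration of the masking protocol for the event-count matrices, here is a small NumPy sketch that builds boolean time-step masks: six randomly chosen interior steps for smoothing and the last two steps for forecasting. The helper name and the exact random selection (which does not enforce noncontiguity) are assumptions for illustration, not the authors' code.

```python
import numpy as np

def make_time_masks(T, n_smooth=6, n_forecast=2, seed=0):
    # Interior candidates are t = 2 .. T-3 in the paper's 1-indexed notation,
    # i.e. zero-based indices 1 .. T-4. (Hypothetical helper; the paper's
    # masks select columns of each data matrix.)
    rng = np.random.default_rng(seed)
    smooth = np.zeros(T, dtype=bool)
    interior = np.arange(1, T - 3)
    smooth[rng.choice(interior, size=n_smooth, replace=False)] = True
    forecast = np.zeros(T, dtype=bool)
    forecast[-n_forecast:] = True        # last two steps held out for forecasting
    return smooth, forecast

smooth_mask, forecast_mask = make_time_masks(T=30)
```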
Hardware Specification: No. The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies: No. The paper states 'We used the pykalman Python library for the LDS' but does not specify a version number, and no other software dependencies with version numbers are provided (a usage sketch follows).
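Since pykalman is the one library named, a minimal sketch of how the LDS baseline could be driven through it may be useful. The toy data, the held-out mask, and the EM iteration count are assumptions; KalmanFilter, em, and smooth are the library's actual API, and pykalman handles missing observations via numpy masked arrays.

```python
import numpy as np
from pykalman import KalmanFilter

# Toy stand-ins for one count matrix and its held-out time steps
# (shapes and values are illustrative assumptions).
rng = np.random.default_rng(0)
T, V, K = 30, 12, 5
Y = rng.poisson(3.0, size=(T, V)).astype(float)
held_out = np.zeros(T, dtype=bool)
held_out[-2:] = True                      # e.g. the two forecasting steps

# pykalman treats masked entries of a numpy masked array as missing
# observations, so EM and smoothing skip the held-out steps.
Y_masked = np.ma.masked_array(Y, mask=np.tile(held_out[:, None], (1, V)))

kf = KalmanFilter(n_dim_state=K, n_dim_obs=V)
kf = kf.em(Y_masked, n_iter=10)           # EM to learn the LDS parameters
means, covs = kf.smooth(Y_masked)         # Kalman smoother over all T steps
# Reconstruct observations from the learned observation model
Y_hat = means @ kf.observation_matrices.T + kf.observation_offsets
```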
Experiment Setup: Yes. For the PGDS and GP-DPFA, we performed 6,000 Gibbs sampling iterations, imputing the missing counts from the smoothing columns at the same time as sampling the model parameters. We then discarded the first 4,000 samples and retained every hundredth sample thereafter. We used each of these samples to predict the missing counts from the forecasting columns, and then averaged the predictions over the samples. For the LDS, we ran EM to learn the model parameters; given these parameter values, we used the Kalman filter and smoother [1] to predict the held-out data. For the PGDS and GP-DPFA we used K = 100. For the PGDS, we set τ0 = 1, γ0 = 50, and η0 = ϵ0 = 0.1. We set the hyperparameters of GP-DPFA to the values used by Acharya et al. [22]. For the LDS, we used the default hyperparameters for pykalman and report results for the best-performing value of K ∈ {5, 10, 25, 50}. (The sampling schedule is sketched below.)
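The sampling schedule is easy to misread, so here is a short sketch of the burn-in, thinning, and prediction-averaging loop it describes: 6,000 iterations, the first 4,000 discarded, every 100th sample retained thereafter, leaving 20 samples whose predictions are averaged. The gibbs_step and predict_forecast functions are hypothetical stand-ins for the PGDS updates, which the paper derives but does not release as code.

```python
import numpy as np

rng = np.random.default_rng(0)
N_ITERS, BURN_IN, THIN = 6000, 4000, 100   # schedule described in the paper

def gibbs_step(state):
    # Hypothetical stand-in for one full sweep of the PGDS Gibbs updates
    # (the real sweep resamples Theta, Phi, Pi, delta, and the missing
    # smoothing counts); here we just perturb a dummy positive state.
    return np.abs(state + rng.normal(scale=0.01, size=state.shape))

def predict_forecast(state):
    # Hypothetical stand-in for predicting the held-out forecasting
    # counts from a single posterior sample.
    return state.sum()

state = rng.gamma(1.0, 1.0, size=10)
predictions = []
for it in range(1, N_ITERS + 1):
    state = gibbs_step(state)
    if it > BURN_IN and (it - BURN_IN) % THIN == 0:
        predictions.append(predict_forecast(state))    # 20 retained samples

forecast = np.mean(predictions)   # average the per-sample predictions
```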