Collapsed variational Bayes for Markov jump processes

Authors: Boqian Zhang, Jiangwei Pan, Vinayak A. Rao

NeurIPS 2017

Reproducibility assessment (variable, result, and supporting LLM response for each item):
Research Type: Experimental
LLM Response: "We apply our ideas to synthetic data as well as a dataset of check-in recordings, where we demonstrate superior performance over state-of-the-art MCMC methods." "We present qualitative and quantitative experiments using synthetic and real datasets to demonstrate the accuracy and efficiency of our variational Bayes (VB) algorithm." "Datasets. We use a dataset of check-in sequences from 8967 Four Square users in the year 2011."
Researcher Affiliation: Academia
LLM Response: "Jiangwei Pan, Department of Computer Science, Duke University (panjiangwei@gmail.com); Boqian Zhang, Department of Statistics, Purdue University (zhan1977@purdue.edu); Vinayak Rao, Department of Statistics, Purdue University (varao@purdue.edu)."
Pseudocode: Yes
LLM Response: The paper describes the Gillespie algorithm steps: "1. First, at time t = 0, sample an initial state s0 from π. 2. From here onwards, upon entering a new state i, sample the time of the next transition from an exponential with rate |A_ii|, and then a new state j ≠ i with probability proportional to A_ij." It also describes its variational inference algorithm in structured steps: "1) Updating q(U|T) = ∏_{t=1}^{|T|} q(u_t): Given a discretization T and an Ω, uniformization tells us that inference over U is just inference for a discrete-time hidden Markov model. ... 2) Updating q(T): We perform a greedy search over the space of time-discretizations by making local stochastic updates to the current T. Every iteration, we first scan the current T to find a beneficial merge... If no merge is found, we then try to find a beneficial split."
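The Gillespie steps quoted above map directly onto a short simulation routine. Below is a minimal sketch in Python; the function name and the example rate matrix are illustrative, not from the paper:

```python
import numpy as np

def gillespie_mjp(A, pi, t_end, rng=None):
    """Sample a Markov jump process trajectory on [0, t_end].

    A  : (n, n) rate matrix; off-diagonal A[i, j] >= 0, A[i, i] = -sum of row i's off-diagonals.
    pi : (n,) initial state distribution.
    Returns jump times and the states entered at those times.
    """
    if rng is None:
        rng = np.random.default_rng()
    n = A.shape[0]
    s = rng.choice(n, p=pi)                        # 1. at t = 0, sample s0 from pi
    t, times, states = 0.0, [0.0], [s]
    while True:
        t += rng.exponential(1.0 / abs(A[s, s]))   # 2. holding time ~ Exponential(|A_ss|)
        if t >= t_end:
            break
        rates = A[s].copy()                        # next state j != s, prob. proportional to A[s, j]
        rates[s] = 0.0
        s = rng.choice(n, p=rates / rates.sum())
        times.append(t)
        states.append(s)
    return times, states

# Illustrative usage with a 3-state rate matrix:
A = np.array([[-1.0, 0.6, 0.4],
              [0.5, -1.5, 1.0],
              [0.3, 0.7, -1.0]])
times, states = gillespie_mjp(A, pi=np.ones(3) / 3, t_end=10.0)
```

Sampled this way, the trajectory is piecewise constant: states[k] holds from times[k] until the next jump time.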
Open Source Code: No
LLM Response: No statement or link is provided indicating that the source code for the methodology is openly available.
Open Datasets: Yes
LLM Response: "We use a dataset of check-in sequences from 8967 Four Square users in the year 2011, originally collected by Gao et al. (2012) for studying location-based social networks."
Dataset Splits: No
LLM Response: The paper states: "We randomly select 100 test sequences, and randomly hold out half of the observations in each test sequence. The training data consists of the observations that are not held out, i.e., 100 full sequences and 100 half sequences." This specifies a train/test partition, but no separate validation split is described.
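As a concrete reading of the quoted split, here is a hedged sketch of the procedure; the sequences variable is a stand-in for the per-user check-in lists, and only the 100-sequence and hold-out-half figures come from the quote:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the 8967 per-user check-in sequences (illustrative data only).
sequences = [[f"venue_{k}" for k in range(20)] for _ in range(8967)]

# Randomly select 100 test sequences; hold out half of each one's observations.
test_ids = set(rng.choice(len(sequences), size=100, replace=False).tolist())
train, held_out = [], []
for i, seq in enumerate(sequences):
    if i in test_ids:
        hold = rng.permutation(len(seq)) < len(seq) // 2
        held_out.append([o for o, h in zip(seq, hold) if h])
        train.append([o for o, h in zip(seq, hold) if not h])
    else:
        train.append(list(seq))
```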
Hardware Specification: No
LLM Response: No specific hardware details (such as GPU/CPU models or cloud instance types) are provided.

Software Dependencies: No
LLM Response: The paper does not list software dependencies or version numbers.
Experiment Setup: Yes
LLM Response: "For VB on synthetic datasets we place a Gamma(20, 2) prior on Ω, and Dirichlet(2) priors on the transition probabilities and the observation probabilities, while on the check-in data, a Gamma(6, 1), a Dirichlet(0.1) and a Dirichlet(0.01) are placed. For MCMC on synthetic datasets, we place a Gamma(2, 0.2) and a Dirichlet(0.1) for the rate matrix, while on the check-in data, a Gamma(1, 1) and a Dirichlet(0.1) are placed. We run VB on the first synthetic dataset for 200 iterations, after which we use the posterior expected counts of observations in each state to infer the output emission probabilities. We run the VB algorithm on the check-in data using 50 states for 200 iterations."
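Read as configuration, the quoted hyperparameters map onto standard Gamma and Dirichlet draws. A minimal sketch of the synthetic-data VB priors follows; the shape/rate convention for the Gamma and the state and observation counts used here are assumptions, not stated in the quote:

```python
import numpy as np

rng = np.random.default_rng()
n_states, n_obs = 3, 5   # illustrative sizes; the check-in runs use 50 states

# Synthetic-data VB priors from the quote (Gamma taken in shape/rate form):
omega = rng.gamma(shape=20.0, scale=1.0 / 2.0)                 # Omega ~ Gamma(20, 2)
trans = rng.dirichlet(np.full(n_states, 2.0), size=n_states)   # transition rows ~ Dirichlet(2)
emit = rng.dirichlet(np.full(n_obs, 2.0), size=n_states)       # emission rows ~ Dirichlet(2)
```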