The Segmented iHMM: A Simple, Efficient Hierarchical Infinite HMM
Authors: Ardavan Saeedi, Matthew Hoffman, Matthew Johnson, Ryan Adams
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply the model to three different tasks: a novel task of segmenting traces of user behavior in software applications, automatic behavioral segmentation of fruit fly and biometric sensor data labeling. We empirically compare our model with two main baselines: 1) a two-level Bayesian nonparametric hierarchical HMM (HHMM) introduced by Johnson (2014) that models high-level dynamics as an infinite hidden semi-Markov model (HSMM) and sub-dynamics as an iHMM, and 2) the iHMM. In each of these tasks, we show that our model outperforms the nonparametric HHMM (despite being substantially simpler and faster) and the iHMM. |
| Researcher Affiliation | Collaboration | Ardavan Saeedi (ARDAVANS@MIT.EDU), Computer Science and Artificial Intelligence Laboratory, MIT; Matthew Hoffman (MATHOFFM@ADOBE.COM), Adobe Research; Matthew Johnson (MATTJJ@CSAIL.MIT.EDU), Harvard University; Ryan Adams (RPA@SEAS.HARVARD.EDU), Harvard University and Twitter |
| Pseudocode | No | The paper describes algorithms using mathematical equations and text, but it does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link regarding the public availability of its source code. |
| Open Datasets | Yes | For the fruit fly behavior segmentation, we use a dataset from Kain et al. (2013) ... We use a dataset with 12000 time steps, collected from a single user, and model the (normalized) observations (i.e., EDA, BVP and acceleration in 3 dimensions) by a multivariate Gaussian distribution. The hyperparameter setting is similar to that of Section 5.3. ... collected via Empatica E4 wristband (Empatica, 2015) |
| Dataset Splits | Yes | To choose among different hyperparameter settings, we use the variational lower bound (VLB) as our objective measure. ... Table 1. Datasets used for experiments (description in text) ... Synthetic 5e3 15% ... Users 1.4e4 10% ... Drosophila 1e4 15% ... Sensors 1.2e4 10% ... We randomly choose a sequence of size 1400 and form a held-out set. ... our held-out set is a randomly chosen subsequence with length 1500. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments. |
| Software Dependencies | No | The paper does not specify software dependencies with version numbers. |
| Experiment Setup | Yes | We run SVI with 10 different seeds for 100 iterations over a set of hyperparameters (see the supplementary material for all the settings which we considered). We choose a threshold of 0.5 for the posterior segmentation probability to identify a time point as a change point. For the observations, we use a multivariate Gaussian likelihood and a conjugate Normal/inverse-Wishart prior. In our iHMM experiments, we include a sticky self-transition bias 1. |
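
The change-point rule quoted in the Experiment Setup row (a threshold of 0.5 on the posterior segmentation probability) can be illustrated with a minimal sketch. This is not the authors' code, which the table notes is not publicly available. The function name `change_points`, the input array, and the choice of a strict `>` comparison are assumptions made here for illustration; the paper excerpt does not say how a probability of exactly 0.5 is treated.

```python
import numpy as np


def change_points(seg_prob, threshold=0.5):
    """Return the time indices flagged as change points.

    seg_prob: per-time-step posterior segmentation probabilities
              (hypothetical input; in the paper these come from inference).
    threshold: 0.5 matches the value quoted from the paper.
    Assumption: a strict '>' comparison is used at the boundary.
    """
    seg_prob = np.asarray(seg_prob, dtype=float)
    return np.flatnonzero(seg_prob > threshold)


# Made-up probabilities for six time steps, purely for illustration.
probs = [0.05, 0.10, 0.92, 0.30, 0.51, 0.49]
print(change_points(probs))  # [2 4]
```

With the made-up probabilities above, only time steps 2 and 4 exceed the threshold, so they are the ones reported as change points.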