reproducibilityindex.ai

The Segmented iHMM: A Simple, Efficient Hierarchical Infinite HMM

Authors: Ardavan Saeedi, Matthew Hoffman, Matthew Johnson, Ryan Adams

ICML 2016

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We apply the model to three different tasks: a novel task of segmenting traces of user behavior in software applications, automatic behavioral segmentation of fruit fly and biometric sensor data labeling. We empirically compare our model with two main baselines: 1) a two-level Bayesian nonparametric hierarchical HMM (HHMM) introduced by Johnson (2014) that models high-level dynamics as an infinite hidden semi-Markov model (HSMM) and sub-dynamics as an iHMM, and 2) the iHMM. In each of these tasks, we show that our model outperforms the nonparametric HHMM (despite being substantially simpler and faster) and the iHMM.
Researcher Affiliation	Collaboration	Ardavan Saeedi (ARDAVANS@MIT.EDU), Computer Science and Artificial Intelligence Laboratory, MIT; Matthew Hoffman (MATHOFFM@ADOBE.COM), Adobe Research; Matthew Johnson (MATTJJ@CSAIL.MIT.EDU), Harvard University; Ryan Adams (RPA@SEAS.HARVARD.EDU), Harvard University and Twitter
Pseudocode	No	The paper describes algorithms using mathematical equations and text, but it does not contain structured pseudocode or algorithm blocks.
Open Source Code	No	The paper does not provide any statement or link regarding the public availability of its source code.
Open Datasets	Yes	For the fruit fly behavior segmentation, we use a dataset from Kain et al. (2013) ... We use a dataset with 12000 time steps, collected from a single user, and model the (normalized) observations (i.e., EDA, BVP and acceleration in 3 dimensions) by a multivariate Gaussian distribution. The hyperparameter setting is similar to that of Section 5.3. ... collected via Empatica E4 wristband (Empatica, 2015)
Dataset Splits	Yes	To choose among different hyperparameter settings, we use the variational lower bound (VLB) as our objective measure. ... Table 1. Datasets used for experiments (description in text) ... Synthetic 5e3 15% ... Users 1.4e4 10% ... Drosophila 1e4 15% ... Sensors 1.2e4 10% ... We randomly choose a sequence of size 1400 and form a held-out set. ... our held-out set is a randomly chosen subsequence with length 1500.
Hardware Specification	No	The paper does not provide any specific details about the hardware used for running the experiments.
Software Dependencies	No	The paper does not specify software dependencies with version numbers.
Experiment Setup	Yes	We run SVI with 10 different seeds for 100 iterations over a set of hyperparameters (see the supplementary material for all the settings which we considered). We choose a threshold of 0.5 for the posterior segmentation probability to identify a time point as a change point. For the observations, we use a multivariate Gaussian likelihood and a conjugate Normal/inverse-Wishart prior. In our iHMM experiments, we include a sticky self-transition bias 1.
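
The excerpts above mention two mechanical steps: holding out a randomly chosen subsequence, and marking a time point as a change point when its posterior segmentation probability passes 0.5. The paper releases no code, so the following is only an illustrative sketch of how those steps might look. The function names `change_points` and `held_out_window`, the strict `>` comparison, the contiguous held-out window, and the example numbers are all assumptions, not the authors' implementation.

```python
import numpy as np


def change_points(seg_prob, threshold=0.5):
    """Indices where the posterior segmentation probability exceeds threshold.

    Assumption: the comparison is strict (>); the paper only states the
    threshold value of 0.5.
    """
    seg_prob = np.asarray(seg_prob, dtype=float)
    return np.flatnonzero(seg_prob > threshold)


def held_out_window(n_steps, length, rng):
    """Pick a random window [start, stop) of the given length to hold out.

    Assumption: the held-out subsequence is contiguous.
    """
    if length > n_steps:
        raise ValueError("held-out length exceeds sequence length")
    start = int(rng.integers(0, n_steps - length + 1))
    return start, start + length


# Hypothetical per-time-step posterior segmentation probabilities.
probs = [0.1, 0.7, 0.5, 0.95, 0.2]
print(change_points(probs))  # [1 3]

# Illustrative sizes only: a 1500-step window from a 10000-step sequence.
rng = np.random.default_rng(0)
start, stop = held_out_window(10_000, 1500, rng)
print(stop - start)  # 1500
```

Passing the random generator in explicitly keeps the split reproducible across the multiple seeds the setup describes.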