Efficient Learning of Continuous-Time Hidden Markov Models for Disease Progression

Authors: Yu-Ying Liu, Shuang Li, Fuxin Li, Le Song, James M. Rehg

NeurIPS 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the use of CT-HMMs with more than 100 states to visualize and predict disease progression using a glaucoma dataset and an Alzheimer's disease dataset. 5 Experimental results: We evaluated our EM algorithms in simulation (Sec. 5.1) and on two real-world datasets: a glaucoma dataset (Sec. 5.2), in which we compare our prediction performance to a state-of-the-art method, and a dataset for Alzheimer's disease (AD, Sec. 5.3), where we compare visualized progression trends to recent findings in the literature.
Researcher Affiliation | Academia | Yu-Ying Liu, Shuang Li, Fuxin Li, Le Song, and James M. Rehg, College of Computing, Georgia Institute of Technology, Atlanta, GA
Pseudocode | Yes | Algorithm 1: CT-HMM Parameter Learning (Soft/Hard); Algorithm 2: The Expm Algorithm for Computing End-State Conditioned Statistics (a minimal sketch of the expm construction follows after the table).
Open Source Code | No | The paper does not provide a direct link or an explicit statement about the availability of its source code.
Open Datasets | Yes | We evaluated our EM algorithms in simulation (Sec. 5.1) and on two real-world datasets: a glaucoma dataset (Sec. 5.2) ... and an Alzheimer's disease dataset (AD, Sec. 5.3). The Alzheimer's Disease Neuroimaging Initiative, http://adni.loni.usc.edu
Dataset Splits | No | The paper mentions synthetic data simulation and general use of a 'training set' and 'testing patient', but does not provide specific percentages, sample counts, or citations for train/validation/test splits on the real-world datasets.
Hardware Specification | No | On the glaucoma dataset from Section 5.2, using a model with 105 states, Soft Expm requires 18 minutes per iteration on a 2.67 GHz machine with unoptimized MATLAB code. This gives a CPU frequency but does not specify a make, model, or other hardware components such as GPU or RAM.
Software Dependencies | No | The paper mentions "unoptimized MATLAB code" but does not specify the version of MATLAB or any other software dependencies with version numbers.
Experiment Setup | Yes | We test the accuracy of all methods on a 5-state complete digraph with synthetic data generated under different noise levels. Each q_i is randomly drawn from [1, 5], and each q_ij is drawn from [0, 1] and renormalized so that Σ_{j≠i} q_ij = q_i. The state chains are generated from Q such that each chain has a total duration of around T = 100 / min_i q_i, where 1 / min_i q_i is the largest mean holding time. The data emission model for state i is set as N(i, σ²), where σ varies under different noise-level settings. The observations are then sampled from the state chains with rate 0.5 / max_i q_i, where 1 / max_i q_i is the smallest mean holding time, which should be dense enough to make the chain identifiable. A total of 10^5 observations are sampled. The convergence threshold is 10^-8 on the relative change in data likelihood. (A rough code reconstruction of this setup follows after the table.)
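
Algorithm 2 in the paper computes end-state conditioned statistics, e.g., the expected time spent in a state over an interval given the states observed at both endpoints, via a matrix-exponential construction. The snippet below is a minimal sketch of that idea using Van Loan's auxiliary-matrix trick with NumPy/SciPy; the function name, arguments, and normalization shown are our own illustration, not the authors' code.

```python
import numpy as np
from scipy.linalg import expm

def end_state_conditioned_duration(Q, t, i, k, l):
    """Expected time spent in state i during [0, t], conditioned on the
    chain starting in state k and ending in state l.

    Van Loan's construction: for A = [[Q, B], [0, Q]] with B = e_i e_i^T,
    the upper-right block of expm(A t) equals
    int_0^t expm(Q s) B expm(Q (t - s)) ds.
    Dividing its (k, l) entry by [expm(Q t)]_{k, l} gives the conditioned
    expectation.
    """
    n = Q.shape[0]
    B = np.zeros((n, n))
    B[i, i] = 1.0
    A = np.block([[Q, B], [np.zeros((n, n)), Q]])
    integral = expm(A * t)[:n, n:]   # upper-right n x n block
    P_t = expm(Q * t)                # transition probabilities over [0, t]
    return integral[k, l] / P_t[k, l]
```

Expected end-state conditioned transition counts follow the same pattern, with B replaced by q_ij e_i e_j^T.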
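
The synthetic setup quoted in the Experiment Setup row can be mirrored in a few lines. The sketch below is a rough reconstruction under our reading of that description (the variable names, the single noise level σ = 0.25, and the regular-grid sampling are our own choices): it draws a random 5-state generator matrix Q, simulates one continuous-time state chain, and samples Gaussian observations from it.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states = 5

# Generator matrix: q_i ~ U[1, 5]; off-diagonal rates q_ij ~ U[0, 1],
# renormalized so that sum_{j != i} q_ij = q_i, with Q_ii = -q_i.
q = rng.uniform(1, 5, size=n_states)
Q = rng.uniform(0, 1, size=(n_states, n_states))
np.fill_diagonal(Q, 0.0)
Q *= (q / Q.sum(axis=1))[:, None]
np.fill_diagonal(Q, -q)

# Simulate one chain of total duration ~ 100 / min_i q_i
# (i.e., 100 times the largest mean holding time).
T = 100.0 / q.min()
times, states = [0.0], [int(rng.integers(n_states))]
while times[-1] < T:
    i = states[-1]
    dwell = rng.exponential(1.0 / q[i])           # holding time in state i
    jump_probs = np.clip(Q[i], 0.0, None) / q[i]  # next-state distribution
    times.append(times[-1] + dwell)
    states.append(int(rng.choice(n_states, p=jump_probs)))

# Observations N(state, sigma^2) on a regular grid with spacing
# 0.5 / max_i q_i (half the smallest mean holding time).
sigma = 0.25
t_obs = np.arange(0.0, T, 0.5 / q.max())
idx = np.searchsorted(times, t_obs, side="right") - 1
obs = rng.normal(loc=np.asarray(states)[idx], scale=sigma)
```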