Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

State-by-state Minimax Adaptive Estimation for Nonparametric Hidden {M}arkov Models

Authors: Luc Lehéricy

JMLR 2018 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental In this paper, we introduce a new estimator for the emission densities of a nonparametric hidden Markov model. It is adaptive and minimax with respect to each state s regularity as opposed to globally minimax estimators, which adapt to the worst regularity among the emission densities. Our method is based on Goldenshluger and Lepski s methodology. It is computationally efficient and only requires a family of preliminary estimators, without any restriction on the type of estimators considered. We present two such estimators that allow to reach minimax rates up to a logarithmic term: a spectral estimator and a least squares estimator. We show how to calibrate it in practice and assess its performance on simulations and on real data.
Researcher Affiliation Academia Luc Leh ericy EMAIL Laboratoire de Math ematiques d Orsay Univ. Paris-Sud, CNRS, Universit e Paris-Saclay 91405 Orsay, France
Pseudocode Yes Algorithm 1: Spectral estimation of the emission densities of a HMM (short version) Algorithm 2: Least squares estimation of the emission densities of a HMM Algorithm 3: Spectral estimation of the emission densities of a HMM (full version)
Open Source Code No The paper does not provide an explicit statement about releasing source code or a link to a code repository for the methodology described. It mentions a third-party R implementation, but not the authors' own code release.
Open Datasets Yes In Section 5, we apply our algorithm to two sets of GPS tracks. The first data set contains trajectories of artisanal fishers from Madagascar... The second data set contains GPS positions of Peruvian seabird, recorded with 1 second time steps... We consider the seabird data from Bertrand et al. (2015) and we focus on the tracks named cormorant d in this paper.
Dataset Splits Yes We use 10-fold cross validation, that is we split the sequence into 10 segments of same size I1, . . . , I10. In order to avoid interferences between samples, we prune the ends of each segment, so that the observations in each segment can be considered independent. In practice, we take a gap of 30 observations between two segments.
Hardware Specification No The paper extensively discusses computational complexity and time taken for different parts of the algorithms (e.g., 'a few minutes' or 'a couple of seconds' for certain operations), but it does not specify any particular hardware components like CPU models, GPU models, or memory specifications used for these computations or experiments.
Software Dependencies No The paper mentions 'a R implementation of the spectral algorithm' but does not specify the version of R or any other software packages used for the experiments.
Experiment Setup Yes We take Y = [0, 1] equipped with the Lebesgue measure. We choose the approximation spaces spanned by a trigonometric basis: PM := Span(ϕ1, . . . , ϕM) with ϕ1(x) = 1 ϕ2m(x) = 2 cos(2πmx) ϕ2m+1(x) = 2 sin(2πmx) for all x [0, 1] and m N . We will consider a hidden Markov model with K = 3 hidden states and the following parameters: Transition matrix ... Emission densities ... We generate n observations and run the spectral algorithm in order to obtain estimators for the models PM with M_min M M_max, m = 20 and r = 2 log(n)+2 log(M) , where M_min = 3 and M_max = 300. Finally, we use the state-by-state selection method to choose the final estimator for each emission density. ... We made 300 simulations, 20 per value of n, with n taking values in {5 104, 7 104, 1 105, 1.5 105, 2.2 105, 3.5 105, 5 105, 7 105, 1 106, 1.5 106, 2.2 106, 3.5 106, 5 106, 7 106, 1 107}.