Learning Hidden Markov Models When the Locations of Missing Observations are Unknown
Authors: Binyamin Perets, Mark Kozdoba, Shie Mannor
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate and compare the algorithms in a variety of scenarios, measuring their reconstruction precision and robustness under model misspecification. Notably, we show that under proper specification one can reconstruct the process dynamics as well as if the missing observations' positions were known. |
| Researcher Affiliation | Academia | 1Technion Israel Institute of Technology, Haifa, Israel. |
| Pseudocode | Yes | Algorithm 1 Gibbs sampler given N [...] Algorithm 2 M-H sampler for W [...] Algorithm 3 Gibbs sampler given PC (an illustrative sketch of a generic Gibbs step of this kind appears below the table) |
| Open Source Code | Yes | Code. To the best of our knowledge, our implementation is the first publicly available Gibbs sampling-based HMM learning implementation for Python, and the first to handle non-ignorable missing observations in general. The code is provided in the supplementary material and will be made publicly available with the final version of the paper. |
| Open Datasets | Yes | 4. Experiments [...] The following models were used to generate the data (full details are given in Supplementary Material Section G): [...] 4. Part Of Speech process. Transitions and emissions (part of speech and words respectively) probabilities were extracted from the Brown NLP corpus (Francis, 1965). |
| Dataset Splits | No | The paper describes generating synthetic and semi-synthetic data for evaluation but does not specify train, validation, or test splits. The evaluation focuses on comparing reconstruction performance against ground truth or ideal benchmarks, rather than training/validation/testing a model on pre-divided datasets. |
| Hardware Specification | No | The paper does not provide specific details on the hardware used for running the experiments. It only mentions estimated computation times: 'the most computationally demanding experiment conducted in this paper [...] would take an estimated 8 minutes.' |
| Software Dependencies | No | The paper mentions 'Python' as the language for their implementation but does not specify a version. It also mentions the 'Pomegranate package (Schreiber, 2016)' as a comparison point but does not provide its version number or list it as a dependency with a specific version. |
| Experiment Setup | Yes | Unless specified otherwise, 1500 sentences (U) of length 80 (N) were sampled. [...] For each run, we measure the quality of the reconstruction by the L1 distance between the reconstructed and the ground truth transition matrices. [...] In this experiment, for each s we sample Ψ(s) ∼ Uniform[0.5 − ϵ, 0.5 + ϵ] for varying ϵ (X-axis). [...] Specifically, we randomly generate a single Ψ(s) ∼ Uniform[0.35, 0.65], and then delete a proportion, ϵ, of entries. [...] For the Gaps sampler, only one step of W sampling is required by design per X sample. [...] The section begins by addressing the initialization of the sampler for HMMOP reconstruction, proceeds to describe the known-location Gibbs sampler while assuming W is known, and finally describes the process of sampling W in more detail. (A sketch of the reported L1 metric and Ψ sampling appears below the table.) |
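The pseudocode row above carries only the titles of the paper's three samplers. For orientation, here is a minimal Python sketch of the two updates a basic HMM Gibbs sampler alternates between: resampling one hidden state given its neighbours, and redrawing transition-matrix rows from their Dirichlet posteriors. This is a generic illustration, not the paper's Algorithms 1-3; the function names, the Dirichlet(α) prior, and the single-site update scheme are our assumptions, and the sketch omits the paper's key step of sampling the unknown missing-observation locations W.

```python
import numpy as np

rng = np.random.default_rng(0)

def resample_state(t, states, obs, A, B):
    """Single-site Gibbs update for an interior hidden state:
    p(x_t | x_{t-1}, x_{t+1}, y_t) is proportional to
    A[x_{t-1}, x_t] * B[x_t, y_t] * A[x_t, x_{t+1}]."""
    p = A[states[t - 1], :] * B[:, obs[t]] * A[:, states[t + 1]]
    p /= p.sum()
    return rng.choice(len(p), p=p)

def resample_transitions(states, n_hidden, alpha=1.0):
    """Redraw each transition-matrix row from its Dirichlet posterior,
    given transition counts along the sampled hidden path
    (conjugate Dirichlet(alpha) prior per row -- our assumption)."""
    counts = np.zeros((n_hidden, n_hidden))
    for s, s_next in zip(states[:-1], states[1:]):
        counts[s, s_next] += 1
    return np.stack([rng.dirichlet(alpha + row) for row in counts])
```

Alternating these two updates over all interior positions and parameters gives one Gibbs sweep; the paper's samplers interleave an additional M-H step for W on top of this structure.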
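The experiment-setup row quotes two concrete quantities: the L1 reconstruction error between transition matrices, and per-state deletion probabilities Ψ(s) ∼ Uniform[0.5 − ϵ, 0.5 + ϵ]. A minimal NumPy sketch of both, with function names of our own choosing:

```python
import numpy as np

rng = np.random.default_rng(0)

def l1_reconstruction_error(A_hat, A_true):
    """Quality of a run: entrywise L1 distance between the
    reconstructed and ground-truth transition matrices."""
    return float(np.abs(A_hat - A_true).sum())

def sample_psi(n_states, eps):
    """Per-state probabilities Psi(s) ~ Uniform[0.5 - eps, 0.5 + eps]
    for the misspecification experiment (eps is the X-axis value)."""
    return rng.uniform(0.5 - eps, 0.5 + eps, size=n_states)
```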