Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Global Analysis of Expectation Maximization for Mixtures of Two Gaussians
Authors: Ji Xu, Daniel J. Hsu, Arian Maleki
NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | This article addresses this disconnect between the statistical principles behind EM and its algorithmic properties. Specifically, it provides a global analysis of EM for specific models in which the observations comprise an i.i.d. sample from a mixture of two Gaussians. This is achieved by (i) studying the sequence of parameters from idealized execution of EM in the infinite sample limit, and fully characterizing the limit points of the sequence in terms of the initial parameters; and then (ii) based on this convergence analysis, establishing statistical consistency (or lack thereof) for the actual sequence of parameters produced by EM. Our main contribution in this paper is a new characterization of the stationary points and dynamics of EM in both of the above models. |
| Researcher Affiliation | Academia | Ji Xu Columbia University EMAIL Daniel Hsu Columbia University EMAIL Arian Maleki Columbia University EMAIL |
| Pseudocode | No | The paper describes algorithmic steps for EM, but it does not include a clearly labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | No | The paper does not provide any explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | No | The paper refers to 'i.i.d. sample from a mixture of two Gaussians' as part of its theoretical model, but does not specify a concrete, publicly available dataset with access information (link, DOI, citation). |
| Dataset Splits | No | As a theoretical paper, there is no mention of training, validation, or test dataset splits for empirical experimentation. |
| Hardware Specification | No | The paper is theoretical and does not mention any specific hardware used for running experiments. |
| Software Dependencies | No | The paper is theoretical and does not specify any software dependencies with version numbers. |
| Experiment Setup | No | The paper focuses on theoretical analysis and does not describe an experimental setup with specific hyperparameters or system-level training settings. |
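Although the paper itself contains no pseudocode or code block, the EM iteration it analyzes has a well-known closed form in the symmetric two-Gaussian setting. The sketch below is illustrative only and is not taken from the paper: it assumes a balanced 1-D mixture 0.5·N(θ, 1) + 0.5·N(−θ, 1) with known unit variance, for which the M-step reduces to θ ← mean(x · tanh(θx)).

```python
import math
import random

def em_symmetric_2gmm(xs, theta0, iters=200):
    """EM for the balanced mixture 0.5*N(theta, 1) + 0.5*N(-theta, 1).

    The E-step posterior for the +theta component is
    sigmoid(2*theta*x), and since 2*sigmoid(2*theta*x) - 1 =
    tanh(theta*x), the M-step collapses to a one-line update.
    """
    theta = theta0
    for _ in range(iters):
        theta = sum(x * math.tanh(theta * x) for x in xs) / len(xs)
    return theta

# Simulate a sample from the assumed model with true theta = 2.0.
random.seed(0)
true_theta = 2.0
xs = [random.gauss(true_theta if random.random() < 0.5 else -true_theta, 1.0)
      for _ in range(5000)]

est = em_symmetric_2gmm(xs, theta0=0.5)
```

A positive initialization converges toward +θ and a negative one toward −θ, consistent with the paper's characterization of EM's limit points in terms of the initial parameters.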