Benefits of over-parameterization with EM
Authors: Ji Xu, Daniel J. Hsu, Arian Maleki
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The goal of this article is to present theoretical and empirical evidence that over-parameterization can help EM avoid spurious local optima in the log-likelihood. ... For other Gaussian mixtures, we provide empirical evidence that shows similar behavior. ... In this section, we present numerical results that show the value of over-parameterization in some mixture models not covered by our theoretical results. |
| Researcher Affiliation | Academia | Ji Xu Columbia University jixu@cs.columbia.edu Daniel Hsu Columbia University djhsu@cs.columbia.edu Arian Maleki Columbia University arian@stat.columbia.edu |
| Pseudocode | No | The paper describes algorithms using mathematical equations (e.g., equations 3, 4, 5, 6, 7) but does not contain a clearly labeled 'Pseudocode' or 'Algorithm' block with structured steps. |
| Open Source Code | No | The paper does not provide any concrete access information (e.g., specific repository link, explicit code release statement, or code in supplementary materials) for the source code of the described methodology. |
| Open Datasets | No | The paper uses synthetic data generated from Gaussian mixture models as described in equation (2) (e.g., 'y1, ..., yn comprise an i.i.d. sample from a mixture of k Gaussians'). No specific, publicly available, or open dataset with access information (link, DOI, formal citation) is mentioned. (A data-generation sketch follows the table.) |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) for training, validation, or testing. |
| Hardware Specification | No | The paper does not provide any specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers like Python 3.8, CPLEX 12.4) needed to replicate the experiment. |
| Experiment Setup | Yes | For each case, we run EM with 2500 random initializations and compute the empirical probability of success. When n = 1000, the initial mean parameter is chosen uniformly at random from the sample. When n = ∞, the initial mean parameter is chosen uniformly at random from the rectangle [-2, +2] × [-2, +2]. Specific configurations for 'separation' and 'mixing weight' are given (e.g., 'separation |µ2-µ1| ∈ {1, 2, 4}', 'mixing weight w1 ∈ {0.52, 0.7, 0.9}'). (An EM-restart sketch follows the table.) |
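
The synthetic data noted in the 'Open Datasets' row is specified only through equation (2) of the paper. The sketch below is a minimal reconstruction of such a sample generator, assuming a two-dimensional, two-component Gaussian mixture with identity covariance; the particular means, mixing weight, and sample size are illustrative choices consistent with the configurations quoted in the 'Experiment Setup' row, not values fixed by the paper.

```python
import numpy as np

def sample_two_gaussian_mixture(n, mu1, mu2, w1, rng):
    """Draw an i.i.d. sample from w1 * N(mu1, I) + (1 - w1) * N(mu2, I).

    Hypothetical reconstruction of the synthetic data in equation (2);
    the component parameters here are illustrative, not taken from the paper.
    """
    mu1, mu2 = np.asarray(mu1, dtype=float), np.asarray(mu2, dtype=float)
    labels = rng.random(n) < w1                      # latent component assignments
    means = np.where(labels[:, None], mu1, mu2)      # per-sample component mean
    return means + rng.standard_normal((n, mu1.size))

rng = np.random.default_rng(0)
# e.g., separation |mu2 - mu1| = 2 and mixing weight w1 = 0.7
y = sample_two_gaussian_mixture(1000, mu1=[-1.0, 0.0], mu2=[1.0, 0.0], w1=0.7, rng=rng)
```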
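
The 'Experiment Setup' row quotes the protocol of running EM from 2500 random initializations and reporting the empirical probability of success. The sketch below illustrates that protocol for the finite-sample case (n = 1000), assuming a textbook EM update for a two-component Gaussian mixture with identity covariance and a simple distance-to-truth success criterion; `em_two_components`, the tolerance `tol`, and the success test are stand-ins, not the paper's equations (3)-(7) or its actual convergence check.

```python
import numpy as np

def em_two_components(y, mu_init, w_init=0.5, n_iter=200):
    """EM for the mixture w * N(mu1, I) + (1 - w) * N(mu2, I) fit to data y (n x d).

    Hypothetical stand-in for the paper's update equations, not a reproduction.
    """
    mu1 = np.asarray(mu_init[0], dtype=float)
    mu2 = np.asarray(mu_init[1], dtype=float)
    w = w_init
    for _ in range(n_iter):
        # E-step: posterior responsibility of component 1 for each sample.
        logp1 = np.log(w) - 0.5 * np.sum((y - mu1) ** 2, axis=1)
        logp2 = np.log(1.0 - w) - 0.5 * np.sum((y - mu2) ** 2, axis=1)
        r = 1.0 / (1.0 + np.exp(logp2 - logp1))
        # M-step: re-estimate the mixing weight and both component means.
        w = np.clip(r.mean(), 1e-10, 1.0 - 1e-10)
        mu1 = (r[:, None] * y).sum(axis=0) / r.sum()
        mu2 = ((1.0 - r)[:, None] * y).sum(axis=0) / (1.0 - r).sum()
    return w, mu1, mu2

def empirical_success_probability(y, mu_true, n_restarts=2500, tol=0.5, rng=None):
    """Fraction of random restarts whose fitted means land near the true means.

    The distance threshold `tol` is an illustrative success criterion,
    not the criterion used in the paper.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    successes = 0
    for _ in range(n_restarts):
        # n = 1000 case: initial means drawn uniformly at random from the sample.
        init = y[rng.integers(len(y), size=2)]
        _, m1, m2 = em_two_components(y, mu_init=init)
        # Success if the fitted means match the true means up to relabeling.
        err = min(np.linalg.norm(m1 - mu_true[0]) + np.linalg.norm(m2 - mu_true[1]),
                  np.linalg.norm(m1 - mu_true[1]) + np.linalg.norm(m2 - mu_true[0]))
        successes += err < tol
    return successes / n_restarts

# e.g., with `y` from the data-generation sketch above:
# p_hat = empirical_success_probability(y, mu_true=np.array([[-1.0, 0.0], [1.0, 0.0]]))
```

For n = ∞ the paper instead draws the initial mean uniformly from the rectangle [-2, +2] × [-2, +2] and runs population (infinite-sample) EM, which this finite-sample sketch does not model.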