Benefits of over-parameterization with EM

Authors: Ji Xu, Daniel J. Hsu, Arian Maleki

NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The goal of this article is to present theoretical and empirical evidence that over-parameterization can help EM avoid spurious local optima in the log-likelihood. ... For other Gaussian mixtures, we provide empirical evidence that shows similar behavior. ... In this section, we present numerical results that show the value of over-parameterization in some mixture models not covered by our theoretical results.
Researcher Affiliation | Academia | Ji Xu (Columbia University, jixu@cs.columbia.edu); Daniel Hsu (Columbia University, djhsu@cs.columbia.edu); Arian Maleki (Columbia University, arian@stat.columbia.edu)
Pseudocode | No | The paper describes algorithms using mathematical equations (e.g., equations 3, 4, 5, 6, 7) but does not contain a clearly labeled 'Pseudocode' or 'Algorithm' block with structured steps.
Open Source Code | No | The paper does not provide any concrete access information (e.g., specific repository link, explicit code release statement, or code in supplementary materials) for the source code of the described methodology.
Open Datasets | No | The paper uses synthetic data generated from Gaussian mixture models as described in equation (2) (e.g., 'y1, ..., yn comprise an i.i.d. sample from a mixture of k Gaussians'); a data-generation sketch is given after this table. No specific, publicly available, or open dataset with access information (link, DOI, formal citation) is mentioned.
Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) for training, validation, or testing.
Hardware Specification | No | The paper does not provide any specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers, such as Python 3.8 or CPLEX 12.4) needed to replicate the experiments.
Experiment Setup | Yes | For each case, we run EM with 2500 random initializations and compute the empirical probability of success. When n = 1000, the initial mean parameter is chosen uniformly at random from the sample. When n = ∞, the initial mean parameter is chosen uniformly at random from the rectangle [-2, +2] × [-2, +2]. Specific configurations for 'separation' and 'mixing weight' are given (e.g., 'separation |µ2 − µ1| ∈ {1, 2, 4}', 'mixing weight w1 ∈ {0.52, 0.7, 0.9}').
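The 'Open Datasets' row notes that the experiments use synthetic draws from a two-component Gaussian mixture as in the paper's equation (2). The snippet below is a minimal sketch of how such a sample could be generated; it is not the authors' code, and the dimension, separation, and mixing weight shown are illustrative values taken from the quoted setup.

```python
import numpy as np

def sample_gmm(n, mu1, mu2, w1, rng):
    """Draw n i.i.d. points from the mixture w1*N(mu1, I) + (1 - w1)*N(mu2, I)."""
    mu1, mu2 = np.asarray(mu1, dtype=float), np.asarray(mu2, dtype=float)
    labels = rng.random(n) < w1                      # latent component indicators
    means = np.where(labels[:, None], mu1, mu2)      # per-point component mean
    return means + rng.standard_normal((n, mu1.size))

rng = np.random.default_rng(0)
# Illustrative configuration: separation |mu2 - mu1| = 2, mixing weight w1 = 0.7, n = 1000.
y = sample_gmm(1000, mu1=[-1.0, 0.0], mu2=[1.0, 0.0], w1=0.7, rng=rng)
```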
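To make the 'Experiment Setup' row concrete, the sketch below runs EM for a two-component spherical Gaussian mixture, with both means and the mixing weight treated as free parameters, from many random initializations and reports an empirical success rate. It is a generic sketch of this kind of experiment rather than the authors' parameterization or implementation: the iteration count, the n = 1000 style initialization (means drawn uniformly from the sample), and the success criterion (recovered means landing within a tolerance of the true means) are assumptions made for illustration.

```python
import numpy as np

def em_two_gaussians(y, mu_init, w_init=0.5, iters=200):
    """EM for a two-component spherical Gaussian mixture with unknown means
    and an unknown mixing weight (all fitted freely)."""
    mu1, mu2 = np.array(mu_init[0], dtype=float), np.array(mu_init[1], dtype=float)
    w = w_init
    for _ in range(iters):
        # E-step: posterior responsibility of component 1 for every point.
        d1 = np.sum((y - mu1) ** 2, axis=1)
        d2 = np.sum((y - mu2) ** 2, axis=1)
        logit = np.log(w) - np.log(1.0 - w) - 0.5 * (d1 - d2)
        r = 1.0 / (1.0 + np.exp(np.clip(-logit, -700, 700)))
        # M-step: re-estimate the mixing weight and the two means.
        w = np.clip(r.mean(), 1e-12, 1.0 - 1e-12)
        mu1 = (r[:, None] * y).sum(axis=0) / r.sum()
        mu2 = ((1.0 - r)[:, None] * y).sum(axis=0) / (1.0 - r).sum()
    return mu1, mu2, w

def success_rate(y, mu_star, trials=2500, tol=0.5, seed=0):
    """Empirical probability that EM, started from means drawn uniformly from the
    sample, ends up near the true means (the tolerance is an assumed criterion)."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(trials):
        i, j = rng.choice(len(y), size=2, replace=False)
        mu1, mu2, _ = em_two_gaussians(y, (y[i], y[j]))
        # Account for label switching by taking the better of the two matchings.
        err = min(np.linalg.norm(mu1 - mu_star[0]) + np.linalg.norm(mu2 - mu_star[1]),
                  np.linalg.norm(mu1 - mu_star[1]) + np.linalg.norm(mu2 - mu_star[0]))
        hits += err < tol
    return hits / trials

# Example use with the sample y from the previous sketch:
# mu_star = np.array([[-1.0, 0.0], [1.0, 0.0]])
# print(success_rate(y, mu_star, trials=100))   # fewer than 2500 trials for a quick check
```

With a sample generated as in the previous sketch, `success_rate(y, mu_star)` estimates the quantity the paper reports as the empirical probability of success over random initializations.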