Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Convergence Rates for Gaussian Mixtures of Experts

Authors: Nhat Ho, Chiao-Yu Yang, Michael I. Jordan

JMLR 2022 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We provide a theoretical treatment of over-specified Gaussian mixtures of experts with covariate-free gating networks. We establish the convergence rates of the maximum likelihood estimation (MLE) for these models. Our proof technique is based on a novel notion of algebraic independence of the expert functions. Drawing on optimal transport, we establish a connection between the algebraic independence of the expert functions and a certain class of partial differential equations (PDEs) with respect to the parameters. Exploiting this connection allows us to derive convergence rates for parameter estimation. (An illustrative sketch of this setting appears after the table.)
Researcher Affiliation | Academia | Nhat Ho (EMAIL), Division of Statistics and Data Sciences, University of Texas, Austin, TX 78712, USA; Chiao-Yu Yang (EMAIL), Department of Statistics, University of California, Berkeley, CA 94720-1776, USA; Michael I. Jordan (EMAIL), Division of Computer Science and Department of Statistics, University of California, Berkeley, CA 94720-1776, USA
Pseudocode | No | The paper focuses on theoretical derivations and proofs of convergence rates for Gaussian mixtures of experts. It does not present any algorithms or procedures in pseudocode or another structured block format.
Open Source Code | No | The paper does not contain any explicit statement about releasing source code, nor does it link to a code repository.
Open Datasets | No | The paper is theoretical and does not describe any experiments that would involve the use or release of datasets.
Dataset Splits | No | The paper is theoretical and does not conduct experiments with datasets; therefore, no dataset splits are mentioned.
Hardware Specification | No | The paper presents theoretical research and does not describe any experimental setup or hardware used for computations.
Software Dependencies | No | The paper focuses on theoretical analysis and does not mention any specific software dependencies or versions for an experimental implementation.
Experiment Setup | No | The paper provides a theoretical treatment of Gaussian mixtures of experts and does not describe any experimental setup, hyperparameters, or training configurations.
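
The Research Type response above compresses the paper's abstract. As a hedged illustration of the setting it describes, the sketch below writes out, in generic notation, a Gaussian mixture of experts with a covariate-free gating network, the over-specified MLE, and the kind of parameter PDE the response alludes to. The symbols G, pi_i, h, theta_i, sigma_i, and W_r are illustrative stand-ins, not necessarily the paper's notation.

    % Hedged sketch (illustrative notation, not the paper's exact statement).
    % Covariate-free gating: the mixture weights \pi_i do not depend on x.
    \[
      p_G(y \mid x) \;=\; \sum_{i=1}^{k} \pi_i \,
      \mathcal{N}\!\bigl(y \mid h(x, \theta_i),\, \sigma_i^{2}\bigr),
      \qquad
      G \;=\; \sum_{i=1}^{k} \pi_i \, \delta_{(\theta_i, \sigma_i)} .
    \]
    % MLE over mixing measures with at most k atoms, where k may exceed
    % the true number of components (the over-specified regime):
    \[
      \widehat{G}_n \;\in\; \arg\max_{G} \; \frac{1}{n} \sum_{j=1}^{n}
      \log p_G\bigl(y_j \mid x_j\bigr) .
    \]
    % A classical example of a "PDE with respect to the parameters" for the
    % Gaussian density f(y; \mu, v) with mean \mu and variance v is the
    % heat-equation identity, which underlies the slow rates observed in
    % over-specified Gaussian mixtures:
    \[
      \frac{\partial^{2} f}{\partial \mu^{2}}(y; \mu, v)
      \;=\; 2\, \frac{\partial f}{\partial v}(y; \mu, v) .
    \]

Results of this kind are typically stated as bounds on a Wasserstein-type distance W_r(\widehat{G}_n, G_0) between the estimated and true mixing measures, with the attainable exponent governed by the algebraic independence of the expert functions h(., \theta); the exact loss functions and rates are given in the paper itself.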