Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
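As a rough illustration of the kind of validation the notice refers to, the sketch below compares automated labels for a single reproducibility variable against manual annotations and reports simple agreement metrics. The file names, field names, and label values are assumptions made for this sketch; the actual pipeline and its accuracy metrics are the ones described in [1].

```python
# Hypothetical sketch: comparing automated LLM labels against manual labels
# for one reproducibility variable (e.g. "Open Source Code": yes/no).
# File names and the field "open_source_code" are placeholders, not the
# report's actual data format.
import json
from sklearn.metrics import accuracy_score, cohen_kappa_score

def load_labels(path):
    """Load {paper_id: label} from a JSON file assumed to look like
    [{"paper_id": "...", "open_source_code": "yes"}, ...]."""
    with open(path) as f:
        records = json.load(f)
    return {r["paper_id"]: r["open_source_code"] for r in records}

manual = load_labels("manual_labels.json")    # hand-annotated ground truth
automated = load_labels("llm_labels.json")    # LLM pipeline output

# Align on papers labeled by both sources, then compute agreement metrics.
common = sorted(manual.keys() & automated.keys())
y_true = [manual[p] for p in common]
y_pred = [automated[p] for p in common]

print(f"papers compared: {len(common)}")
print(f"accuracy:        {accuracy_score(y_true, y_pred):.3f}")
print(f"Cohen's kappa:   {cohen_kappa_score(y_true, y_pred):.3f}")
```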
Convergence Rates for Gaussian Mixtures of Experts
Authors: Nhat Ho, Chiao-Yu Yang, Michael I. Jordan
JMLR 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We provide a theoretical treatment of over-specified Gaussian mixtures of experts with covariate-free gating networks. We establish the convergence rates of the maximum likelihood estimation (MLE) for these models. Our proof technique is based on a novel notion of algebraic independence of the expert functions. Drawing on optimal transport, we establish a connection between the algebraic independence of the expert functions and a certain class of partial differential equations (PDEs) with respect to the parameters. Exploiting this connection allows us to derive convergence rates for parameter estimation. |
| Researcher Affiliation | Academia | Nhat Ho (EMAIL), Division of Statistics and Data Sciences, University of Texas, Austin, TX 78712, USA; Chiao-Yu Yang (EMAIL), Department of Statistics, University of California, Berkeley, CA 94720-1776, USA; Michael I. Jordan (EMAIL), Division of Computer Science and Department of Statistics, University of California, Berkeley, CA 94720-1776, USA |
| Pseudocode | No | The paper focuses on theoretical derivations and proofs of convergence rates for Gaussian mixtures of experts. It does not present any algorithms or procedures in a pseudocode or structured block format. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code or provide links to a code repository. |
| Open Datasets | No | The paper is theoretical and does not describe any experiments that would involve the use or release of datasets. |
| Dataset Splits | No | The paper is theoretical and does not conduct experiments with datasets; therefore, no dataset splits are mentioned. |
| Hardware Specification | No | The paper presents theoretical research and does not describe any experimental setup or hardware used for computations. |
| Software Dependencies | No | The paper focuses on theoretical analysis and does not mention any specific software dependencies or versions for experimental implementation. |
| Experiment Setup | No | The paper provides a theoretical treatment of Gaussian mixtures of experts and does not describe any experimental setup, hyperparameters, or training configurations. |
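For context on the setting quoted in the Research Type row, a Gaussian mixture of experts with a covariate-free gating network can be written roughly as below. The notation (mixing weights $\pi_i$, expert mean functions $h(\cdot, \eta_i)$, variances $\sigma_i^2$) is generic and chosen for this sketch rather than copied from the paper.

```latex
% Sketch of a k-component Gaussian mixture of experts with covariate-free gating:
% the mixing weights \pi_i do not depend on the covariate x, while each expert
% contributes a Gaussian whose mean is an expert function of x.
\[
  p_G(y \mid x) \;=\; \sum_{i=1}^{k} \pi_i \,
      \mathcal{N}\!\bigl(y \;\big|\; h(x, \eta_i),\, \sigma_i^2\bigr),
  \qquad \pi_i \ge 0, \quad \sum_{i=1}^{k} \pi_i = 1.
\]
% "Over-specified" refers to fitting more components k than the true model
% requires; the paper's convergence rates concern MLE parameter estimation
% in that regime.
```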