Agnostic Learning of Mixed Linear Regressions with EM and AM Algorithms
Authors: Avishek Ghosh, Arya Mazumdar
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we consider the more general problem of agnostic learning of mixed linear regression from samples, without such generative models. In particular, we show that the AM and EM algorithms, under standard conditions of separability and good initialization, lead to agnostic learning in mixed linear regression by converging to the population loss minimizers, for suitably defined loss functions. In some sense, this shows the strength of the AM and EM algorithms, which converge to optimal solutions even in the absence of realizable generative models. |
| Researcher Affiliation | Academia | Avishek Ghosh¹, Arya Mazumdar². ¹Systems and Control Engg. and Centre for Machine Intelligence for Data Sciences, Indian Institute of Technology, Bombay, India. ²Halıcıoğlu Data Science Institute, University of California, San Diego. Correspondence to: Avishek Ghosh <avishek.ghosh38@gmail.com>. |
| Pseudocode | Yes | Algorithm 1: Gradient AM for Mixture of Linear Regressions; Algorithm 2: Gradient EM for Mixture of Linear Regressions. (Illustrative sketches of both algorithms appear after this table.) |
| Open Source Code | No | The paper does not provide any statement or link indicating that source code for their methodology is open-source or publicly available. |
| Open Datasets | No | The paper refers to theoretical data distributions D and discusses properties such as x_i drawn i.i.d. from N(0, I_d) for analysis, but it does not specify or provide access information for any publicly available datasets used for empirical training. |
| Dataset Splits | No | The paper mentions 'splitting the n samples {(x_i, y_i)}_{i=1}^n into 2T disjoint samples' as a theoretical technique for analysis ('sample splitting is a standard in AM type algorithms'), not as a description of empirical dataset splits for validation or training. It does not provide specific percentages or counts for training, validation, or test sets. |
| Hardware Specification | No | As a theoretical paper focusing on algorithm analysis, it does not describe any specific hardware used for running experiments. |
| Software Dependencies | No | As a theoretical paper focusing on algorithm analysis, it does not describe any specific software dependencies with version numbers. |
| Experiment Setup | No | The paper analyzes theoretical properties of the algorithms (e.g., step size γ, initialization parameter c_ini) but does not provide specific details about an empirical experimental setup, such as hyperparameter values or training configurations for a practical implementation. |
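
The paper provides only pseudocode, so the following is a minimal NumPy sketch of a gradient AM iteration in the spirit of Algorithm 1, not the authors' implementation. The function name `gradient_am`, the squared-residual assignment rule, and the per-iteration disjoint batches (mirroring the 2T-way sample split quoted above) are illustrative assumptions; `step_size` stands in for the paper's γ, and a good initialization `Theta0` (the paper's c_ini condition) is taken as given.

```python
import numpy as np

def gradient_am(X, y, Theta0, step_size, n_iters):
    """Sketch of gradient AM for a k-component mixed linear regression.

    X: (n, d) covariates; y: (n,) responses; Theta0: (k, d) initial
    parameter guesses.  Each iteration consumes a fresh disjoint batch
    of samples (sample splitting, as quoted above).  All names are
    illustrative, not taken from the paper's pseudocode.
    """
    n, _ = X.shape
    Theta = Theta0.copy()
    batches = np.array_split(np.arange(n), n_iters)  # disjoint batches
    for t in range(n_iters):
        Xb, yb = X[batches[t]], y[batches[t]]
        # Assignment (A) step: each sample joins the component whose
        # current parameter gives the smallest squared residual.
        residuals = (yb[:, None] - Xb @ Theta.T) ** 2   # shape (m, k)
        labels = residuals.argmin(axis=1)
        # Minimization (M) step, gradient version: one gradient step per
        # component on the squared loss over its assigned samples.
        for j in range(Theta.shape[0]):
            mask = labels == j
            if mask.any():
                grad = Xb[mask].T @ (Xb[mask] @ Theta[j] - yb[mask]) / mask.sum()
                Theta[j] -= step_size * grad
    return Theta
```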
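
Similarly, a minimal sketch of a gradient EM iteration in the spirit of Algorithm 2, assuming Gaussian responsibilities with a fixed noise level `sigma` and uniform mixing weights; neither value comes from the paper.

```python
import numpy as np

def gradient_em(X, y, Theta0, step_size, n_iters, sigma=1.0):
    """Sketch of gradient EM for a k-component mixed linear regression.

    Responsibilities come from Gaussian likelihoods with an assumed
    noise level sigma and uniform mixing weights (neither is specified
    in the excerpts above); the M step is a single gradient step on the
    weighted squared loss rather than an exact solve.
    """
    n, _ = X.shape
    k = Theta0.shape[0]
    Theta = Theta0.copy()
    for _ in range(n_iters):
        # E step: soft assignments proportional to exp(-residual^2 / (2 sigma^2)).
        resid2 = (y[:, None] - X @ Theta.T) ** 2          # shape (n, k)
        log_w = -resid2 / (2.0 * sigma ** 2)
        log_w -= log_w.max(axis=1, keepdims=True)         # numerical stability
        W = np.exp(log_w)
        W /= W.sum(axis=1, keepdims=True)                 # responsibilities
        # M step (gradient version): one step per component on the
        # responsibility-weighted squared loss.
        for j in range(k):
            grad = X.T @ (W[:, j] * (X @ Theta[j] - y)) / n
            Theta[j] -= step_size * grad
    return Theta
```

Both sketches take a single gradient step per iteration rather than solving the inner minimization exactly, which is what distinguishes the gradient variants analyzed in the paper from classical AM/EM.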