Imbalanced Mixed Linear Regression

Authors: Pini Zilber, Boaz Nadler

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, beyond imbalanced mixtures, Mix-IRLS succeeds in a broad range of additional settings where other methods fail, including small sample sizes, presence of outliers, and an unknown number of models K. Furthermore, Mix-IRLS outperforms competing methods on several real-world datasets, in some cases by a large margin. We present simulation results on synthetic data in this section, and on several real-world datasets in the next one. We compare the performance of Mix-IRLS to the following algorithms: (i) AltMin, alternating minimization [52, 53]; (ii) EM, expectation maximization [5, Chapter 14], [19]; and (iii) GD, gradient descent on a factorized objective [56].
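For context, the EM baseline (ii) above is the textbook algorithm for a mixture of linear regressions. Below is a minimal sketch; the `em_mlr` name, the shared-noise-variance simplification, and the initialization scheme are my own assumptions for illustration, not the exact implementation benchmarked in the paper:

```python
import numpy as np

def em_mlr(X, y, K=2, n_iters=100, init=None, seed=0):
    """EM for a mixture of K linear regressions -- the standard baseline,
    not the paper's Mix-IRLS. Assumes a shared Gaussian noise variance;
    `init` optionally provides starting regressors."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    betas = np.array(init, dtype=float) if init is not None \
        else rng.standard_normal((K, d))
    pi = np.full(K, 1.0 / K)          # mixing proportions
    sigma2 = 1.0                      # shared noise variance
    for _ in range(n_iters):
        # E-step: responsibilities from Gaussian residual likelihoods
        resid = y[:, None] - X @ betas.T              # shape (n, K)
        logp = -0.5 * resid**2 / sigma2 + np.log(pi)
        logp -= logp.max(axis=1, keepdims=True)       # stabilize exp
        R = np.exp(logp)
        R /= R.sum(axis=1, keepdims=True)
        # M-step: weighted least squares per component
        for k in range(K):
            w = R[:, k]
            A = X.T @ (w[:, None] * X)
            betas[k] = np.linalg.solve(A, X.T @ (w * y))
        pi = np.clip(R.mean(axis=0), 1e-12, None)
        sigma2 = max(float((R * resid**2).sum() / n), 1e-12)
    return betas, pi
```

On a balanced, well-separated mixture this converges quickly; the paper's point is that such baselines degrade as the mixture becomes imbalanced.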
Researcher Affiliation | Academia | Pini Zilber, Boaz Nadler; Faculty of Mathematics and Computer Science, Weizmann Institute of Science, Israel. {pini.zilber, boaz.nadler}@weizmann.ac.il
Pseudocode | Yes | Algorithm 1: Mix-IRLS: main phase
Open Source Code | Yes | MATLAB and Python code implementations of Mix-IRLS are available at github.com/pizilber/MLR.
Open Datasets | Yes | We begin by analyzing the classical music perception dataset of Cohen [13]... Next, we compare the performance of Mix-IRLS to the algorithms listed in the previous section on a more challenging problem: the CO2 emission by vehicles in Canada dataset, available on Kaggle... Finally, we compare the performance of the algorithms on four of the most popular benchmark datasets for linear regression, all of which are available on Kaggle.
Dataset Splits | No | The paper states that "All methods were given the same random initialization" and that "For each simulation, we performed 50 independent realizations". It uses synthetic and real-world datasets but does not report specific train/validation/test splits (e.g., an "80/10/10 split" or absolute sample counts per split).
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory specifications used for running the experiments.
Software Dependencies | No | The paper mentions "We implemented all methods in MATLAB" and "MATLAB and Python code implementations of Mix-IRLS are available at github.com/pizilber/MLR", but it does not specify exact version numbers for MATLAB, Python, or any other software libraries or dependencies.
Experiment Setup | Yes | Mix-IRLS depends on four parameters: a model fit threshold w_th ∈ (0, 1), an oversampling ratio ρ ≥ 1, a tuning parameter η, and the number of IRLS iterations T1 ≥ 1. The values used in our experiments are specified in Appendix G. (Appendix G states: "We set η = 1, T1 = 30 and ρ = 1.25. The initial value for w_th was set to 0.1.")
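To make the role of these hyperparameters concrete, here is a minimal single-model IRLS sketch that borrows only the parameter names (η as `eta`, T1, w_th). The Cauchy-type weight function, the median-based residual scale, and the `irls_robust` helper are illustrative assumptions, not the paper's Algorithm 1 (which sequentially recovers multiple models):

```python
import numpy as np

def irls_robust(X, y, eta=1.0, T1=30, w_th=0.1):
    """Illustrative IRLS loop using the paper's hyperparameter names
    (eta, T1, w_th). The Cauchy-type weight below is a generic robust
    choice, NOT the paper's exact update rule.
    Returns the fitted regressor and the samples whose weight exceeds
    w_th, read here as 'well explained by the recovered model'."""
    def weights(beta):
        r = y - X @ beta
        scale = np.median(np.abs(r)) + 1e-12     # robust residual scale
        return 1.0 / (1.0 + (r / (eta * scale)) ** 2)

    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # ordinary LS init
    for _ in range(T1):
        w = weights(beta)                        # down-weight poor fits
        A = X.T @ (w[:, None] * X)
        beta = np.linalg.solve(A, X.T @ (w * y))
    support = weights(beta) > w_th
    return beta, support
```

In this reading, η scales how aggressively poorly fitting samples are down-weighted, T1 caps the number of reweighting iterations, and w_th thresholds the final weights into a support set; the oversampling ratio ρ has no analogue in this single-model sketch.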