Consistent Robust Regression

Authors: Kush Bhatia, Prateek Jain, Parameswaran Kamalaruban, Purushottam Kar

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments were carried out on synthetically generated linear regression datasets with corruptions. All implementations were done in Matlab and were run on a single-core 2.4 GHz machine with 8 GB RAM. The experiments establish the following: 1) CRR gives consistent estimates of the regression model, especially in situations with a large number of corruptions where the ordinary least squares estimator fails catastrophically; 2) CRR scales better to large datasets than the TORRENT-FC algorithm of [3] (up to 5× faster) and the Extended Lasso algorithm of [17] (up to 20× faster).
Researcher Affiliation | Collaboration | Kush Bhatia, University of California, Berkeley (kushbhatia@berkeley.edu); Prateek Jain, Microsoft Research, India (prajain@microsoft.com); Parameswaran Kamalaruban, EPFL, Switzerland (kamalaruban.parameswaran@epfl.ch); Purushottam Kar, Indian Institute of Technology, Kanpur (purushot@cse.iitk.ac.in)
Pseudocode | Yes | Algorithm 1 CRR: Consistent Robust Regression
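The paper provides its pseudocode as Algorithm 1 (CRR), which is not reproduced in this report. As an illustration only, a minimal alternating hard-thresholding loop for robust regression — a sketch in the same algorithmic family, not the authors' Algorithm 1 or their Matlab implementation — could look like the following (NumPy is used here for convenience):

```python
import numpy as np

def hard_threshold(r, k):
    """Keep the k largest-magnitude entries of r; zero out the rest."""
    b = np.zeros_like(r)
    idx = np.argsort(np.abs(r))[-k:]
    b[idx] = r[idx]
    return b

def robust_regression(X, y, k, max_iter=50, tol=1e-4):
    """Illustrative sketch (not the paper's Algorithm 1 verbatim):
    alternate between least squares on the corruption-corrected
    responses y - b and hard-thresholding the residuals to
    re-estimate the k-sparse corruption vector b."""
    b = np.zeros_like(y)
    w = np.zeros(X.shape[1])
    for _ in range(max_iter):
        w_new, *_ = np.linalg.lstsq(X, y - b, rcond=None)
        b = hard_threshold(y - X @ w_new, k)
        # Stopping rule matches the paper's convergence criterion
        # ||w^t - w^{t-1}||_2 <= tol.
        if np.linalg.norm(w_new - w) <= tol:
            w = w_new
            break
        w = w_new
    return w
```

On benign synthetic instances of the kind described in the experiment setup, such a loop identifies the corrupted responses via their large residuals and then refits on essentially clean data, whereas a plain least squares fit remains biased by the corruptions.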
Open Source Code | No | The paper does not provide any links to open-source code repositories or explicitly state that the source code for their methodology is publicly available.
Open Datasets | No | The paper uses synthetically generated data and does not provide access information (link, citation, or repository) for a publicly available or open dataset.
Dataset Splits | No | The paper uses synthetically generated data but does not specify any training, validation, or test dataset splits (e.g., percentages, sample counts, or defined methodologies for partitioning the data).
Hardware Specification | Yes | All implementations were done in Matlab and were run on a single-core 2.4 GHz machine with 8 GB RAM.
Software Dependencies | No | The paper states 'All implementations were done in Matlab' but does not specify a version number for Matlab or any other software dependencies with version numbers.
Experiment Setup | Yes | Data: The model w* ∈ R^d was chosen to be a random unit-norm vector. The data was generated as x_i ~ N(0, I_d). The k responses to be corrupted were chosen uniformly at random, and the value of each corruption was set as b*_i ~ Unif(10, 20). Responses were then generated as y_i = ⟨x_i, w*⟩ + η_i + b*_i, where η_i ~ N(0, σ²). All reported results were averaged over 20 random trials. Evaluation Metric: Performance is measured using the standard L2 error r_ŵ = ‖ŵ − w*‖₂. For the timing experiments, an algorithm was deemed to converge on an instance if it obtained a model w^t such that ‖w^t − w^{t−1}‖₂ ≤ 10⁻⁴.
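The data-generation protocol and evaluation metric above can be sketched as follows. This is a minimal reconstruction: the choices of n, d, k, and σ are placeholders, and NumPy is used for illustration even though the original experiments were implemented in Matlab.

```python
import numpy as np

def generate_data(n, d, k, sigma, rng):
    """Synthetic corrupted linear regression data, following the
    protocol described in the paper's experiment setup."""
    w_star = rng.standard_normal(d)
    w_star /= np.linalg.norm(w_star)             # random unit-norm model w*
    X = rng.standard_normal((n, d))              # x_i ~ N(0, I_d)
    eta = sigma * rng.standard_normal(n)         # dense noise eta_i ~ N(0, sigma^2)
    b = np.zeros(n)
    idx = rng.choice(n, size=k, replace=False)   # k responses corrupted u.a.r.
    b[idx] = rng.uniform(10, 20, size=k)         # b*_i ~ Unif(10, 20)
    y = X @ w_star + eta + b                     # y_i = <x_i, w*> + eta_i + b*_i
    return X, y, w_star

def l2_error(w_hat, w_star):
    """Standard L2 error r = ||w_hat - w*||_2."""
    return np.linalg.norm(w_hat - w_star)
```

Under this generation scheme, an ordinary least squares fit on (X, y) incurs a substantial L2 error because the k large positive corruptions bias the estimate, which is consistent with the reported failure of OLS in high-corruption regimes.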