Outlier-Robust Sparse Estimation via Non-Convex Optimization

Authors: Yu Cheng, Ilias Diakonikolas, Rong Ge, Shivam Gupta, Daniel Kane, Mahdi Soltanolkotabi

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We perform an experimental evaluation of our robust sparse mean estimation algorithm on synthetic datasets with a focus on statistical accuracy (ℓ2-distance between the output and the true sparse mean). We evaluate our algorithm (Sparse Gradient Descent, Sparse GD) on different noise models, and compare it to the following previous algorithms: ... Our experimental results are summarized in Figures 1, 2 and 3.
Researcher Affiliation | Academia | Yu Cheng, Brown University, Providence, RI 02912, yu_cheng@brown.edu; Ilias Diakonikolas, University of Wisconsin-Madison, Madison, WI 53706, ilias@cs.wisc.edu; Rong Ge, Duke University, Durham, NC 27708, rongge@cs.duke.edu; Shivam Gupta, University of Texas at Austin, Austin, TX 78712, shivamgupta@utexas.edu; Daniel M. Kane, University of California, San Diego, La Jolla, CA 92093, dakane@cs.ucsd.edu; Mahdi Soltanolkotabi, University of Southern California, Los Angeles, CA 90089, soltanol@usc.edu
Pseudocode | Yes | Algorithm 1: Robust sparse mean estimation. Input: k > 0, 0 < ϵ < ϵ_0, and an ϵ-corrupted set of samples (X_i)_{i=1}^n drawn from a distribution with k-sparse mean µ. Output: a vector µ̂ that is close to µ. 1: Find a first-order stationary point w ∈ ∆_{n,ϵ} of the objective min_w f(w) = ‖Σ_w − I‖_{F,k,k}. 2: Return µ̂ = (µ_w)_Q, where Q is a set of k entries of µ_w with largest magnitude. (A Python sketch of this procedure appears after the table.)
Open Source Code | Yes | An implementation of our algorithms is available at https://github.com/guptashvm/Sparse-GD.
Open Datasets | No | The paper uses 'synthetic datasets' but does not provide concrete access information for them (a specific link, DOI, repository name, formal citation with authors/year, or reference to established benchmark datasets).
Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) needed to reproduce the data partitioning.
Hardware Specification | Yes | We ran our experiments on a computer with a 1.6 GHz Intel Core i5 processor and 8 GB RAM.
Software Dependencies | No | The paper mentions building on an existing codebase but does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment.
Experiment Setup | Yes | We evaluate the algorithms on various noise models. Linear-hiding noise: the inliers are drawn from N(0, I). Let S be a size-k set. Then, half the outliers are drawn from N(1_S, I) and the other half are drawn from N(0, 2I − I_S). Tail-flipping noise: this noise model picks a k-sparse direction v and replaces the ϵ fraction of points farthest in the −v direction with points in the +v direction. Constant-bias noise: this model adds a constant to every coordinate of the outlier points; in Figure 3, we add 2 to every coordinate of every outlier point. (Illustrative generators for these noise models are sketched after the table.)
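
For concreteness, the following NumPy sketch shows one way the quoted Algorithm 1 could be instantiated. It is not the authors' implementation (that is the Sparse-GD repository linked in the Open Source Code row): the capped-simplex feasible set ∆_{n,ϵ} = {w : Σ_i w_i = 1, 0 ≤ w_i ≤ 1/((1−ϵ)n)}, the diagonal-based approximation of the ‖·‖_{F,k,k} norm, and the step size and iteration count below are assumptions made for illustration only.

```python
# A minimal NumPy sketch of the quoted Algorithm 1 (not the authors' code; see the
# Sparse-GD repository above). Assumed here: the feasible weight set is the capped
# simplex Delta_{n,eps} = {w : sum(w) = 1, 0 <= w_i <= 1/((1 - eps) n)}, and the
# ||.||_{F,k,k} norm is approximated by restricting Sigma_w - I to the k coordinates
# with the largest diagonal deviation. Step size and iteration count are illustrative.
import numpy as np


def project_capped_simplex(w, cap):
    """Project w onto {v : sum(v) = 1, 0 <= v_i <= cap} by bisecting on a shift tau."""
    lo, hi = np.min(w) - 1.0, np.max(w)
    for _ in range(60):
        tau = (lo + hi) / 2.0
        if np.clip(w - tau, 0.0, cap).sum() > 1.0:
            lo = tau
        else:
            hi = tau
    return np.clip(w - (lo + hi) / 2.0, 0.0, cap)


def sparse_gd_mean(X, k, eps, n_iters=200, lr=0.5):
    """Sketch of robust k-sparse mean estimation on an eps-corrupted sample X of shape (n, d)."""
    n, d = X.shape
    cap = 1.0 / ((1.0 - eps) * n)
    w = np.full(n, 1.0 / n)                      # start from uniform weights
    for _ in range(n_iters):
        mu_w = X.T @ w                           # weighted mean
        C = X - mu_w                             # centered samples
        Sigma_w = C.T @ (w[:, None] * C)         # weighted covariance
        A = Sigma_w - np.eye(d)
        # Approximate the (k, k)-sparse Frobenius norm by keeping the k coordinates
        # where the diagonal of Sigma_w deviates most from 1.
        Q = np.argsort(-np.abs(np.diag(A)))[:k]
        A_Q = A[np.ix_(Q, Q)]
        norm = np.linalg.norm(A_Q)
        if norm < 1e-12:
            break
        M = A_Q / norm                           # gradient of ||A_Q||_F with respect to A_Q
        # On the simplex, d Sigma_w / d w_i = (X_i - mu_w)(X_i - mu_w)^T, so the
        # gradient in w_i is a quadratic form of the centered sample restricted to Q.
        g = np.einsum('ni,ij,nj->n', C[:, Q], M, C[:, Q])
        w = project_capped_simplex(w - lr * g, cap)
    mu_w = X.T @ w
    # Step 2 of Algorithm 1: keep only the k entries of largest magnitude.
    mu_hat = np.zeros(d)
    top = np.argsort(-np.abs(mu_w))[:k]
    mu_hat[top] = mu_w[top]
    return mu_hat
```

The ℓ2 distance ‖µ̂ − µ‖_2 between the returned vector and the true sparse mean is the statistical-accuracy measure quoted in the Research Type row.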
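
Similarly, the three noise models quoted in the Experiment Setup row can be read as the generators below. This is a sketch of one reading of that description, not the authors' data-generation code; the sample size, dimension, corruption placement, the choice of true mean 0 for the inliers, and the exact tail-flipping replacement rule are assumptions.

```python
# Illustrative generators for the three quoted noise models; parameter values and
# the exact outlier constructions are one reading of the description, not the
# authors' experiment code.
import numpy as np

rng = np.random.default_rng(0)


def linear_hiding(n, d, k, eps):
    """Inliers ~ N(0, I); half the outliers ~ N(1_S, I), the rest ~ N(0, 2I - I_S)."""
    X = rng.standard_normal((n, d))              # n inliers (true mean is 0 here)
    n_out = int(eps * n)
    S = rng.choice(d, size=k, replace=False)     # hidden size-k support
    shift = np.zeros(d)
    shift[S] = 1.0
    half = n_out // 2
    X[:half] = rng.standard_normal((half, d)) + shift               # N(1_S, I)
    scale = np.full(d, np.sqrt(2.0))
    scale[S] = 1.0
    X[half:n_out] = rng.standard_normal((n_out - half, d)) * scale  # N(0, 2I - I_S)
    return X


def tail_flipping(n, d, k, eps):
    """Move the eps fraction of points farthest in the -v direction to the +v direction."""
    X = rng.standard_normal((n, d))
    n_out = int(eps * n)
    v = np.zeros(d)
    v[rng.choice(d, size=k, replace=False)] = 1.0 / np.sqrt(k)  # k-sparse unit direction
    idx = np.argsort(X @ v)[:n_out]              # most negative projections onto v
    proj = X[idx] @ v
    X[idx] -= 2.0 * proj[:, None] * v            # reflect the tail onto the +v side (one reading)
    return X


def constant_bias(n, d, eps, bias=2.0):
    """Add a constant (2 in Figure 3) to every coordinate of each outlier point."""
    X = rng.standard_normal((n, d))
    n_out = int(eps * n)
    X[:n_out] += bias
    return X
```

For example, running sparse_gd_mean(linear_hiding(n=5000, d=1000, k=20, eps=0.1), k=20, eps=0.1) mirrors the kind of synthetic run summarized in Figures 1, 2, and 3, with the error measured against the true mean (here the zero vector).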