Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..
Robust Gaussian Covariance Estimation in Nearly-Matrix Multiplication Time
Authors: Jerry Li, Guanghao Ye
NeurIPS 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we demonstrate a novel algorithm which achieves the same statistical guarantees, but which runs in time e O(T(N, d) log κ). In particular our runtime has no dependence on ε. When Σ is reasonably conditioned, our runtime matches that of the fastest algorithm for covariance estimation without outliers, up to poly-logarithmic factors, showing that we can get robustness essentially for free. |
| Researcher Affiliation | Collaboration | Jerry Li Microsoft Research EMAIL Guanghao Ye University of Washington EMAIL |
| Pseudocode | Yes | Algorithm 1 Robust Covariance Estimation |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that the source code for the described methodology is publicly available. |
| Open Datasets | No | The paper uses a generative model for data, stating "we are given samples X1, . . . , XN drawn from a Gaussian N(0, Σ), but an ε-fraction of these points have been arbitrarily corrupted." This describes synthetic data generation for theoretical analysis, not a publicly available dataset with concrete access information. |
| Dataset Splits | No | The paper analyzes algorithms based on statistical guarantees and theoretical proofs rather than empirical evaluation with specific training, validation, or test dataset splits. |
| Hardware Specification | No | The paper discusses algorithmic runtime complexity abstractly (e.g., 'T(N, d) is the time it takes to multiply a d N matrix by its transpose') but does not provide any specific details about hardware used for running any computations or experiments. |
| Software Dependencies | No | The paper mentions using 'black-box nearly-linear time SDP solvers [AZLO15, AZLO16]' but does not provide specific version numbers for these or any other software dependencies required to replicate the work. |
| Experiment Setup | No | The paper is theoretical and describes algorithms and proofs without detailing an experimental setup, hyperparameters, or training configurations. |