Estimation of the covariance structure of heavy-tailed distributions
Authors: Xiaohan Wei, Stanislav Minsker
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We propose and analyze a new estimator of the covariance matrix that admits strong theoretical guarantees under weak assumptions on the underlying distribution, such as existence of moments of only low order. While estimation of covariance matrices corresponding to sub-Gaussian distributions is well-understood, much less is known in the case of heavy-tailed data. As K. Balasubramanian and M. Yuan write [1], "data from real-world experiments oftentimes tend to be corrupted with outliers and/or exhibit heavy tails. In such cases, it is not clear that those covariance matrix estimators .. remain optimal" and "..what are the other possible strategies to deal with heavy tailed distributions warrant further studies." We make a step towards answering this question and prove tight deviation inequalities for the proposed estimator that depend only on the parameters controlling the intrinsic dimension associated to the covariance matrix (as opposed to the dimension of the ambient space); in particular, our results are applicable in the case of high-dimensional observations. |
| Researcher Affiliation | Academia | Stanislav Minsker, Department of Mathematics, University of Southern California, Los Angeles, CA 90007, minsker@usc.edu; Xiaohan Wei, Department of Electrical Engineering, University of Southern California, Los Angeles, CA 90007, xiaohanw@usc.edu |
| Pseudocode | No | The paper provides mathematical definitions of functions and estimators (e.g., ψ(x) and Σ̂), but these are not structured as pseudocode or an algorithm block with step-by-step instructions. |
| Open Source Code | No | No statement or link indicating the availability of open-source code for the described methodology was found. |
| Open Datasets | No | The paper is theoretical and focuses on statistical estimation and theoretical guarantees. It does not refer to the use of any specific public or open datasets for empirical training or evaluation. |
| Dataset Splits | No | The paper is theoretical and does not involve empirical experiments with dataset splits. No specific dataset split information (training, validation, or test) is provided. |
| Hardware Specification | No | The paper is theoretical and does not describe any computational experiments or the hardware used to perform them. |
| Software Dependencies | No | The paper is theoretical and does not describe any computational experiments that would require specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not detail any empirical experimental setups, hyperparameters, or training configurations. |