Benign Overfitting in Multiclass Classification: All Roads Lead to Interpolation
Authors: Ke Wang, Vidya Muthukumar, Christos Thrampoulidis
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our numerical results show excellent agreement with our theoretical findings. The main contributions of our paper are theoretical, and our simulations on synthetic data are intended to support these results rather than constitute results in their own right. |
| Researcher Affiliation | Academia | Ke Wang, Department of Statistics and Applied Probability, University of California, Santa Barbara, Santa Barbara, CA 93106, kewang01@ucsb.edu; Vidya Muthukumar, School of Electrical and Computer Engineering & Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332, vmuthukumar8@gatech.edu; Christos Thrampoulidis, Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, BC, Canada V6T 1Z4, cthrampo@ece.ubc.ca |
| Pseudocode | No | The paper describes algorithms using mathematical formulations (e.g., equations 3, 4, 5) but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | We include the code that creates the figures in our paper and will submit it as supplementary material. |
| Open Datasets | No | We assume that the data pairs $\{x_i, y_i\}_{i=1}^n$ are generated i.i.d. We will consider two models for the distribution of $(x, y)$. For both models, we define the mean vectors $\{\mu_j\}_{j=1}^k \subset \mathbb{R}^p$, and the mean matrix is given by $M := [\mu_1\ \mu_2\ \cdots\ \mu_k] \in \mathbb{R}^{p \times k}$. Gaussian Mixture Model (GMM)... Multinomial Logit Model (MLM)... The paper states in the ethics review that simulations are on 'synthetic' data. No concrete access information for a public dataset is provided. |
| Dataset Splits | No | The paper mentions 'training data' and a 'fresh sample (x, y)' (test data) but does not provide specific details on the dataset splits (e.g., percentages or counts for training, validation, and test sets). |
| Hardware Specification | No | The paper does not provide any specific hardware details (e.g., GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | We set the number of classes $k = 4$, fix $n = 40$, and vary $p = 50, \ldots, 1200$ to guarantee sufficient overparameterization. We consider the case of orthogonal and equal-norm mean vectors with $\|\mu\|_2 = \mu\sqrt{p}$, for $\mu = 0.2, 0.3$ and $0.4$. (A code sketch of this setup appears below the table.) |
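
For orientation, the following is a minimal sketch (not the authors' supplementary code) of the synthetic setup described above: data drawn from a Gaussian Mixture Model with orthogonal, equal-norm means $\|\mu\|_2 = \mu\sqrt{p}$, and a minimum-norm interpolator of one-hot labels, one of the interpolating classifiers the paper analyzes. Function names, the scaled-basis-vector choice of orthogonal means, and the equal class priors are assumptions made for illustration.

```python
import numpy as np

def make_gmm_data(n=40, p=800, k=4, mu_scale=0.2, rng=None):
    """Sample n points from a k-class GMM with orthogonal, equal-norm means
    ||mu_j||_2 = mu_scale * sqrt(p), following the experiment setup quoted above."""
    rng = np.random.default_rng(rng)
    # One convenient (assumed) choice of orthogonal means: scaled standard basis vectors.
    M = np.zeros((p, k))
    M[np.arange(k), np.arange(k)] = mu_scale * np.sqrt(p)
    y = rng.integers(k, size=n)                  # class labels (assumed equal priors)
    X = M[:, y].T + rng.standard_normal((n, p))  # x_i = mu_{y_i} + standard Gaussian noise
    return X, y

def min_norm_interpolator(X, Y):
    """Minimum-norm W solving X W = Y in the overparameterized regime (p > n)."""
    return X.T @ np.linalg.solve(X @ X.T, Y)     # W = X^T (X X^T)^{-1} Y, shape p x k

n, p, k, mu_scale = 40, 800, 4, 0.3
X, y = make_gmm_data(n, p, k, mu_scale, rng=0)
W = min_norm_interpolator(X, np.eye(k)[y])       # interpolate one-hot labels
X_test, y_test = make_gmm_data(2000, p, k, mu_scale, rng=1)
test_error = np.mean(np.argmax(X_test @ W, axis=1) != y_test)
print(f"test error of the min-norm interpolator: {test_error:.3f}")
```

Sweeping $p$ over $50, \ldots, 1200$ and $\mu$ over $\{0.2, 0.3, 0.4\}$ with $n = 40$, $k = 4$ would mirror the reported setup; the authors' own figure-generating code is available only through their supplementary material.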