Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Theoretical Insights Into Multiclass Classification: A High-dimensional Asymptotic View

Authors: Christos Thrampoulidis, Samet Oymak, Mahdi Soltanolkotabi

NeurIPS 2020 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We present and discuss numerical simulations that corroborate our theoretical findings.
Researcher Affiliation Academia Christos Thrampoulidis, UC Santa Barbara; Samet Oymak, UC Riverside; Mahdi Soltanolkotabi, University of Southern California
Pseudocode No The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code No The paper does not provide any concrete access to source code for the methodology described.
Open Datasets No The paper describes generating synthetic data based on Gaussian Mixture Model (GMM) and Multinomial Logit Model (MLM) specifications rather than using a publicly available dataset. Therefore, no concrete access information for a public dataset is provided.
Dataset Splits No The paper describes data generation and reports test error but does not explicitly specify training/validation/test dataset splits with percentages, sample counts, or references to predefined standard splits for reproduction.
Hardware Specification No The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, memory amounts) used for running its experiments.
Software Dependencies No The paper does not provide specific ancillary software details, such as library names with version numbers, needed to replicate the experiment.
Experiment Setup Yes Figures 1 and 2 focus on GMM with k = 9 classes, d = 300 and ‖µᵢ‖₂ = 15. To model different class prior probabilities, we use the distribution π₁ = π₂ = π₃ = 0.5π₄ = 0.5π₅ = 0.5π₆ = 0.25π₇ = 0.25π₈ = 0.25π₉ = 1/21. We consider three scenarios: (a) orthogonal means, equal priors (πᵢ = 1/9); (b) orthogonal means, different priors; (c) correlated means with pairwise correlation coefficient equal to 0.5 (i.e., ⟨µᵢ, µⱼ⟩/(‖µᵢ‖₂‖µⱼ‖₂) = 0.5 for i ≠ j) and different priors as discussed above. Figure 3 focuses on orthogonal classes with varying number of classes k, where ‖µᵢ‖₂ = 15 and d ∈ {50, 100, 200} with kd/n = kγ fixed at kγ = 20/11. Figure 4 provides experiments on MLM with k = 9 orthogonal classes. Unlike GMM, CE achieves the best performance in MLM. In Figure 4(a), classes have the same norms ‖µᵢ‖₂ = 10, while in Figure 4(b) we have quadrupled the norms of classes 7, 8, 9 and doubled the norms of classes 4, 5, 6.
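Since the paper releases no code, the GMM setup quoted above can be approximated with a short sketch. This is not the authors' implementation: the sample size n and the random seed are arbitrary choices for illustration, and orthogonal means are realized here as scaled standard basis vectors (one valid choice among many).

```python
# Hypothetical sketch of the paper's GMM experiment setup:
# k = 9 classes, d = 300, orthogonal class means with norm 15,
# and unequal priors pi_1=...=pi_3=1/21, pi_4=...=pi_6=2/21, pi_7=...=pi_9=4/21.
import numpy as np

rng = np.random.default_rng(0)
k, d = 9, 300
n = 500  # sample size: illustrative choice, not specified per figure here
mean_norm = 15.0

# Orthogonal means: scale the first k standard basis vectors of R^d.
M = np.zeros((k, d))
M[np.arange(k), np.arange(k)] = mean_norm

# Class priors matching the chain pi_1 = ... = 0.25*pi_9 = 1/21.
pi = np.array([1, 1, 1, 2, 2, 2, 4, 4, 4]) / 21.0

# Draw labels from the prior, then add isotropic Gaussian noise to the means.
labels = rng.choice(k, size=n, p=pi)
X = M[labels] + rng.standard_normal((n, d))

print(X.shape, labels.shape)
```

For scenario (c), the means would instead be constructed so that each off-diagonal pairwise correlation equals 0.5, e.g. via a Cholesky factor of the desired Gram matrix.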