reproducibilityindex.ai

Classification of Heavy-tailed Features in High Dimensions: a Superstatistical Approach

Authors: Urte Adomaityte, Gabriele Sicuro, Pierpaolo Vivo

NeurIPS 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In Fig. 1 we present the results of our numerical experiments using the square loss and small regularisation. An excellent agreement between the theoretical predictions and the results of numerical experiments is found for a range of values of $a$ and sample complexity $\alpha$, both for balanced, i.e., equally sized, and unbalanced clusters of data (the plot for this case can be found in Appendix A.3).
Researcher Affiliation	Academia	Urte Adomaityte, Department of Mathematics, King's College London, urte.adomaityte@kcl.ac.uk; Gabriele Sicuro, Department of Mathematics, King's College London, gabriele.sicuro@kcl.ac.uk; Pierpaolo Vivo, Department of Mathematics, King's College London, pierpaolo.vivo@kcl.ac.uk
Pseudocode	No	The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code	No	The paper does not provide concrete access to source code for the methodology described.
Open Datasets	No	The paper describes generating synthetic datasets according to specified distributions (e.g., "The synthetic data sets will be produced using an inverse-Gamma-distributed variance Δ...") rather than using a publicly available dataset with concrete access information.
Dataset Splits	No	The paper discusses sample sizes and dimensionality in the context of high-dimensional limits (e.g., "sample size $n$ and the dimensionality $d$ are both sent to infinity, with $n/d = \alpha$ kept constant.") and mentions training and test errors, but does not provide specific dataset split information (percentages, sample counts) for reproducibility of finite-sample experiments.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments.
Software Dependencies	No	The paper mentions software like "scikit-learn" [58] and "SciPy" [73] in its references, but does not specify the version numbers of these or any other ancillary software components used for the experiments.
Experiment Setup	Yes	We compare our theoretical predictions with the results of numerical experiments for a large family of data distributions. The results have been obtained using ridge $\ell_2$-regularisation, and both quadratic and logistic losses, with various data cluster balances $\rho$. We will also assume, without loss of generality, that $\mu = \frac{1}{\sqrt{d}} g$, where $g \sim \mathcal{N}(0, I_d)$. The synthetic data sets will be produced using an inverse-Gamma-distributed variance Δ, with density parametrised as $\varrho(\Delta) \equiv \varrho_{a,c}(\Delta) = \frac{c^a}{\Gamma(a)\,\Delta^{a+1}} e^{-c/\Delta}$ depending on the shape parameter $a > 0$ and on the scale parameter $c > 0$. ... square loss and small regularisation. An excellent agreement... with $\lambda = 10^{-5}$. (Fig. 1 caption).
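
Since no source code is provided, the setup quoted above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes two clusters with means +μ and −μ, labels equal to +1 with probability ρ, a ridge estimator without bias term, and arbitrary illustrative values for d, α, a and c; only λ = 10^{-5} comes from the Fig. 1 caption. The function names `sample_superstatistical` and `ridge_fit` are hypothetical.

```python
import numpy as np


def sample_superstatistical(n, d, a, c, rho, mu, rng):
    """Draw n labelled points from a two-cluster mixture with inverse-Gamma variances."""
    # Labels: +1 with probability rho, -1 otherwise (assumed convention).
    y = np.where(rng.random(n) < rho, 1.0, -1.0)
    # If G ~ Gamma(shape=a, rate=c), then 1/G is inverse-Gamma with shape a and scale c.
    delta = 1.0 / rng.gamma(shape=a, scale=1.0 / c, size=n)
    z = rng.standard_normal((n, d))
    # Assumed cluster means: +mu for y = +1 and -mu for y = -1.
    x = y[:, None] * mu[None, :] + np.sqrt(delta)[:, None] * z
    return x, y


def ridge_fit(x, y, lam):
    """Closed-form minimiser of 0.5*||x w - y||^2 + 0.5*lam*||w||^2 (no bias term)."""
    d = x.shape[1]
    return np.linalg.solve(x.T @ x + lam * np.eye(d), x.T @ y)


rng = np.random.default_rng(0)
d, alpha = 200, 2.0            # illustrative finite-size values, not from the paper
n = int(alpha * d)             # sample size fixed by the ratio n/d = alpha
a, c, rho = 2.0, 1.0, 0.5      # illustrative shape, scale and cluster balance
lam = 1e-5                     # small regularisation, as in the Fig. 1 caption
mu = rng.standard_normal(d) / np.sqrt(d)   # mu = g / sqrt(d), g ~ N(0, I_d)

x_train, y_train = sample_superstatistical(n, d, a, c, rho, mu, rng)
x_test, y_test = sample_superstatistical(5000, d, a, c, rho, mu, rng)

w = ridge_fit(x_train, y_train, lam)
test_error = np.mean(np.sign(x_test @ w) != y_test)
print(f"alpha={alpha}, a={a}: test error = {test_error:.3f}")
```

Smaller values of a give heavier-tailed variances, so sweeping a and α in this sketch gives a rough finite-size counterpart to the comparison described for Fig. 1, under the assumptions listed above.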