Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Flexible High-Dimensional Classification Machines and Their Asymptotic Properties

Authors: Xingye Qiao, Lingsong Zhang

JMLR 2015 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Simulations and real-data applications are investigated to illustrate the theoretical findings. Keywords: classification, Fisher consistency, high-dimensional, low-sample-size asymptotics, imbalanced data, support vector machine. Section 6 demonstrates its properties using simulation experiments; a real application study is conducted in Section 7.
Researcher Affiliation Academia Xingye Qiao, EMAIL, Department of Mathematical Sciences, Binghamton University, State University of New York, Binghamton, NY 13902-6000, USA. Lingsong Zhang, EMAIL, Department of Statistics, Purdue University, West Lafayette, IN 47907, USA.
Pseudocode Yes Algorithm 1 (Adaptive parameter): 1. Initialize θ_0 = 0. 2. For k = 0, 1, ...: (a) Solve for the FLAME solutions ω(θ_k) and β(θ_k) given parameter θ_k. (b) Let θ_{k+1} = max{θ_k, [g^-_(n_+)(θ_k) C]^{-1}}, where g^-_j(θ_k) is the functional margin y_j(x_j^T ω(θ_k) + β(θ_k)) of the jth vector in the negative/majority class and g^-_(l)(θ_k) is the lth order statistic of these functional margins. 3. The iteration stops when θ_k = θ_{k-1}.
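The adaptive-parameter iteration quoted above can be sketched as follows. The real solver is the authors' MATLAB routine; the `solve_flame` function here is a toy mean-difference stand-in, and its name, the stopping tolerance, and the test data are my assumptions, not the paper's implementation.

```python
import numpy as np

def solve_flame(theta, X, y):
    # Toy surrogate for the FLAME optimizer: a mean-difference direction
    # that shrinks as theta grows. NOT the paper's actual solver.
    mu_pos = X[y == 1].mean(axis=0)
    mu_neg = X[y == -1].mean(axis=0)
    omega = (mu_pos - mu_neg) / (1.0 + theta)
    beta = -0.5 * (mu_pos + mu_neg) @ omega
    return omega, beta

def adaptive_theta(X, y, C, n_plus, max_iter=50, tol=1e-12):
    """Sketch of Algorithm 1: theta_{k+1} = max{theta_k, [g_(n+)(theta_k) C]^{-1}}."""
    theta = 0.0
    for _ in range(max_iter):
        omega, beta = solve_flame(theta, X, y)
        neg = y == -1
        margins = y[neg] * (X[neg] @ omega + beta)   # functional margins g_j
        g_order = np.sort(margins)[n_plus - 1]       # n_+ th order statistic
        # Assumes a positive order statistic; a non-positive margin would
        # need separate handling.
        theta_next = max(theta, 1.0 / (g_order * C))
        if abs(theta_next - theta) < tol:            # theta_k = theta_{k-1}
            break
        theta = theta_next
    return theta

# Deterministic toy data: three copies of +1^5 and three of -1^5.
X = np.vstack([np.tile(np.ones(5), (3, 1)), np.tile(-np.ones(5), (3, 1))])
y = np.array([1, 1, 1, -1, -1, -1], dtype=float)
theta_hat = adaptive_theta(X, y, C=1.0, n_plus=3)
```

For this symmetric toy data the update reduces to theta_{k+1} = (1 + theta_k)/10, a contraction converging to 1/9.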
Open Source Code Yes A MATLAB routine has been implemented and is available at the authors' personal websites. See Online Appendix 1 for more details on the implementation.
Open Datasets Yes In this section we demonstrate the performance of FLAME on a real example: the Human Lung Carcinomas Microarray Data set, which has been analyzed earlier in Bhattacharjee et al. (2001).
Dataset Splits Yes We conduct five-fold cross-validations (CV) to evaluate the within-group error for the two classes over 100 random splits. In each split, we apply FLAME with 21 different θ values: 0, 0.05, 0.1, ..., 1.
Hardware Specification No The paper describes simulation experiments and a real data application but does not specify any particular hardware used for running the experiments.
Software Dependencies No A MATLAB routine has been implemented and is available at the authors' personal websites. See Online Appendix 1 for more details on the implementation. No specific version of MATLAB or any other software dependency is mentioned.
Experiment Setup Yes In this simulation setting, data are drawn from multivariate normal distributions with identity covariance matrix, MVN_d(µ_±, I_d), where d = 100, 400, 700 and 1000. We let µ_0 = c(d, d-1, d-2, ..., 1)^T, where c > 0 is a constant that scales µ_0 to have norm 2.7. Then we let µ_+ = µ_0 and µ_- = -µ_0. The imbalance factor varies among 1, 4 and 9 while the total sample size is 240. For each experiment, we repeat the simulation 50 times... We conduct five-fold cross-validations (CV) to evaluate the within-group error for the two classes over 100 random splits. In each split, we apply FLAME with 21 different θ values: 0, 0.05, 0.1, ..., 1.
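The simulation design in this row can be sketched in a few lines. The function name, the random seed, and the interpretation of the imbalance factor as n_-/n_+ are my assumptions; the mean construction and sample sizes follow the quoted setup.

```python
import numpy as np

def make_simulation(d=100, n_total=240, imbalance=4, seed=0):
    # mu0 = c*(d, d-1, ..., 1)^T, with c chosen so that ||mu0|| = 2.7;
    # mu_+ = mu0, mu_- = -mu0, and X ~ MVN_d(mu_±, I_d).
    rng = np.random.default_rng(seed)
    mu0 = np.arange(d, 0, -1).astype(float)
    mu0 *= 2.7 / np.linalg.norm(mu0)
    # Assumed split: imbalance factor = n_- / n_+ (majority = negative class).
    n_neg = n_total * imbalance // (imbalance + 1)
    n_pos = n_total - n_neg
    X_pos = rng.standard_normal((n_pos, d)) + mu0
    X_neg = rng.standard_normal((n_neg, d)) - mu0
    X = np.vstack([X_pos, X_neg])
    y = np.concatenate([np.ones(n_pos), -np.ones(n_neg)])
    return X, y

# The 21 theta values used in each CV split: 0, 0.05, 0.1, ..., 1.
thetas = np.linspace(0.0, 1.0, 21)
```

With imbalance factor 4 and n_total = 240, this yields 48 positive and 192 negative observations per data set.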