Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Flexible High-Dimensional Classification Machines and Their Asymptotic Properties
Authors: Xingye Qiao, Lingsong Zhang
JMLR 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Simulations and real data applications are investigated to illustrate theoretical findings. Keywords: classification, Fisher consistency, high-dimensional low-sample size asymptotics, imbalanced data, support vector machine. Section 6 demonstrates its properties using simulation experiments. A real application study is conducted in Section 7. |
| Researcher Affiliation | Academia | Xingye Qiao, Department of Mathematical Sciences, Binghamton University, State University of New York, Binghamton, NY 13902-6000, USA. Lingsong Zhang, Department of Statistics, Purdue University, West Lafayette, IN 47907, USA. |
| Pseudocode | Yes | Algorithm 1 (Adaptive parameter) 1. Initiate $\theta_0 = 0$. 2. For $k = 0, 1, \ldots$, (a) Solve FLAME solutions $\omega(\theta_k)$ and $\beta(\theta_k)$ given parameter $\theta_k$. (b) Let $\theta_{k+1} = \max\left(\theta_k, \{g_{(n_+)}(\theta_k)\, C\}^{-1}\right)$, where $g_j(\theta_k)$ is the functional margin $u_j = y_j(x_j^T \omega(\theta_k) + \beta(\theta_k))$ of the $j$th vector in the negative/majority class and $g_{(l)}(\theta_k)$ is the $l$th order statistic of these functional margins. 3. When $\theta_k = \theta_{k-1}$, the iteration stops. |
| Open Source Code | Yes | A MATLAB routine has been implemented and is available at the authors' personal websites. See Online Appendix 1 for more details on the implementation. |
| Open Datasets | Yes | In this section we demonstrate the performance of FLAME on a real example: the Human Lung Carcinomas Microarray Data set, which has been analyzed earlier in Bhattacharjee et al. (2001). |
| Dataset Splits | Yes | We conduct five-fold cross-validations (CV) to evaluate the within-group error for the two classes over 100 random splits. In each split, we apply FLAME with 21 different θ values, ranging from 0, 0.05, 0.1, ... to 1. |
| Hardware Specification | No | The paper describes simulation experiments and a real data application but does not specify any particular hardware used for running the experiments. |
| Software Dependencies | No | A MATLAB routine has been implemented and is available at the authors' personal websites. See Online Appendix 1 for more details on the implementation. No specific version of MATLAB is given, and no other software dependencies are mentioned. |
| Experiment Setup | Yes | In this simulation setting, data are from multivariate normal distributions with identity covariance matrix $MVN_d(\mu_\pm, I_d)$, where $d$ = 100, 400, 700 and 1000. We let $\mu_0 = c(d, d-1, d-2, \ldots, 1)^T$ where $c > 0$ is a constant which scales $\mu_0$ to have norm 2.7. Then we let $\mu_+ = \mu_0$ and $\mu_- = -\mu_0$. The imbalance factor varies among 1, 4 and 9 while the total sample size is 240. For each experiment, we repeat the simulation 50 times... We conduct five-fold cross-validations (CV) to evaluate the within-group error for the two classes over 100 random splits. In each split, we apply FLAME with 21 different θ values, ranging from 0, 0.05, 0.1, ... to 1. |
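
The simulation design quoted in the Experiment Setup row can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' MATLAB routine. The function names (`mean_vector`, `class_sizes`, `simulate`) are hypothetical, and the code assumes that the imbalance factor is the ratio of majority to minority class sizes and that the negative class is the majority. It only generates data and the θ grid; it does not fit FLAME.

```python
import math
import random


def mean_vector(d, norm=2.7):
    """Return mu_0 = c * (d, d-1, ..., 1)^T with c > 0 chosen so ||mu_0|| = norm."""
    raw = [float(d - i) for i in range(d)]
    c = norm / math.sqrt(sum(v * v for v in raw))
    return [c * v for v in raw]


def class_sizes(total, imbalance):
    """Split `total` into (n_plus, n_minus).

    Assumption (not stated in the quoted text): the imbalance factor is
    n_minus / n_plus, with the negative class as the majority.
    """
    n_plus = total // (imbalance + 1)
    return n_plus, total - n_plus


def simulate(d, imbalance, total=240, rng=None):
    """Draw one data set: class +1 from N(mu_0, I_d), class -1 from N(-mu_0, I_d)."""
    if rng is None:
        rng = random.Random(0)  # fixed seed, chosen only for repeatability
    mu0 = mean_vector(d)
    n_plus, n_minus = class_sizes(total, imbalance)
    X, y = [], []
    for label, n in ((+1, n_plus), (-1, n_minus)):
        for _ in range(n):
            # Identity covariance: independent unit-variance noise per coordinate.
            X.append([label * m + rng.gauss(0.0, 1.0) for m in mu0])
            y.append(label)
    return X, y


# The 21 candidate theta values 0, 0.05, ..., 1 used in each CV split.
theta_grid = [round(0.05 * k, 2) for k in range(21)]
```

Under these assumptions, a total sample size of 240 splits into whole-number class sizes for each imbalance factor in the table (1, 4 and 9), for example `simulate(100, 4)` yields 48 positive and 192 negative observations.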