Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Sparse SVM with Hard-Margin Loss: a Newton-Augmented Lagrangian Method in Reduced Dimensions
Authors: Penghe Zhang, Naihua Xiu, Hou-Duo Qi
JMLR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive numerical results on both simulated and real data sets demonstrate that the proposed method is fast, produces sparse solutions of high accuracy, and can lead to effective reduction in active samples and features when compared with several leading solvers. |
| Researcher Affiliation | Academia | Penghe Zhang (Department of Data Science and Artificial Intelligence, The Hong Kong Polytechnic University, Hung Hom, Hong Kong); Naihua Xiu (School of Mathematics and Statistics, Beijing Jiaotong University, Beijing, China); Hou-Duo Qi (Department of Applied Mathematics and Department of Data Science and Artificial Intelligence, The Hong Kong Polytechnic University, Hung Hom, Hong Kong) |
| Pseudocode | Yes | Algorithm 1 (iPAL: inexact Proximal Augmented Lagrangian Method); Algorithm 2 (PGN: Projected Gradient-Newton Method) |
| Open Source Code | No | The paper does not explicitly provide a link to its own source code repository or an affirmative statement about releasing the code for the methodology described. |
| Open Datasets | Yes | A table lists each data set with its ID, source, number of features, and number of instances (e.g. ALLAML, from the feature selection database, 7129 features, 72 instances) ... Sources: 1. https://jundongl.github.io/scikit-feature/ 2. https://archive-beta.ics.uci.edu/datasets 3. https://www.openml.org/ 4. https://www.refine.bio/ |
| Dataset Splits | Yes | Half of the samples are chosen as the training set, and the rest are used for testing. We conduct five-fold cross-validation on all the data sets in Tables 2 and 3. |
| Hardware Specification | Yes | extensive numerical experiments will be conducted by using Matlab 2022a on a laptop with 32GB memory and Intel CORE i7 2.6 GHz CPU. |
| Software Dependencies | Yes | extensive numerical experiments will be conducted by using Matlab 2022a |
| Experiment Setup | Yes | We set c₁ = c₂ = 0.1, γ = 0.1·min{aᵢ \| i ∈ [m]}, ϵₖ = λ/k (Eq. 41), and η is taken as in (24). We adopt (w⁰, ξ⁰, z⁰) = 0 as the initial point, and iPAL stops if the following criterion holds: (‖wᵏ − wᵏ⁻¹‖ + ‖ξᵏ − ξᵏ⁻¹‖ + ‖zᵏ − zᵏ⁻¹‖) / (‖wᵏ‖ + ‖ξᵏ‖ + ‖zᵏ‖) < 10⁻³. For iPAL, we set λ = 1, ρ = 1, µ = 10⁻². As s influences the Time and nnz of iPAL, we set s = 10, 20, 30, 40. |
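The stopping rule quoted in the Experiment Setup row is a relative-change test on the iterates (w, ξ, z). A minimal sketch of that test, assuming the iterates are stored as NumPy arrays (the function and variable names here are illustrative, not from the paper's code):

```python
import numpy as np

def ipal_converged(prev, curr, tol=1e-3):
    """Relative-change stopping test from the Experiment Setup row:

        (||w^k - w^{k-1}|| + ||xi^k - xi^{k-1}|| + ||z^k - z^{k-1}||)
            / (||w^k|| + ||xi^k|| + ||z^k||)  <  1e-3

    `prev` and `curr` are (w, xi, z) tuples of arrays for
    iterations k-1 and k.
    """
    # Sum of Euclidean norms of the per-block changes.
    change = sum(np.linalg.norm(c - p) for p, c in zip(prev, curr))
    # Normalize by the size of the current iterate.
    scale = sum(np.linalg.norm(c) for c in curr)
    return bool(scale > 0 and change / scale < tol)
```

Note that with the paper's initial point (w⁰, ξ⁰, z⁰) = 0 the denominator is zero at k = 0, so a guard like `scale > 0` (or simply not testing before the first update) is needed in practice.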