Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Robust Minimax Boosting with Performance Guarantees

Authors: Santiago Mazuelas, Veronica Alvarez

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The experimental results corroborate that RMBoost is not only resilient to label noise but can also provide strong classification accuracy. [...] The experiments show that RMBoost can outperform existing methods in the presence of noisy labels and also achieve strong classification accuracies without noise.
Researcher Affiliation | Academia | (1) Basque Center of Applied Mathematics (BCAM), (2) Massachusetts Institute of Technology (MIT), (3) IKERBASQUE-Basque Foundation for Science. EMAIL, EMAIL
Pseudocode | Yes | Algorithm 1 RMBoost learning algorithm
Open Source Code | Yes | The code implementing the methods presented and reproducing the experiments can be found at https://github.com/MachineLearningBCAM/RMBoost-NeurIPS-2025. The supplementary materials provide additional details and results in Appendix H, including running times assessments and the results of all the boosting methods in all label noise cases.
Open Datasets | Yes | We utilize 11 publicly available datasets that have been often use as benchmark for boosting methods: Diabetes, German Numer, Credit, Blood transfusion, Titanic, Raisin, QSAR, Climate, Susy, Higgs, and Forest covertype. These datasets can be found in the UCI repository [41] and in www.kaggle.com. [...] [41] Dheeru Dua and Casey Graff. UCI Machine Learning Repository, 2017.
Dataset Splits | Yes | The results in Table 1 in the paper as well as Table 3 below are obtained carrying out 100 random and stratified train/test partitions with 10% test samples. [...] Figures 4a and 4b are obtained computing for each noise level the classification error over 500 random stratified partitions with 10% test samples.
Hardware Specification | No | The text mentions "the absolute running times in all the methods are in the order of seconds in a regular desktop machine" but does not specify any particular hardware components like CPU, GPU, or memory models.
Software Dependencies | No | Methods RobustBoost, AdaBoost, LogitBoost, GentleBoost, and LPBoost are implemented using their Matlab codes, methods XGB-Quad and BrownBoost are implemented using the Python libraries XGBoost https://xgboost.readthedocs.io and BrownBoost https://github.com/lapis-zero09/BrownBoost, respectively, and method Robust-GBDT is implemented using the code provided by the authors [38]. While programming languages and libraries are mentioned, specific version numbers are not provided for any of them.
Experiment Setup | Yes | Input: Training samples {(x_i, y_i)}_{i=1}^n, parameters λ, K [...] In particular, we use simplex-based solvers for linear optimization with tolerances for constraints and dual feasibility of 10^-3, and we take λ = 1/sqrt(n) in all the numerical results.
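
The Dataset Splits and Experiment Setup entries describe repeated random stratified train/test partitions with 10% test samples and the choice λ = 1/sqrt(n). The following is a minimal sketch of that protocol in plain Python. It is not the authors' code: the function names (stratified_split, repeated_partitions, regularization), the seed handling, and the rounding rule for per-class test counts are assumptions made for illustration, and n is taken here to be the number of training samples.

```python
import math
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.10, rng=None):
    """Return (train_idx, test_idx), holding out about test_fraction of each class.

    Hypothetical helper: the per-class test count is rounded and forced to be
    at least 1, which is an assumption rather than a detail from the paper.
    """
    rng = rng or random.Random()
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)
        n_test = max(1, round(test_fraction * len(shuffled)))
        test_idx.extend(shuffled[:n_test])
        train_idx.extend(shuffled[n_test:])
    return sorted(train_idx), sorted(test_idx)


def repeated_partitions(labels, n_repeats=100, test_fraction=0.10, seed=0):
    """Generate n_repeats independent stratified partitions from one seeded RNG."""
    rng = random.Random(seed)
    return [stratified_split(labels, test_fraction, rng) for _ in range(n_repeats)]


def regularization(n_train):
    """lambda = 1 / sqrt(n), with n taken as the training-set size (assumption)."""
    return 1.0 / math.sqrt(n_train)


# Toy usage: 70 samples of class 0 and 30 of class 1, split 100 times.
labels = [0] * 70 + [1] * 30
partitions = repeated_partitions(labels, n_repeats=100, test_fraction=0.10, seed=0)
train_idx, test_idx = partitions[0]
lam = regularization(len(train_idx))
```

Drawing all partitions from a single seeded generator keeps the whole sequence reproducible while still giving a different split on each repetition. Averaging the test error over the partitions would then give the kind of estimate reported in the tables.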