Statistical Multicriteria Benchmarking via the GSD-Front

Authors: Christoph Jansen, Georg Schollmeyer, Julian Rodemann, Hannah Blocher, Thomas Augustin

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We illustrate our concepts on the benchmark suite PMLB and on the platform OpenML.
Researcher Affiliation | Academia | Christoph Jansen1, c.jansen@lancaster.ac.uk; Georg Schollmeyer2, georg.schollmeyer@stat.uni-muenchen.de; Julian Rodemann2, julian@stat.uni-muenchen.de; Hannah Blocher2, hannah.blocher@stat.uni-muenchen.de; Thomas Augustin2, thomas.augustin@stat.uni-muenchen.de. 1School of Computing & Communications, Lancaster University Leipzig, Leipzig, Germany. 2Department of Statistics, Ludwig-Maximilians-Universität München, Munich, Germany.
Pseudocode | No | The paper describes testing schemes with numbered steps but does not include formal pseudocode blocks or sections explicitly labeled 'Algorithm' or 'Pseudocode'.
Open Source Code | Yes | Implementations of all methods and scripts to reproduce the experiments: https://github.com/hannahblo/Statistical-Multicriteria-Benchmarking-via-the-GSD-Front.
Open Datasets | Yes | We illustrate our concepts on two well-established benchmark suites: OpenML [82, 11] and PMLB [64].
Dataset Splits | Yes | We then tune the six classifiers' hyperparameters on a (multivariate) grid of size 10 following [49] for each of the 62 datasets and eventually compute i) to iii) through 10-fold cross-validation.
Hardware Specification | No | The paper does not provide specific details on the hardware used, such as GPU/CPU models, memory, or specific computing environments.
Software Dependencies | No | The paper lists several software libraries and references associated publication years (e.g., 'Package xgboost. [Accessed: 13.05.2024]. 2023.') but does not provide specific version numbers (e.g., 'xgboost 1.7.0') for the software dependencies used in the experiments.
Experiment Setup | Yes | We select 80 binary classification datasets (according to criteria detailed in Appendix C.1) from OpenML [82] to compare the performance of Support Vector Machine (SVM) with Random Forest (RF), Decision Tree (CART), Logistic Regression (LR), Generalized Linear Model with Elastic Net (GLMNet), Extreme Gradient Boosting (XGBoost), and k-Nearest Neighbors (kNN). Our multidimensional quality metric is composed of predictive accuracy, computation time on the test data, and computation time on the training data. ... We then tune the six classifiers' hyperparameters on a (multivariate) grid of size 10 following [49] for each of the 62 datasets and eventually compute i) to iii) through 10-fold cross-validation.
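
To make the quoted setup concrete, the following minimal sketch shows how the three quality criteria named in the experiment setup (predictive accuracy, computation time on the training data, and computation time on the test data) could be collected per classifier through 10-fold cross-validation on an OpenML task. This is an illustration only, not the authors' implementation (that is linked in the repository above): Python with scikit-learn, the dataset name, the two classifiers, and their hyperparameters are all assumptions made here for the sake of a runnable example.

# Minimal sketch (not the authors' code): collect the three benchmark criteria
# described above for each classifier via 10-fold cross-validation on an
# OpenML dataset. Dataset and classifier choices are illustrative placeholders.
from sklearn.datasets import fetch_openml
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# One binary classification task fetched from OpenML (placeholder choice).
X, y = fetch_openml("diabetes", version=1, return_X_y=True, as_frame=False)

classifiers = {
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
}

results = {}
for name, clf in classifiers.items():
    # cross_validate records, per fold, the accuracy on the held-out fold
    # ("test_score"), the training time ("fit_time"), and the prediction
    # time on the held-out fold ("score_time").
    cv = cross_validate(clf, X, y, cv=10, scoring="accuracy")
    results[name] = {
        "accuracy": cv["test_score"].mean(),
        "train_time_sec": cv["fit_time"].mean(),
        "test_time_sec": cv["score_time"].mean(),
    }

for name, metrics in results.items():
    print(name, metrics)

Such a per-dataset, per-classifier triple is the kind of multidimensional quality measurement the paper feeds into its GSD-front comparison; the hyperparameter tuning on a grid of size 10 mentioned in the excerpt would correspond to an additional tuning step per classifier and dataset before the 10-fold evaluation.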