Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Statistical Comparisons of Classifiers by Generalized Stochastic Dominance

Authors: Christoph Jansen, Malte Nalenz, Georg Schollmeyer, Thomas Augustin

JMLR 2023 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We illustrate and investigate our framework in a simulation study and with a set of standard benchmark data sets." ... "5. A Simulation Study" ... "6. Experiments with UCI Data Sets"
Researcher Affiliation | Academia | All four authors (Christoph Jansen, Malte Nalenz, Georg Schollmeyer, Thomas Augustin) list the same affiliation: Department of Statistics, Ludwig-Maximilians-Universität, Ludwigstr. 33, 80539 Munich, Germany.
Pseudocode | No | Section 4.2 describes a "concrete procedure for evaluating the distribution of opt_ij" in five steps, but the steps are presented in paragraph form rather than in a clearly labeled pseudocode block or algorithm environment.
Open Source Code | No | The paper states "for an implementation of the framework, see Calvo and Santafé (2016)", which refers to an implementation by other authors, not to the source code for the methodology described in this paper. No explicit statement or link to the authors' own code is provided.
Open Datasets | Yes | "All data sets are taken from the UCI machine learning repository (Dua and Graff, 2017)."
Dataset Splits | Yes | "On each data set, 10-fold cross-validation is performed, and results are averaged for each criterion and classifier separately."
Hardware Specification | No | The paper does not provide any details about the hardware (e.g., CPU or GPU models, memory) used to run the experiments.
Software Dependencies | No | The paper cites the R packages it uses: glmnet (Friedman et al., 2010), gbm (Greenwell et al., 2020), randomForest (Liaw and Wiener, 2002), and rpart (Therneau and Atkinson, 2019), as well as R itself (R Core Team, 2021). However, it does not report version numbers for the packages or for the R environment.
Experiment Setup | Yes | "The optimal λ is determined via cross-validation. The mixing parameter in Elastic Net is set to 0.5. GBM and gradient boosted decision stumps are fit using the gbm R package (Greenwell et al., 2020). Gradient boosting uses 300 trees with a learning rate of 0.02 and a maximum depth of 3. The stumps use 500 trees and a learning rate of 0.05. Random Forest is fit using the randomForest R package (Liaw and Wiener, 2002) with default settings. For CART we use the rpart R package (Therneau and Atkinson, 2019) with default settings."
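The evaluation protocol reported above (10-fold cross-validation per data set, with fold results averaged per classifier) can be sketched as follows. The paper's experiments are carried out in R; this Python/scikit-learn version is an illustration of the protocol only, and the built-in data set is a hypothetical stand-in for a UCI benchmark.

```python
# Sketch of per-data-set 10-fold cross-validation with averaged results.
# Illustration only: the paper uses R; the data set here is a stand-in.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold, cross_val_score

X, y = load_breast_cancer(return_X_y=True)  # stand-in for one UCI data set
cv = KFold(n_splits=10, shuffle=True, random_state=0)

# One accuracy score per fold; the paper averages these per classifier.
scores = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=cv)
print(f"mean accuracy over {len(scores)} folds: {scores.mean():.3f}")
```

In the paper this loop would run once per classifier and per UCI data set, feeding the averaged criterion values into the stochastic-dominance comparison.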
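The reported hyperparameters can be made concrete with approximate scikit-learn analogues. This is an assumption-laden sketch: the paper uses the R packages gbm, randomForest, rpart, and glmnet, whose implementations differ from scikit-learn's, so the mapping below (e.g. gbm to GradientBoostingClassifier, glmnet's elastic net to a penalized LogisticRegression) is illustrative, not the authors' exact setup.

```python
# Approximate scikit-learn analogues of the hyperparameters reported in
# the paper; the R implementations (gbm, randomForest, rpart, glmnet)
# are not identical to these, so treat this as a sketch.
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

classifiers = {
    # Gradient boosting: 300 trees, learning rate 0.02, maximum depth 3.
    "GBM": GradientBoostingClassifier(
        n_estimators=300, learning_rate=0.02, max_depth=3),
    # Boosted decision stumps: 500 depth-1 trees, learning rate 0.05.
    "Stumps": GradientBoostingClassifier(
        n_estimators=500, learning_rate=0.05, max_depth=1),
    # Random forest and CART with default settings, as in the paper.
    "RF": RandomForestClassifier(),
    "CART": DecisionTreeClassifier(),
    # Elastic net with mixing parameter 0.5 (glmnet's alpha); the paper
    # tunes lambda by cross-validation, which is omitted here.
    "ElasticNet": LogisticRegression(
        penalty="elasticnet", l1_ratio=0.5, solver="saga", max_iter=5000),
}
```

Naming the full configuration in one place like this is what a versioned dependency list (which the report flags as missing) would make exactly reproducible.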