Position: Amazing Things Come From Having Many Good Models
Authors: Cynthia Rudin, Chudi Zhong, Lesia Semenova, Margo Seltzer, Ronald Parr, Jiachang Liu, Srikar Katta, Jon Donnelly, Harry Chen, Zachery Boner
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We applied a variety of machine learning methods to the data, including boosted decision trees, random forest, multi-layer perceptrons, support vector machines, logistic regression, and a 2-layer additive risk model. All of these models have completely different functional forms, from linear models to kernel-based nonparametric models with smooth decision boundaries, to tree-based nonparametric models with sharp decision boundaries, yet most of these models perform comparably, as shown in Table 1. |
| Researcher Affiliation | Academia | Department of Computer Science, Duke University, Durham, North Carolina, USA; Department of Computer Science, University of British Columbia, Vancouver, Canada. |
| Pseudocode | No | The paper describes algorithms such as TreeFARMS, the GAM Rashomon set, and FasterRisk, but does not include any explicit pseudocode blocks or labeled algorithm sections. |
| Open Source Code | No | The paper discusses and cites previously published algorithms and tools (e.g., TreeFARMS, GAMChanger, FastSparse) but does not provide a statement or link for source code from this specific paper. |
| Open Datasets | Yes | Let us work with the FICO dataset from the Explainable ML Challenge (FICO et al., 2018), though extremely similar results hold for an astounding number of other datasets (Semenova et al., 2022). |
| Dataset Splits | Yes | Table 1. Performance of different machine learning models on the 23-feature FICO dataset (Chen et al., 2022) over 10 test folds. They perform similarly. |
| Hardware Specification | No | The paper is a perspective piece that refers to experiments performed in other works and therefore does not specify hardware details for experiments of its own. |
| Software Dependencies | No | The paper refers to various algorithms and tools (e.g., GOSDT, FastSparse, TreeFARMS, GAMChanger) but does not provide version numbers for any software dependencies or libraries. |
| Experiment Setup | No | The paper reports execution times for some models (e.g., 'obtained in 8.1 seconds by the GOSDT algorithm') but does not specify hyperparameter values, training configurations, or other experimental setup details of its own. |
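
The "Research Type" and "Dataset Splits" rows together describe the evaluation behind the paper's Table 1: several model families with very different functional forms, each scored on the 23-feature FICO data over 10 test folds, all performing comparably. The sketch below illustrates that protocol with scikit-learn. It is a minimal illustration, not the authors' code: the file path `fico_heloc.csv`, the label column `RiskPerformance`, and all hyperparameters (left at library defaults, since the paper reports none) are assumptions.

```python
# Minimal sketch of the Table 1 protocol: fit several model families with
# different functional forms and compare accuracy over 10 folds.
# Assumptions (not from the paper): file path, label column name, and all
# hyperparameters, which are left at scikit-learn defaults.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical local copy of the 23-feature FICO HELOC data; the dataset
# is distributed by the Explainable ML Challenge organizers.
df = pd.read_csv("fico_heloc.csv")
X = df.drop(columns=["RiskPerformance"])          # assumed feature columns
y = (df["RiskPerformance"] == "Bad").astype(int)  # assumed label encoding

models = {
    "boosted decision trees": GradientBoostingClassifier(),
    "random forest": RandomForestClassifier(),
    "multi-layer perceptron": make_pipeline(StandardScaler(),
                                            MLPClassifier(max_iter=1000)),
    "support vector machine": make_pipeline(StandardScaler(), SVC()),
    "logistic regression": make_pipeline(StandardScaler(),
                                         LogisticRegression(max_iter=1000)),
}

# Score each model over 10 folds; per the paper, the mean accuracies should
# be close, which is the "many good models" (Rashomon set) observation.
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=10)
    print(f"{name:24s} acc = {scores.mean():.3f} +/- {scores.std():.3f}")
```

The 2-layer additive risk model from the paper's comparison has no off-the-shelf scikit-learn counterpart and is omitted from this sketch.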