A Path to Simpler Models Starts With Noise
Authors: Lesia Semenova, Harry Chen, Ronald Parr, Cynthia Rudin
NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To demonstrate our point, we computed the Rashomon ratio and pattern Rashomon ratio for 19 different datasets for hypothesis spaces of decision trees and linear models of different complexity (see Figure 2). Additionally, we introduce a measure called pattern diversity, which captures the average difference in predictions between distinct classification patterns in the Rashomon set, and motivate why it tends to increase with label noise. Our results explain a key aspect of why simpler models often tend to perform as well as black box models on complex, noisier datasets. |
| Researcher Affiliation | Academia | Lesia Semenova, Harry Chen, Ronald Parr, Cynthia Rudin — Department of Computer Science, Duke University. {lesia.semenova,harry.chen084,ronald.parr,cynthia.rudin}@duke.edu |
| Pseudocode | Yes | Algorithm 1 Branch and bound approach to find the pattern Rashomon set |
| Open Source Code | No | The paper mentions using a third-party tool, Tree FARMS [52], but does not provide access to its own source code for the methodology described. |
| Open Datasets | Yes | Table 1: Preprocessed datasets |
| Dataset Splits | Yes | For each dataset, we performed five random splits into a train set and a validation set, where the validation set size is 20% of the number of samples. Then we performed 5-fold cross-validation on the training data to choose the best depth for CART. |
| Hardware Specification | No | We performed experiments on Duke University's Computer Science Department cluster. |
| Software Dependencies | No | To compute the numerator of the Rashomon ratio, we used Tree FARMS [52]. No version number is provided for Tree FARMS or any other software dependencies. |
| Experiment Setup | Yes | For the tree depth of CART, we considered the values d ∈ {1, . . . , m}, where m is the number of features for a given dataset. We considered six different noise levels, ρ ∈ {0, 0.03, 0.05, 0.10, 0.15, 0.20, 0.25}. For every level, we performed 25 draws of Sρ. Then we performed 5-fold cross-validation on the training data to choose the best depth for CART. |
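The protocol quoted above (inject label noise at rate ρ, split 80/20 into train and validation, then 5-fold cross-validate over CART depths 1..m) can be sketched as follows. This is a hedged illustration only: it uses scikit-learn's `DecisionTreeClassifier` as a stand-in for CART and a synthetic dataset in place of the paper's 19 preprocessed datasets; the function and variable names are ours, not the authors'.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

def add_label_noise(y, rho, rng):
    """Flip each binary label independently with probability rho
    (one way to realize a noisy draw S_rho; the paper's exact
    noise mechanism may differ)."""
    flips = rng.random(len(y)) < rho
    return np.where(flips, 1 - y, y)

# Synthetic stand-in for one of the paper's datasets.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
m = X.shape[1]  # number of features, bounding the depth grid

rho = 0.10  # one of the six noise levels quoted above
y_noisy = add_label_noise(y, rho, rng)

# One of the five random splits; validation set is 20% of the samples.
X_tr, X_val, y_tr, y_val = train_test_split(
    X, y_noisy, test_size=0.20, random_state=0)

# 5-fold CV on the training data to choose the best depth d in {1, ..., m}.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"max_depth": list(range(1, m + 1))},
    cv=5)
search.fit(X_tr, y_tr)

best_depth = search.best_params_["max_depth"]
val_acc = search.score(X_val, y_val)
```

In the full experiment this inner loop would be repeated over 25 draws of Sρ per noise level and over all five train/validation splits.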