A Path to Simpler Models Starts With Noise

Authors: Lesia Semenova, Harry Chen, Ronald Parr, Cynthia Rudin

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "To demonstrate our point, we computed the Rashomon ratio and pattern Rashomon ratio for 19 different datasets for hypothesis spaces of decision trees and linear models of different complexity (see Figure 2). Additionally, we introduce a measure called pattern diversity, which captures the average difference in predictions between distinct classification patterns in the Rashomon set, and motivate why it tends to increase with label noise. Our results explain a key aspect of why simpler models often tend to perform as well as black box models on complex, noisier datasets."
Researcher Affiliation | Academia | Lesia Semenova, Harry Chen, Ronald Parr, Cynthia Rudin; Department of Computer Science, Duke University; {lesia.semenova,harry.chen084,ronald.parr,cynthia.rudin}@duke.edu
Pseudocode | Yes | "Algorithm 1: Branch and bound approach to find the pattern Rashomon set"
Open Source Code | No | The paper mentions using a third-party tool, Tree FARMS [52], but does not provide access to its own source code for the methodology described.
Open Datasets | Yes | "Table 1: Preprocessed datasets"
Dataset Splits | Yes | "For each dataset, we performed five random splits into a train set and a validation set, where the validation set size is 20% of the number of samples. Then we performed 5-fold cross-validation on the training data to choose the best depth for CART."
Hardware Specification | No | "We performed experiments on Duke University's Computer Science Department cluster."
Software Dependencies | No | "To compute the numerator of the Rashomon ratio, we used Tree FARMS [52]." No version number is provided for Tree FARMS or any other software dependencies.
Experiment Setup | Yes | "For the tree depth of CART, we considered the values d ∈ {1, ..., m}, where m is the number of features for a given dataset. We considered six different noise levels, ρ ∈ {0, 0.03, 0.05, 0.10, 0.15, 0.20, 0.25}. For every level, we performed 25 draws of Sρ. Then we performed 5-fold cross-validation on the training data to choose the best depth for CART."
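The excerpts above involve two ideas worth making concrete: the empirical Rashomon ratio (the fraction of a hypothesis space whose loss is within some ε of the best loss) and noisy label draws Sρ, where each label is flipped with probability ρ. The sketch below illustrates both on a toy problem; the hypothesis space of one-dimensional decision stumps, the synthetic dataset, and the choice ε = 0.05 are illustrative assumptions, not the paper's actual setup (which uses decision trees, linear models, and Tree FARMS).

```python
import numpy as np

def rashomon_ratio(X, y, thresholds, eps=0.05):
    """Fraction of the stump hypothesis space within eps of the best 0-1 loss.

    Hypothesis space (an illustrative stand-in for the paper's trees):
    decision stumps "predict 1 iff x > t", one per candidate threshold t.
    """
    losses = np.array([np.mean((X > t).astype(int) != y) for t in thresholds])
    best = losses.min()
    # Rashomon set = hypotheses with empirical loss <= best + eps
    return np.mean(losses <= best + eps)

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, 500)
y = (X > 0.5).astype(int)              # clean labels from a threshold rule
thresholds = np.linspace(0.0, 1.0, 101)

ratios = {}
for rho in (0.0, 0.1, 0.25):
    # One draw of S_rho: flip each label independently with probability rho
    flip = rng.uniform(size=y.size) < rho
    y_noisy = np.where(flip, 1 - y, y)
    ratios[rho] = rashomon_ratio(X, y_noisy, thresholds)
    print(f"rho={rho:.2f}  empirical Rashomon ratio={ratios[rho]:.3f}")
```

Under label noise the loss landscape flattens (every stump pays roughly a ρ floor), so more hypotheses land within ε of the best one and the ratio grows, which is the paper's central intuition for why noise favors simpler models.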