Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

A Path to Simpler Models Starts With Noise

Authors: Lesia Semenova, Harry Chen, Ronald Parr, Cynthia Rudin

NeurIPS 2023

Reproducibility Variable Result LLM Response
Research Type Experimental To demonstrate our point, we computed the Rashomon ratio and pattern Rashomon ratio for 19 different datasets for hypothesis spaces of decision trees and linear models of different complexity (see Figure 2). Additionally, we introduce a measure called pattern diversity, which captures the average difference in predictions between distinct classification patterns in the Rashomon set, and motivate why it tends to increase with label noise. Our results explain a key aspect of why simpler models often tend to perform as well as black box models on complex, noisier datasets.
Researcher Affiliation Academia Lesia Semenova, Harry Chen, Ronald Parr, Cynthia Rudin; Department of Computer Science, Duke University
Pseudocode Yes Algorithm 1 Branch and bound approach to find the pattern Rashomon set
Open Source Code No The paper mentions using a third-party tool, Tree FARMS [52], but does not provide access to its own source code for the methodology described.
Open Datasets Yes Table 1: Preprocessed datasets
Dataset Splits Yes For each dataset, we performed five random splits into a train set and a validation set, where the validation set size is 20% of the number of samples. Then we performed 5-fold cross-validation on the training data to choose the best depth for CART.
Hardware Specification No We performed experiments on Duke University's Computer Science Department cluster.
Software Dependencies No To compute the numerator of the Rashomon ratio, we used Tree FARMS [52]. No version number is provided for Tree FARMS or any other software dependencies.
Experiment Setup Yes For the tree depth of CART, we considered the values d ∈ {1, ..., m}, where m is the number of features for a given dataset. We considered six different noise levels, drawn from {0, 0.03, 0.05, 0.10, 0.15, 0.20, 0.25}. For every level, we performed 25 draws of the noisy dataset. Then we performed 5-fold cross-validation on the training data to choose the best depth for CART.
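The protocol described in the Dataset Splits and Experiment Setup rows (80/20 random split, label-noise injection at several levels, and 5-fold cross-validation over tree depths 1..m to pick the best CART depth) can be sketched as follows. This is a minimal illustration, not the authors' code: the synthetic dataset, random seeds, single split, and single noisy draw per level are all assumptions made to keep the example self-contained; the paper uses 19 preprocessed datasets, 5 random splits, and 25 noisy draws per level.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in for one of the paper's 19 preprocessed datasets (Table 1).
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
m = X.shape[1]  # number of features; candidate depths are d in {1, ..., m}

rng = np.random.default_rng(0)
noise_levels = [0, 0.03, 0.05, 0.10, 0.15, 0.20, 0.25]

for noise in noise_levels:
    # One random train/validation split; validation set is 20% of the samples.
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.2, random_state=0)

    # One noisy draw (the paper uses 25 per level): flip each training
    # label independently with probability `noise`.
    flip = rng.random(len(y_train)) < noise
    y_noisy = np.where(flip, 1 - y_train, y_train)

    # 5-fold cross-validation on the (noisy) training data to choose
    # the best CART depth.
    search = GridSearchCV(DecisionTreeClassifier(random_state=0),
                          {"max_depth": range(1, m + 1)}, cv=5)
    search.fit(X_train, y_noisy)
```

Computing the Rashomon ratio itself additionally requires enumerating the set of near-optimal trees, which the paper delegates to the third-party TreeFARMS tool [52]; that step is not reproduced here.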