Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Visualizing the Implicit Model Selection Tradeoff
Authors: Zezhen He, Yaron Shaposhnik
JAIR 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Using various datasets and a simple Python interface, we demonstrate how practitioners and researchers could benefit from applying these approaches to better understand the broader impact of their model selection choices. We demonstrate how these methods can be used for various datasets from the UCI ML Repository. We next describe the datasets, training process, classification models, hyperparameters, and the DR methods used in our experiments. |
| Researcher Affiliation | Academia | Zezhen (Dawn) He EMAIL Simon Business School, University of Rochester Rochester, NY 14627. Yaron Shaposhnik EMAIL Simon Business School, University of Rochester Rochester, NY 14627. |
| Pseudocode | No | The paper describes methods in narrative and mathematical forms, but does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | We make the code available online to facilitate exploration and adoption (see Appendix E). Appendix E. Python Programming Interface... The code is available at https://github.com/zhesimon/Comparative Meta Models. |
| Open Datasets | Yes | We use datasets from the UCI Machine Learning Repository (Asuncion & Newman, 2007) and FICO's Explainable Machine Learning Challenge (FICO, 2018). The description of the specific datasets used in this paper can be found in Appendix B. |
| Dataset Splits | Yes | We apply a standard model training and evaluation process of randomly partitioning each of the datasets into an 80% training set and a 20% test set. We evaluate the training error using five-fold CV on the training set and evaluate the test error on the test set. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used for running its experiments, such as specific GPU/CPU models or processor types. |
| Software Dependencies | No | The paper mentions using a 'Python interface' and references 'Scikit-learn: Machine learning in Python (Pedregosa et al., 2011)', but does not provide specific version numbers for Python, Scikit-learn, or any other key software dependencies. |
| Experiment Setup | Yes | We apply hyperparameter tuning using five-fold CV to determine the configuration with the best prediction accuracy for each model. We tune the typical hyperparameters of each model. The specific values used as hyperparameters for each model are described in Appendix C. Appendix C. Tuning Parameters for Section 4. Appendix D. Tuning Parameters for the Section 8 Case Study. |
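The evaluation protocol extracted above (random 80/20 train/test partition, five-fold CV for training error, and five-fold CV hyperparameter tuning) can be sketched in scikit-learn, which the paper cites. This is a minimal illustration only: the dataset, model, and parameter grid below are placeholders, not the paper's actual choices from Appendices B and C.

```python
# Sketch of the reported protocol: 80/20 split, five-fold CV grid search,
# CV training error and held-out test error. Dataset and grid are illustrative.
from sklearn.datasets import load_breast_cancer  # stand-in for a UCI dataset
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Random 80% training / 20% test partition, as in the "Dataset Splits" row.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Five-fold CV grid search over typical hyperparameters ("Experiment Setup" row);
# this grid is a placeholder, not the values from Appendix C.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"max_depth": [2, 4, 8], "min_samples_leaf": [1, 5]},
    cv=5,
)
search.fit(X_train, y_train)

# Training error via five-fold CV on the training set; test error on the 20% hold-out.
cv_accuracy = cross_val_score(search.best_estimator_, X_train, y_train, cv=5).mean()
test_accuracy = search.best_estimator_.score(X_test, y_test)
print(f"CV accuracy: {cv_accuracy:.3f}, test accuracy: {test_accuracy:.3f}")
```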