Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..
On the Convergence of CART under Sufficient Impurity Decrease Condition
Authors: Rahul Mazumder, Haoyue Wang
NeurIPS 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we investigate the convergence rate of CART under a regression setting. First, we establish an upper bound on the prediction error of CART under a sufficient impurity decrease (SID) condition [10] our result improves upon the known result by [10] under a similar assumption. Furthermore, we provide examples that demonstrate the error bound cannot be further improved by more than a constant or a logarithmic factor. Second, we introduce a set of easily verifiable sufficient conditions for the SID condition. |
| Researcher Affiliation | Academia | Rahul Mazumder Sloan School of Management Massachusetts Institute of Technology EMAIL Haoyue Wang Operations Research Center Massachusetts Institute of Technology EMAIL |
| Pseudocode | No | The paper describes the CART algorithm in detail in prose (Section 1.3) but does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link indicating that source code for the described methodology is publicly available. |
| Open Datasets | No | The paper describes a synthetic dataset for an example: "Example 2.1 (Univariate linear function) Suppose p = 1, and xi are i.i.d. from the uniform distribution on [0, 1]." However, this is not a publicly available dataset with a link, DOI, or formal citation. |
| Dataset Splits | No | The paper uses a synthetic example but does not provide specific training/test/validation dataset splits or cross-validation details for it. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used for the simulation presented in Figure 1. |
| Software Dependencies | No | The paper does not provide specific software dependencies or version numbers for the simulation conducted. |
| Experiment Setup | No | For the simulation in Example 2.1, the paper states "the depth d is taken as stated above (11)" which refers to a theoretical choice of `d = log2(n)/(1 log2(1 λ))`. This is a theoretical parameter choice rather than specific experimental setup details like learning rates, batch sizes, or optimizer settings typically found in an experimental setup description. |