Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Stability of Random Forests and Coverage of Random-Forest Prediction Intervals
Authors: Yan Wang, Huaiqing Wu, Dan Nettleton
NeurIPS 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The paper states "Empirical results show that stability may persist even beyond our assumption and hold for heavy-tailed $Y^2$" and "Numerically show that RF stability may hold beyond the above light-tail assumption." |
| Researcher Affiliation | Academia | Yan Wang, Department of Mathematics, Wayne State University, Detroit, MI 48202; Huaiqing Wu and Dan Nettleton, Department of Statistics, Iowa State University, Ames, IA 50011. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described in this paper. It mentions using the "randomForest" package in R, which is a third-party tool, but does not provide its own implementation code. |
| Open Datasets | No | We created a virtual dataset consisting of $n = 4000$ points. We let $Y$ be a standard Cauchy random variable, which is even without a well-defined mean. The feature vector $X \in \mathbb{R}^3$ is determined as $X = [0.5Y + \sin(Y),\ Y^2 - 0.2Y^3,\ I\{Y > 0\} + \zeta]^T$, where $\zeta$ is a standard normal random variable. The paper describes how the data was generated but does not provide access information (link, DOI, repository) for this virtual dataset. |
| Dataset Splits | Yes | We used 3000 of the points for training and 1000 of them as test points. |
| Hardware Specification | No | The paper only states: 'The computation can be done within a few minutes on a laptop.' This is not specific enough to identify the hardware. |
| Software Dependencies | No | The paper mentions using the "randomForest package in R" but does not provide specific version numbers for the software dependencies. |
| Experiment Setup | Yes | Using the randomForest package with default setting (except letting $B = 1000$), we had an output RF predictor $\mathrm{RF}_B$. |
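
Because the virtual dataset is described but not distributed, the rows on data generation and splits can be illustrated with a minimal sketch in standard-library Python. Everything here is an assumption rather than the authors' code: the `+` and `-` operators in the feature formula were lost in the extracted text and are guessed as $X = [0.5Y + \sin(Y),\ Y^2 - 0.2Y^3,\ I\{Y > 0\} + \zeta]^T$, the function names and seed are hypothetical, and the split simply takes the first 3000 points for training, since the paper excerpt does not say how the 3000 were chosen.

```python
import math
import random


def make_virtual_dataset(n=4000, seed=0):
    """Simulate (X, Y) pairs following the Open Datasets row.

    ASSUMPTION: the operators in the feature formula are a guess, namely
    X = [0.5*Y + sin(Y), Y^2 - 0.2*Y^3, 1{Y > 0} + zeta].
    """
    rng = random.Random(seed)  # hypothetical seed; none is given in the source
    features, responses = [], []
    for _ in range(n):
        # Standard Cauchy draw via the inverse CDF: tan(pi * (U - 0.5)).
        y = math.tan(math.pi * (rng.random() - 0.5))
        zeta = rng.gauss(0.0, 1.0)  # standard normal noise
        x = [
            0.5 * y + math.sin(y),
            y ** 2 - 0.2 * y ** 3,
            (1.0 if y > 0 else 0.0) + zeta,
        ]
        features.append(x)
        responses.append(y)
    return features, responses


def train_test_split(features, responses, n_train=3000):
    """ASSUMPTION: first n_train points train, the remainder test."""
    train = (features[:n_train], responses[:n_train])
    test = (features[n_train:], responses[n_train:])
    return train, test


X, Y = make_virtual_dataset()
(X_train, Y_train), (X_test, Y_test) = train_test_split(X, Y)
print(len(X_train), len(X_test))  # 3000 1000
```

Fitting the forest is left out of the sketch. In R it would presumably be a call to `randomForest` with `ntree = 1000` and all other arguments at their defaults, assuming $B$ denotes the number of trees.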