Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Scalable and Efficient Hypothesis Testing with Random Forests
Authors: Tim Coleman, Wei Peng, Lucas Mentch
JMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Simulations and applications to ecological data, where random forests have recently shown promise, are provided. ... In Section 4, we present simulation studies of the testing procedure for a variety of underlying regression functions, as well as a comparison with two different knockoff statistics. In Section 5, we apply our procedure to multiple ecological datasets where random forests have been successfully employed in recent applied work. |
| Researcher Affiliation | Academia | Tim Coleman (EMAIL), Wei Peng (EMAIL), Lucas Mentch (EMAIL); Department of Statistics, University of Pittsburgh, Pittsburgh, PA 15215, USA |
| Pseudocode | Yes | Algorithm 1: Permutation test pseudocode for variable importance |
| Open Source Code | No | The paper mentions using the "randomForest package in R (Liaw and Wiener, 2002)" and the "ranger package (Wright and Ziegler, 2015)", but these are third-party tools. There is no explicit statement or link indicating that the authors' own implementation of the methodology described in the paper is publicly available. |
| Open Datasets | Yes | Model 4 where the true data generating model is a random forest. We utilize a dataset from Coleman et al. (2017) ... Fish Toxicity: We simulate X from the UCI fish toxicity data set provided by Cassotti et al. (2015) ... Forest Fires: Cortez and Morais (2007) sought to predict log(1+area) burned by several fires in northern Portugal using covariate information on location, time of year, and local weather characteristics. |
| Dataset Splits | Yes | For each of our simulations, we train random forests using the randomForest package in R (Liaw and Wiener, 2002) using the default mtry parameters. ... In both settings, we draw n = 2000 points from the joint distribution of (X, Y), subsample sizes of $k_n = n^{0.6} \approx 95$, and build B = 125 trees in each forest. Predictions were made at $N_t = 100$ test points... For our procedure, we build 125 trees, hold out 90 observations at random for testing... Here we select 15% of the available observations (3800 points) uniformly at random to serve as the test set where the hypotheses will be evaluated. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, memory amounts) used for running its experiments. It only discusses software, datasets, and experimental setup parameters. |
| Software Dependencies | No | We train random forests using the randomForest package in R (Liaw and Wiener, 2002) using the default mtry parameters. ... The random forests were trained with the ranger package using the default mtry = 4... The paper mentions specific software packages (the randomForest package in R, the ranger package) but does not provide version numbers for these packages or for R itself. |
| Experiment Setup | Yes | For each of our simulations, we train random forests using the randomForest package in R (Liaw and Wiener, 2002) using the default mtry parameters. ... subsample sizes of $k_n = n^{0.6} \approx 95$, and build B = 125 trees in each forest. Predictions were made at $N_t = 100$ test points... For Models 1 and 2, we focus on a marginal signal to noise ratio, which is controlled by the parameters β and σ. We fix β = 10 across all simulations and let σ = 10/j where j takes 9 equally spaced values between 0.005 and 2.25... for Model 3, we let $k_n = n^{0.6} \approx 46$, B = 125, $N_t = 100$, and vary the β coefficient according to 8 equally spaced values between 0.01 and 2.5 and also for 7 equally spaced values between 5 and 20. In Model 4, we let n = 2000, $k_n = n^{0.6}$, B = 125, $N_t = 100$, and let $\sigma = e^{j}$ for 10 values of j equally spaced between 1 and 5. ... The random forests were trained with the ranger package using the default mtry = 4, subsamples of size $k_n = n^{0.6}$, and consisting of B = 250 trees in each. ... using mtry = 12 and $k_n = n^{0.6} \approx 43$, B = 250 trees for the importance test and B = 500 trees for the overall test |
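
The simulation settings quoted in the Experiment Setup row for Models 1 and 2 can be checked with a few lines of arithmetic. The sketch below is not the authors' code (the paper's experiments use R); it is a minimal Python illustration, and the helper `equally_spaced`, the variable names, and the choice to include both endpoints of the grid are assumptions made here for illustration.

```python
import math


def equally_spaced(start: float, stop: float, count: int) -> list[float]:
    """Return `count` evenly spaced values from start to stop, endpoints included.

    Including both endpoints is an assumption; the quoted text only says
    "equally spaced values between" the two bounds.
    """
    step = (stop - start) / (count - 1)
    return [start + i * step for i in range(count)]


# Settings quoted for Models 1 and 2.
n = 2000            # training points drawn from the joint distribution of (X, Y)
B = 125             # trees per forest
N_t = 100           # test points
k_n = n ** 0.6      # real-valued subsample size; falls between 95 and 96

# Noise grid: beta is fixed at 10 and sigma = 10 / j over 9 values of j.
beta = 10.0
j_grid = equally_spaced(0.005, 2.25, 9)
sigma_grid = [10.0 / j for j in j_grid]

print(f"k_n = n^0.6 = {k_n:.2f}")
for j, sigma in zip(j_grid, sigma_grid):
    print(f"j = {j:.4f}  sigma = {sigma:.3f}")
```

Under these assumptions, the noise level falls from σ = 2000 at the smallest j to roughly 4.44 at the largest, so the grid sweeps from an extremely weak marginal signal to a strong one while β stays fixed.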