Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Decorrelated Variable Importance
Authors: Isabella Verdinelli, Larry Wasserman
JMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 4 contains some simulation studies. ... The results from 100 simulations are summarized in Figures 2 and 3 and in Table 2. The standard error of the coverage is 0.03. Figure 2 shows how often the confidence interval contains the target parameter ψ0 as a function of the correlation which varies from 0 to 1. |
| Researcher Affiliation | Academia | Isabella Verdinelli EMAIL Department of Statistics Carnegie Mellon University 5000 Forbes Ave. Pittsburgh, PA 15213, USA. Larry Wasserman EMAIL Department of Statistics Carnegie Mellon University 5000 Forbes Ave. Pittsburgh, PA 15213, USA. |
| Pseudocode | No | No explicit pseudocode or algorithm blocks are provided. The methodology is described through mathematical equations and textual explanations. |
| Open Source Code | No | No concrete access to source code (specific repository link, explicit code release statement, or code in supplementary materials) is provided for the methodology described in this paper. |
| Open Datasets | No | In this section, we compare the behavior of the different parameters in some synthetic examples. ... Example 1. We start with a very simple scenario where Y = 2X + ϵ, ϵ ∼ N(0, 1), Z1 = δX + ξ, ξ ∼ N(0, 1), and (Z2, ..., Z5) ∼ N(0, I). ... Examples 2-5. Now we consider four multivariate examples. In each case, n = 10,000, h = 5 and ϵ ∼ N(0, 1). |
| Dataset Splits | No | No specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) needed to reproduce the data partitioning is provided. The paper describes generating synthetic data for its examples. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | For the additive models we use the R package mgcv. For random forests we use the R package grf. |
| Experiment Setup | Yes | For the additive models we use the R package mgcv. For random forests we use the R package grf. We always use the default settings making no attempt to tune the methods to achieve good coverage. ... In each case, n = 10,000, h = 5 and ϵ ∼ N(0, 1). |
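
Because no code or data is released, the synthetic design quoted in the Open Datasets row has to be regenerated by hand. The following is a minimal sketch of the Example 1 data-generating process, not the authors' code. It is written in Python with only the standard library, whereas the paper reports using R. The quoted text does not state the distribution of X, so X ∼ N(0, 1) is an assumption here; the function names `simulate_example1` and `sample_correlation` are hypothetical.

```python
import math
import random


def simulate_example1(n, delta, seed=0):
    """Draw n samples (y, x, z) from the Example 1 design.

    Y  = 2X + eps,          eps ~ N(0, 1)
    Z1 = delta * X + xi,    xi  ~ N(0, 1)
    (Z2, ..., Z5) ~ N(0, I)

    Assumption (not in the quoted text): X ~ N(0, 1).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)  # assumed distribution of X
        eps = rng.gauss(0.0, 1.0)
        xi = rng.gauss(0.0, 1.0)
        y = 2.0 * x + eps
        z1 = delta * x + xi
        # Z2..Z5 are independent standard normals, unrelated to X and Y.
        z_rest = [rng.gauss(0.0, 1.0) for _ in range(4)]
        rows.append((y, x, [z1] + z_rest))
    return rows


def sample_correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


if __name__ == "__main__":
    rows = simulate_example1(n=2000, delta=1.0, seed=1)
    xs = [row[1] for row in rows]
    z1s = [row[2][0] for row in rows]
    print("corr(X, Z1) =", round(sample_correlation(xs, z1s), 3))
```

Under the assumed X ∼ N(0, 1), the population correlation between X and Z1 is δ / sqrt(δ² + 1), so varying `delta` sweeps the correlation between 0 and (in the limit) 1, which mirrors the correlation axis described in the Research Type row.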