A Residual Bootstrap for High-Dimensional Regression with Near Low-Rank Designs
Authors: Miles Lopes
NeurIPS 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In four different settings of $n$, $p$, and the decay parameter $\eta$, we compared the nominal 90% confidence intervals (CIs) of four methods: oracle, ridge, normal, and OLS, to be described below. In each setting, we generated $N_1 := 100$ random designs $X$ with i.i.d. rows drawn from $N(0, \Sigma)$, where $\lambda_j(\Sigma) = j^{-\eta}$, $j = 1, \dots, p$, and the eigenvectors of $\Sigma$ were drawn randomly by setting them to be the Q factor in a QR decomposition of a standard $p \times p$ Gaussian matrix. Then, for each realization of $X$, we generated $N_2 := 1000$ realizations of $Y$ according to the model (1), where $\beta = \mathbf{1}/\lVert \mathbf{1} \rVert_2 \in \mathbb{R}^p$, and $F_0$ is the centered $t$ distribution on 5 degrees of freedom, rescaled to have standard deviation $\sigma = 0.1$. Table 1: Comparison of nominal 90% confidence intervals. |
| Researcher Affiliation | Academia | Miles E. Lopes, Department of Statistics, University of California, Berkeley, Berkeley, CA 94720, mlopes@stat.berkeley.edu |
| Pseudocode | Yes | Resampling algorithm. To summarize the discussion above, if $B$ is a user-specified number of bootstrap replicates, our proposed method for approximating $\Psi_\rho(F_0; c)$ is given below. 1. Select $\rho$ and $\varrho$, and compute the residuals $\hat{e}(\varrho) = Y - X\hat{\beta}_\varrho$. 2. Compute the centered distribution function $\hat{F}_\varrho$, putting mass $1/n$ at each $\hat{e}_i(\varrho) - \bar{e}(\varrho)$. 3. For $j = 1, \dots, B$: Draw a vector $\varepsilon^* \in \mathbb{R}^n$ of $n$ i.i.d. samples from $\hat{F}_\varrho$. Compute $z_j := c^\top (X^\top X + \rho I_{p \times p})^{-1} X^\top \varepsilon^*$. 4. Return the empirical distribution of $z_1, \dots, z_B$. |
| Open Source Code | No | The paper does not provide any statements or links indicating the availability of open-source code for the described methodology. |
| Open Datasets | No | The paper describes a data generation process for simulations ('we generated N1 := 100 random designs X with i.i.d. rows drawn from N(0, Σ)...'), but does not use a publicly available dataset, nor does it provide access information for the generated data. |
| Dataset Splits | Yes | To choose the parameters $\rho$ and $\varrho$ for a given $X$ and $Y$, we first computed $\hat{r}$ as the value that optimized the MSPE error of a ridge estimator $\hat{\beta}_r$ with respect to 5-fold cross validation; i.e., cross validation was performed for every distinct pair $(X, Y)$. |
| Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers. |
| Experiment Setup | Yes | To choose the parameters $\rho$ and $\varrho$ for a given $X$ and $Y$, we first computed $\hat{r}$ as the value that optimized the MSPE error of a ridge estimator $\hat{\beta}_r$ with respect to 5-fold cross validation; i.e., cross validation was performed for every distinct pair $(X, Y)$. We then put $\varrho = 5\hat{r}$ and $\rho = 0.1\hat{r}$, as we found the prefactors 5 and 0.1 to work adequately across various settings. (Optimizing $\varrho$ with respect to MSPE is motivated by Theorems 1, 2, and 3. Also, choosing $\rho$ to be somewhat smaller than $\varrho$ conforms with the constraints on $\theta$ and $\gamma$ in Theorem 4.) |
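
The resampling algorithm and simulation design quoted in the table can be sketched in NumPy as follows. This is an illustrative reading of the quoted text, not the author's code (none was released). The function names `simulate`, `ridge_fit`, and `residual_bootstrap` are hypothetical, and the sketch assumes that model (1) is the linear model $Y = X\beta + \varepsilon$, that the ridge estimator is $\hat{\beta}_\varrho = (X^\top X + \varrho I)^{-1} X^\top Y$ with no additional scaling of $X^\top X$, and that the eigenvalue decay is $\lambda_j(\Sigma) = j^{-\eta}$.

```python
import numpy as np


def simulate(n, p, eta, sigma=0.1, rng=None):
    """Draw one (X, Y) pair following the quoted simulation design (assumed reading)."""
    rng = np.random.default_rng(rng)
    # Assumed eigenvalue decay: lambda_j(Sigma) = j^(-eta), j = 1, ..., p.
    lam = np.arange(1, p + 1, dtype=float) ** (-eta)
    # Random eigenvectors: Q factor of a standard p x p Gaussian matrix.
    Q, _ = np.linalg.qr(rng.standard_normal((p, p)))
    # Rows of X are N(0, Sigma) with Sigma = Q diag(lam) Q^T.
    X = (rng.standard_normal((n, p)) * np.sqrt(lam)) @ Q.T
    # beta is the all-ones vector normalized to unit Euclidean length.
    beta = np.ones(p) / np.sqrt(p)
    # A t(5) variable has variance 5/3; rescale to standard deviation sigma.
    noise = rng.standard_t(5, size=n) * sigma / np.sqrt(5.0 / 3.0)
    return X, X @ beta + noise, beta


def ridge_fit(X, Y, lam):
    """Ridge estimator (X^T X + lam I)^{-1} X^T Y (assumed unscaled form)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ Y)


def residual_bootstrap(X, Y, c, rho, varrho, B=500, rng=None):
    """Return z_1, ..., z_B from the four-step resampling algorithm."""
    rng = np.random.default_rng(rng)
    n, p = X.shape
    # Step 1: residuals of the ridge fit with parameter varrho.
    resid = Y - X @ ridge_fit(X, Y, varrho)
    # Step 2: center the residuals; F_hat puts mass 1/n on each entry.
    centered = resid - resid.mean()
    # Precompute w = X (X^T X + rho I)^{-1} c, so that z_j = w^T eps*.
    w = X @ np.linalg.solve(X.T @ X + rho * np.eye(p), c)
    # Step 3: B resamples of n i.i.d. draws from F_hat, one per row.
    eps_star = rng.choice(centered, size=(B, n), replace=True)
    # Step 4: the B values whose empirical distribution is returned.
    return eps_star @ w
```

A usage example under the same assumptions: `X, Y, beta = simulate(100, 50, eta=0.5, rng=0)` followed by `z = residual_bootstrap(X, Y, c=np.ones(50) / np.sqrt(50), rho=0.1, varrho=5.0, rng=1)`. The values of `rho` and `varrho` here are placeholders; the quoted setup instead sets $\varrho = 5\hat{r}$ and $\rho = 0.1\hat{r}$ from a cross-validated $\hat{r}$, which this sketch does not compute.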