Lazy Estimation of Variable Importance for Large Neural Networks

Authors: Yue Gao, Abby Stevens, Garvesh Raskutti, Rebecca Willett

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate through simulations that our method is fast and accurate under several data-generating regimes, and we demonstrate its real-world applicability on a seasonal climate forecasting example.
Researcher Affiliation | Academia | ¹Department of Statistics, University of Wisconsin, Madison; ²Department of Statistics, University of Chicago; ³Department of Computer Science, University of Chicago.
Pseudocode | Yes | Algorithm 1: Lazy training for VI
Open Source Code | Yes | Our implementation is available at https://github.com/Willett-Group/lazyvi.
Open Datasets | Yes | We use simulations from the Community Earth System Model-Large Ensemble project (CESM-LENS; Kay et al. (2015); de La Beaujardière et al. (2019)).
Dataset Splits | Yes | We estimate h_{θ̂_f} and h_{θ̂_{-j}} using n_1 < n samples as training data, and use the remaining n_2 = n − n_1 samples to estimate VI. For the lazy training method, which we call Lazy VI, we use the training data to estimate the full model parameters, compute the gradient of the network with respect to each model parameter for each training sample, and then regress these gradients against the difference between Y and the dropout estimates from the training data to estimate the parameter correction ∆θ_j for variable j. We then update the full model parameters using this learned correction to compute the VI estimate and its associated standard errors. See Algorithm 1 for full details. Theorem 4.4 makes the assumption that the ridge parameter λ from Equation (15) is large. Since we are ultimately interested in estimating h_{θ̂_{-j}} and not ∆θ_j, we evaluate h_{θ̂_f + ∆θ_j}(·) through K-fold CV to choose λ̂_j for each variable (Algorithm 2 in Appendix C.2).
Hardware Specification | No | The paper discusses training neural networks but does not provide specific details about the hardware used, such as GPU models, CPU specifications, or memory.
Software Dependencies | No | The paper mentions that its implementation is available on GitHub, implying specific software dependencies, but it does not explicitly list software names with their version numbers within the text.
Experiment Setup | Yes | For these experiments, we train a wide, fully connected two-layer neural network with ReLU activation for all simulations. Unless otherwise specified, the width of the hidden layer in the training network is m = 50.
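
The Experiment Setup row describes the training network only at a high level. Below is a minimal PyTorch sketch of a wide, fully connected two-layer ReLU network with hidden width m = 50, matching that description; the class name, argument names, and output dimension are illustrative choices, not taken from the authors' lazyvi code.

```python
import torch
import torch.nn as nn

class TwoLayerReLU(nn.Module):
    """Wide two-layer fully connected network with ReLU activation.

    Hidden width m = 50 matches the default reported in the paper's
    experiment setup; everything else here is an illustrative choice.
    """
    def __init__(self, d_in: int, m: int = 50):
        super().__init__()
        self.hidden = nn.Linear(d_in, m)
        self.out = nn.Linear(m, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.out(torch.relu(self.hidden(x)))
```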
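The Dataset Splits row summarizes the Lazy VI procedure (Algorithm 1): fit the full network on n_1 training samples, linearize it around the fitted parameters θ̂_f, regress the residuals Y minus the dropout estimates on the per-sample parameter gradients with a ridge penalty to obtain the correction ∆θ_j, and evaluate the corrected network on the remaining n_2 samples. The sketch below follows that outline for a PyTorch model such as the TwoLayerReLU above, under assumptions the quoted text does not settle: variable j is "dropped" by zeroing its column, the gradients are taken at the dropout inputs, and the VI estimate is a difference of held-out mean squared errors. The function names (lazy_vi, per_sample_param_grads) are hypothetical, not the authors' implementation.

```python
import copy
import torch
from torch.nn.utils import parameters_to_vector, vector_to_parameters

def per_sample_param_grads(model, X):
    """Gradient of the scalar network output w.r.t. all parameters, one
    flattened row per sample (the design matrix of the lazy, linearized model)."""
    rows = []
    for x in X:
        out = model(x.unsqueeze(0)).squeeze()
        grads = torch.autograd.grad(out, tuple(model.parameters()))
        rows.append(torch.cat([g.reshape(-1) for g in grads]).detach())
    return torch.stack(rows)                            # shape (n1, p)

def lazy_vi(model, X_tr, y_tr, X_te, y_te, j, lam=1.0):
    """Illustrative Lazy VI point estimate for variable j (no standard errors)."""
    # "Dropout" inputs: remove variable j's information by zeroing its column
    # (an assumption; the paper may handle the dropped variable differently).
    X_tr_j = X_tr.clone(); X_tr_j[:, j] = 0.0
    X_te_j = X_te.clone(); X_te_j[:, j] = 0.0

    with torch.no_grad():
        resid = y_tr - model(X_tr_j).squeeze()          # Y minus dropout estimates

    # Ridge regression of the residuals on the parameter gradients gives the
    # parameter correction; lam plays the role of the ridge parameter λ.
    G = per_sample_param_grads(model, X_tr_j)           # (n1, p)
    p = G.shape[1]
    delta = torch.linalg.solve(G.T @ G + lam * torch.eye(p), G.T @ resid)

    # Apply the learned correction to a copy of the fitted full network.
    corrected = copy.deepcopy(model)
    vector_to_parameters(parameters_to_vector(model.parameters()) + delta,
                         corrected.parameters())

    # VI estimate on the held-out n2 samples: reduced-model MSE minus full-model MSE.
    with torch.no_grad():
        mse_full = torch.mean((y_te - model(X_te).squeeze()) ** 2)
        mse_reduced = torch.mean((y_te - corrected(X_te_j).squeeze()) ** 2)
    return (mse_reduced - mse_full).item()
```

Per the quoted text, the paper chooses the ridge parameter per variable by K-fold cross-validation over the fit of h_{θ̂_f + ∆θ_j} (Algorithm 2 in Appendix C.2); in this sketch lam is left as a fixed argument for brevity.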