Efficient Sampling for Learning Sparse Additive Models in High Dimensions

Authors: Hemant Tyagi, Bernd Gärtner, Andreas Krause

NeurIPS 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Simulation results. We now provide simulation results on synthetic data to support our theoretical findings. We consider the noisy setting, with the point queries corrupted by Gaussian noise. For d = 1000, k = 4 and S = {2, 105, 424, 782}, consider f : R^d → R where f = φ₂(x₂) + φ₁₀₅(x₁₀₅) + φ₄₂₄(x₄₂₄) + φ₇₈₂(x₇₈₂) with φ₂(x) = sin(πx), φ₁₀₅(x) = exp(−2x), φ₄₂₄(x) = (1/3) cos³(πx) + 0.8x², φ₇₈₂(x) = 0.5x⁴ − x² + 0.8x. We choose δ = 0.3, D = 0.2, which can be verified as valid parameters for the above φ_l's. Furthermore, we choose m_x = ⌈2/δ⌉ = 7 and m_v = ⌈2k log d⌉ = 56 to satisfy the conditions of Theorem 4. Next, we choose constants C = 0.2, B₂ = 35 and κ = 0.95 · D²/(16C²kB₂) = 4.24 × 10⁻⁴ as required by Theorem 4. For the choice of step size ϵ = 0.0267, we then query f at (2m_x + 1)(m_v + 1) = 855 points. The function values are corrupted with Gaussian noise N(0, σ²/N) for σ = 0.01 and N = 100; this is equivalent to resampling and averaging the point queries N times. Importantly, the sufficient condition on N stated in Theorem 4 evaluates to N ≥ 6974 for p = 0.1, so we consider a significantly undersampled regime. Lastly, we select the threshold τ = 0.2875 as stated by Theorem 4 and employ Algorithm 1 for different values of the smoothing parameter γ. Over 10 independent runs of the algorithm, we observed that S was recovered exactly each time. (A minimal simulation sketch of this setup follows the table.)
Researcher Affiliation | Academia | Hemant Tyagi, ETH Zürich, htyagi@inf.ethz.ch; Andreas Krause, ETH Zürich, krausea@ethz.ch; Bernd Gärtner, ETH Zürich, gaertner@inf.ethz.ch
Pseudocode | Yes | Algorithm 1 (learning the φ_l in the SPAM f(x) = Σ_{l∈S} φ_l(x_l)):
1. Choose m_x, m_v and construct sampling sets X and V as in (3.2), (3.3).
2. Choose step size ϵ > 0. Query f at ξ_i and ξ_i + ϵv_j for i = −m_x, ..., m_x and j = 1, ..., m_v.
3. Construct y_i where y_{i,j} = (f(ξ_i + ϵv_j) − f(ξ_i))/ϵ for i = −m_x, ..., m_x and j = 1, ..., m_v.
4. Set x̂_i := argmin_{y_i = Vz} ‖z‖₁. For τ > 0, compute Ŝ = ∪_{i=−m_x}^{m_x} {l ∈ {1, ..., d} : |x̂_{i,l}| > τ}.
5. For each l ∈ Ŝ, run (P) as defined in Section 4 using (x̂_{i,l})_{i=−m_x}^{m_x}, τ and some smoothing parameter γ ≥ 0, to obtain φ̂′_l.
6. For each l ∈ Ŝ, set φ_{est,l} to be the piecewise integral of φ̂′_l as explained in Section 4.
(A Python sketch of steps 1-4 follows the table.)
Open Source Code | No | The paper does not provide any explicit statement about releasing source code or include a link to a code repository.
Open Datasets | No | The paper uses synthetic data generated according to specified functions and noise models, but does not provide access information for a publicly available or open dataset. The data is generated programmatically within the described setup.
Dataset Splits | No | The paper uses synthetic data and focuses on recovery accuracy, but does not explicitly mention training, validation, or test dataset splits in the context of machine learning model training.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names).
Experiment Setup | Yes | For d = 1000, k = 4 and S = {2, 105, 424, 782}, consider f : R^d → R where f = φ₂(x₂) + φ₁₀₅(x₁₀₅) + φ₄₂₄(x₄₂₄) + φ₇₈₂(x₇₈₂) with φ₂(x) = sin(πx), φ₁₀₅(x) = exp(−2x), φ₄₂₄(x) = (1/3) cos³(πx) + 0.8x², φ₇₈₂(x) = 0.5x⁴ − x² + 0.8x. We choose δ = 0.3, D = 0.2, which can be verified as valid parameters for the above φ_l's. Furthermore, we choose m_x = ⌈2/δ⌉ = 7 and m_v = ⌈2k log d⌉ = 56 to satisfy the conditions of Theorem 4. Next, we choose constants C = 0.2, B₂ = 35 and κ = 0.95 · D²/(16C²kB₂) = 4.24 × 10⁻⁴ as required by Theorem 4. For the choice of step size ϵ = 0.0267, we then query f at (2m_x + 1)(m_v + 1) = 855 points. The function values are corrupted with Gaussian noise N(0, σ²/N) for σ = 0.01 and N = 100; this is equivalent to resampling and averaging the point queries N times. Importantly, the sufficient condition on N stated in Theorem 4 evaluates to N ≥ 6974 for p = 0.1, so we consider a significantly undersampled regime. Lastly, we select the threshold τ = 0.2875 as stated by Theorem 4 and employ Algorithm 1 for different values of the smoothing parameter γ. (A driver sketch with these parameter values follows the table.)
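
The simulation setup quoted under Research Type and Experiment Setup translates directly into a synthetic test harness. Below is a minimal sketch (Python with NumPy assumed; all names, the seed, and the oracle implementation are illustrative, not the authors' code) of the target SPAM f and the noisy point-query oracle:

```python
import numpy as np

d = 1000                      # ambient dimension
S = [2, 105, 424, 782]        # active coordinates (1-indexed, as in the paper)

# The four univariate components phi_l from the quoted setup.
phi = {
    2:   lambda t: np.sin(np.pi * t),
    105: lambda t: np.exp(-2.0 * t),
    424: lambda t: (1.0 / 3.0) * np.cos(np.pi * t) ** 3 + 0.8 * t ** 2,
    782: lambda t: 0.5 * t ** 4 - t ** 2 + 0.8 * t,
}

def f(x):
    """Sparse additive model f(x) = sum_{l in S} phi_l(x_l)."""
    return sum(phi[l](x[l - 1]) for l in S)   # l - 1: arrays are 0-indexed

sigma, N = 0.01, 100
rng = np.random.default_rng(0)                # fixed seed, an assumption

def noisy_query(x):
    """Point query corrupted with N(0, sigma^2 / N) noise -- equivalent to
    resampling and averaging N queries each corrupted with N(0, sigma^2)."""
    return f(x) + rng.normal(0.0, sigma / np.sqrt(N))
```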
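
Steps 1-4 of Algorithm 1 can be made concrete. The sampling sets (3.2), (3.3) are not quoted above, so the diagonal base points ξ_i = (i/m_x)(1, ..., 1) and Bernoulli ±1/√m_v directions below are assumptions of this sketch, and the equality-constrained basis-pursuit LP (solved with SciPy's linprog) is a generic stand-in for the ℓ1 minimization in step 4:

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(V, y):
    """Step 4: min ||z||_1 subject to V z = y, via the standard LP
    reformulation with variables [z; t] and |z| <= t elementwise."""
    mv, d = V.shape
    c = np.concatenate([np.zeros(d), np.ones(d)])     # minimize sum(t)
    eye = np.eye(d)
    A_ub = np.block([[eye, -eye], [-eye, -eye]])      # z - t <= 0, -z - t <= 0
    b_ub = np.zeros(2 * d)
    A_eq = np.hstack([V, np.zeros((mv, d))])          # V z = y
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=y,
                  bounds=[(None, None)] * d + [(0, None)] * d)
    return res.x[:d]

def recover_support(noisy_query, d, mx, mv, eps, tau, rng):
    """Steps 1-4 of Algorithm 1 under the sampling-set assumptions above."""
    V = rng.choice([-1.0, 1.0], size=(mv, d)) / np.sqrt(mv)  # assumed directions
    S_hat = set()
    for i in range(-mx, mx + 1):
        xi = np.full(d, i / mx)              # assumed diagonal design point
        f_xi = noisy_query(xi)
        # Step 3: finite-difference estimates of <grad f(xi), v_j>.
        y = np.array([(noisy_query(xi + eps * V[j]) - f_xi) / eps
                      for j in range(mv)])
        x_hat = basis_pursuit(V, y)
        S_hat |= {l + 1 for l in np.nonzero(np.abs(x_hat) > tau)[0]}  # 1-indexed
    return S_hat
```

Steps 5-6 (the program (P) and the piecewise integration of the estimated φ'_l) depend on Section 4 details not quoted here, so they are omitted from the sketch.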
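
Finally, a hypothetical driver tying the quoted parameter values together; it reuses f, noisy_query and recover_support from the sketches above and simply checks exact support recovery over 10 independent runs, mirroring the reported result:

```python
S_true = {2, 105, 424, 782}
d, mx, mv = 1000, 7, 56         # m_x = ceil(2/delta), m_v = ceil(2 k log d)
eps, tau = 0.0267, 0.2875       # step size and threshold quoted above

hits = 0
for run in range(10):
    rng = np.random.default_rng(run)   # per-run seed, an assumption
    hits += (recover_support(noisy_query, d, mx, mv, eps, tau, rng) == S_true)
print(f"exact recovery of S in {hits}/10 runs")   # paper reports 10/10
```

Each call issues (2m_x + 1)(m_v + 1) = 15 · 57 = 855 noisy point queries, matching the query count quoted above.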