Efficient Sampling for Learning Sparse Additive Models in High Dimensions
Authors: Hemant Tyagi, Bernd Gärtner, Andreas Krause
NeurIPS 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Simulation results. We now provide simulation results on synthetic data to support our theoretical findings. We consider the noisy setting, with the point queries corrupted by Gaussian noise. For d = 1000, k = 4 and S = {2, 105, 424, 782}, consider f : R^d → R where f = φ_2(x_2) + φ_105(x_105) + φ_424(x_424) + φ_782(x_782) with: φ_2(x) = sin(πx), φ_105(x) = exp(−2x), φ_424(x) = (1/3)cos^3(πx) + 0.8x^2, φ_782(x) = 0.5x^4 − x^2 + 0.8x. We choose δ = 0.3, D = 0.2, which can be verified to be valid parameters for the above φ_l's. Furthermore, we choose m_x = ⌈2/δ⌉ = 7 and m_v = ⌈2k log d⌉ = 56 to satisfy the conditions of Theorem 4. Next, we choose constants C = 0.2, B_2 = 35 and κ = 0.95 D^2/(16 C^2 k B_2) = 4.24 × 10^−4 as required by Theorem 4. For the choice of step size ε = 0.0267 (determined by C, k and B_2 as prescribed in Theorem 4), we then query f at (2m_x + 1)(m_v + 1) = 855 points. The function values are corrupted with Gaussian noise N(0, σ^2/N) for σ = 0.01 and N = 100, which is equivalent to resampling and averaging the point queries N times. Importantly, the sufficient condition on N stated in Theorem 4 (a bound depending on σ^2, κ and p) evaluates to 6974 for p = 0.1, so we consider a significantly undersampled regime. Lastly, we select the threshold τ = 0.2875 (determined by m_v and κ, as stated in Theorem 4) and employ Algorithm 1 for different values of the smoothing parameter γ. Over 10 independent runs of the algorithm, we observed that S was recovered exactly each time. |
| Researcher Affiliation | Academia | Hemant Tyagi, ETH Zürich, htyagi@inf.ethz.ch; Andreas Krause, ETH Zürich, krausea@ethz.ch; Bernd Gärtner, ETH Zürich, gaertner@inf.ethz.ch |
| Pseudocode | Yes | Algorithm 1 (Algorithm for learning φ_l in the SPAM f(x) = Σ_{l∈S} φ_l(x_l)): 1: Choose m_x, m_v and construct sampling sets X and V as in (3.2), (3.3). 2: Choose step size ε > 0. Query f at ξ_i and ξ_i + εv_j for i = −m_x, …, m_x and j = 1, …, m_v. 3: Construct y_i where y_{i,j} = (f(ξ_i + εv_j) − f(ξ_i))/ε for i = −m_x, …, m_x and j = 1, …, m_v. 4: Set x̂_i := argmin_{y_i = Vz} ‖z‖_1. For τ > 0 compute Ŝ = ∪_{i=−m_x}^{m_x} {l ∈ {1, …, d} : \|x̂_{i,l}\| > τ}. 5: For each l ∈ Ŝ, run (P) as defined in Section 4 using (x̂_{i,l})_{i=−m_x}^{m_x}, τ and some smoothing parameter γ ≥ 0, to obtain φ̂′_l. 6: For each l ∈ Ŝ, set φ_est,l to be the piece-wise integral of φ̂′_l as explained in Section 4. (A hedged Python sketch of steps 1-4 appears below the table.) |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code, nor does it include a link to a code repository. |
| Open Datasets | No | The paper uses synthetic data generated according to specified functions and noise models, but does not provide access information for a publicly available or open dataset. The data is generated programmatically within the described setup. |
| Dataset Splits | No | The paper uses synthetic data and focuses on recovery accuracy, but does not explicitly mention training, validation, or test dataset splits in the context of machine learning model training. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names). |
| Experiment Setup | Yes | For d = 1000, k = 4 and S = {2, 105, 424, 782}, consider f : R^d → R where f = φ_2(x_2) + φ_105(x_105) + φ_424(x_424) + φ_782(x_782) with: φ_2(x) = sin(πx), φ_105(x) = exp(−2x), φ_424(x) = (1/3)cos^3(πx) + 0.8x^2, φ_782(x) = 0.5x^4 − x^2 + 0.8x. We choose δ = 0.3, D = 0.2, which can be verified to be valid parameters for the above φ_l's. Furthermore, we choose m_x = ⌈2/δ⌉ = 7 and m_v = ⌈2k log d⌉ = 56 to satisfy the conditions of Theorem 4. Next, we choose constants C = 0.2, B_2 = 35 and κ = 0.95 D^2/(16 C^2 k B_2) = 4.24 × 10^−4 as required by Theorem 4. For the choice of step size ε = 0.0267 (determined by C, k and B_2 as prescribed in Theorem 4), we then query f at (2m_x + 1)(m_v + 1) = 855 points. The function values are corrupted with Gaussian noise N(0, σ^2/N) for σ = 0.01 and N = 100, which is equivalent to resampling and averaging the point queries N times. Importantly, the sufficient condition on N stated in Theorem 4 (a bound depending on σ^2, κ and p) evaluates to 6974 for p = 0.1, so we consider a significantly undersampled regime. Lastly, we select the threshold τ = 0.2875 (determined by m_v and κ, as stated in Theorem 4) and employ Algorithm 1 for different values of the smoothing parameter γ. (A worked check of these parameter values and a hedged simulation sketch appear below the table.) |
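
The parameter arithmetic quoted in the table can be checked directly. The κ expression below is reconstructed from the reported constants and the value 4.24 × 10^−4; it is an inferred reading of Theorem 4's condition, not a verbatim formula from the paper.

```latex
% Sanity check of the reported parameters. The \kappa formula is inferred
% from the quoted constants and its reported value; Theorem 4's exact
% statement may be phrased differently.
\begin{align*}
  m_x &= \lceil 2/\delta \rceil = \lceil 2/0.3 \rceil = 7,
  \qquad
  m_v = \lceil 2k \log d \rceil = \lceil 8 \ln 1000 \rceil = 56,\\
  \kappa &= \frac{0.95\, D^2}{16\, C^2 k B_2}
          = \frac{0.95 \times 0.2^2}{16 \times 0.2^2 \times 4 \times 35}
          \approx 4.24 \times 10^{-4},\\
  \text{queries} &= (2m_x + 1)(m_v + 1) = 15 \times 57 = 855.
\end{align*}
```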
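
The support-recovery phase of Algorithm 1 (steps 1-4) is straightforward to prototype. The sketch below is a minimal reconstruction, not the authors' code: it assumes equi-spaced sampling points on the diagonal of [−1, 1]^d for X and a random ±1/√m_v matrix for V (stand-ins for the paper's constructions (3.2) and (3.3)), and solves the ℓ_1 program as a generic linear program.

```python
# Minimal sketch of Algorithm 1, steps 1-4 (support recovery). Hypothetical
# reconstruction: the sampling sets stand in for the paper's (3.2)/(3.3).
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(V, y):
    """Solve min ||z||_1 s.t. Vz = y via the standard LP split z = zp - zn."""
    mv, d = V.shape
    c = np.ones(2 * d)                        # objective: sum(zp) + sum(zn)
    A_eq = np.hstack([V, -V])                 # V zp - V zn = y
    res = linprog(c, A_eq=A_eq, b_eq=y,
                  bounds=[(0, None)] * (2 * d), method="highs")
    return res.x[:d] - res.x[d:]

def recover_support(f, d, mx, mv, eps, tau, rng):
    # Step 1: sampling sets. X: equi-spaced points on the diagonal of
    # [-1, 1]^d (assumed form of (3.2)); V: mv random directions with
    # entries +-1/sqrt(mv) (assumed form of (3.3)).
    X = [(i / mx) * np.ones(d) for i in range(-mx, mx + 1)]
    V = rng.choice([-1.0, 1.0], size=(mv, d)) / np.sqrt(mv)
    S_hat = set()
    for xi in X:
        # Steps 2-3: finite differences approximate <v_j, grad f(xi)>.
        y = np.array([(f(xi + eps * v) - f(xi)) / eps for v in V])
        # Step 4: l1 minimization recovers the k-sparse gradient; threshold it.
        x_hat = basis_pursuit(V, y)
        S_hat |= {l for l in range(d) if abs(x_hat[l]) > tau}
    return S_hat
```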
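
Using the sketch above, the quoted synthetic experiment can be replayed. The four component functions and all numeric parameters are taken verbatim from the setup row; since the exact Theorem 4 formulas for ε and τ are not reproduced here, their reported numeric values are plugged in directly. Indices in S are 1-based, as in the paper.

```python
# Hedged replay of the synthetic experiment, reusing recover_support above.
import numpy as np

d, k = 1000, 4
S = {2, 105, 424, 782}                      # 1-based active coordinates
phis = {
    2:   lambda x: np.sin(np.pi * x),
    105: lambda x: np.exp(-2.0 * x),
    424: lambda x: np.cos(np.pi * x) ** 3 / 3.0 + 0.8 * x ** 2,
    782: lambda x: 0.5 * x ** 4 - x ** 2 + 0.8 * x,
}
sigma, N = 0.01, 100
rng = np.random.default_rng(0)

def f_noisy(x):
    # f(x) = sum_{l in S} phi_l(x_l), queried under N(0, sigma^2/N) noise
    # (equivalent to averaging N repeated noisy queries).
    clean = sum(phi(x[l - 1]) for l, phi in phis.items())
    return clean + rng.normal(0.0, sigma / np.sqrt(N))

# Reported parameters: mx = ceil(2/delta) = 7, mv = ceil(2k log d) = 56;
# eps = 0.0267 and tau = 0.2875 as prescribed by Theorem 4.
S_hat = recover_support(f_noisy, d=d, mx=7, mv=56, eps=0.0267, tau=0.2875,
                        rng=rng)
print(sorted(S_hat) == sorted(l - 1 for l in S))   # True on exact recovery
```

The paper reports exact recovery of S in each of 10 independent runs; this sketch is meant only to make the sampling and decoding pipeline concrete, not to reproduce those numbers authoritatively.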