Gradient-based Sampling: An Adaptive Importance Sampling for Least-squares
Authors: Rong Zhu
NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Theoretically, we establish an error bound analysis of the general importance sampling with respect to the LS solution from the full data. The result establishes the improved performance of our gradient-based sampling. Synthetic and real data sets are used to empirically argue that the gradient-based sampling has an obvious advantage over existing sampling methods from two aspects: statistical efficiency and computational saving. |
| Researcher Affiliation | Academia | Rong Zhu Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China. rongzhu@amss.cas.cn |
| Pseudocode | Yes | Algorithm 1: Gradient-based sampling algorithm (a hedged sketch of this procedure follows the table). |
| Open Source Code | No | The paper does not provide a statement about the release of source code or a link to a code repository. |
| Open Datasets | Yes | Detailed numerical experiments are conducted to compare the excess risk of β based on L2 loss against the expected subsample size r for different synthetic datasets and real data examples. In this section, we report several representative studies. [...] on two UCI datasets: CASP (n = 45730, d = 9) and Online News Popularity (NEWS) (n = 39644, d = 59). |
| Dataset Splits | No | The paper calculates MSE based on subsample estimates for approximating the full sample LS solution, and considers sampling ratios. However, it does not specify explicit training, validation, and test splits in the traditional machine learning sense for model development. |
| Hardware Specification | Yes | We perform the computation with R software on a PC with a 3 GHz Intel i7 processor, 8 GB memory, and the OS X operating system. |
| Software Dependencies | No | The paper mentions 'R software' but does not specify its version number or any other software dependencies with their versions. |
| Experiment Setup | Yes | We calculate the full sample LS solution β̂n for each dataset, and repeatedly apply various sampling methods B = 1000 times to get subsample estimates βb for b = 1, …, B. We set d as 100, and n as among 20K, 50K, 100K, 200K, 500K. Two sampling ratio r/n values are considered: 0.01 and 0.05. For GRAD, we set r0 = r to get the pilot estimate β0 (a replication sketch follows the table). |
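
The paper's Algorithm 1 samples rows with probabilities proportional to the norm of each point's gradient of the squared loss at a pilot estimate β0, then solves a reweighted least-squares problem on the subsample. Below is a minimal NumPy sketch of that idea; the function name, the use of sampling with replacement, and the 1/(r·pi) reweighting are assumptions based on the standard subsampled-LS setup, not code taken from the paper.

```python
import numpy as np

def gradient_based_subsample_ls(X, y, r, beta0, rng=None):
    """Hedged sketch of gradient-based importance sampling for LS.

    Probabilities are taken proportional to the per-point gradient
    norms ||(x_i' beta0 - y_i) x_i|| at the pilot estimate beta0.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = X.shape[0]
    # The gradient norm of the i-th squared loss at beta0 factorizes
    # as |x_i' beta0 - y_i| * ||x_i||.
    p = np.abs(X @ beta0 - y) * np.linalg.norm(X, axis=1)
    p = p / p.sum()
    # Draw r indices with replacement; reweight by 1/(r * p_i) so the
    # weighted subsample objective is unbiased for the full-data one.
    idx = rng.choice(n, size=r, replace=True, p=p)
    w = 1.0 / (r * p[idx])
    Xw = X[idx] * np.sqrt(w)[:, None]
    yw = y[idx] * np.sqrt(w)
    beta, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
    return beta
```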
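
The experiment-setup row describes a replication study: compute the full-sample solution β̂n, draw B = 1000 subsample estimates βb, and report the empirical MSE (1/B) Σb ||βb − β̂n||². Below is a sketch of that loop, assuming a synthetic Gaussian design and a uniform pilot subsample of size r0 = r; the sizes match one reported configuration (r/n = 0.01), but the paper's exact data-generating model may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, r, B = 20_000, 100, 200, 1000        # r/n = 0.01

# Synthetic data (assumed Gaussian design; not the paper's exact model).
X = rng.standard_normal((n, d))
y = X @ rng.standard_normal(d) + rng.standard_normal(n)

# Full-sample LS solution that the subsample estimates approximate.
beta_hat_n, *_ = np.linalg.lstsq(X, y, rcond=None)

# Pilot estimate beta0 from a uniform subsample of size r0 = r.
pilot = rng.choice(n, size=r, replace=True)
beta0, *_ = np.linalg.lstsq(X[pilot], y[pilot], rcond=None)

# B replications of the sampler sketched above, accumulating the MSE.
mse = 0.0
for _ in range(B):
    beta_b = gradient_based_subsample_ls(X, y, r, beta0, rng)
    mse += np.sum((beta_b - beta_hat_n) ** 2) / B
print(f"empirical MSE over {B} replications: {mse:.4g}")
```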