Safe Adaptive Importance Sampling

Authors: Sebastian U. Stich, Anant Raj, Martin Jaggi

NeurIPS 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section we evaluate the empirical performance of our proposed adaptive sampling scheme on relevant machine learning tasks. In particular, we illustrate performance on generalized linear models with L1 and L2 regularization, as in (5).
Researcher Affiliation | Academia | Sebastian U. Stich (EPFL, sebastian.stich@epfl.ch), Anant Raj (Max Planck Institute for Intelligent Systems, anant.raj@tuebingen.mpg.de), Martin Jaggi (EPFL, martin.jaggi@epfl.ch)
Pseudocode | Yes | The paper gives pseudocode for Algorithms 1-4, reproduced below (a code sketch of Algorithm 4 is given after the table).

Algorithm 1 Optimal sampling
  (compute full gradient) Compute ∇f(x_k)
  (define optimal sampling) Define (p_k, α_k) as in Example 2.3
  Sample i_k ∼ p_k
  x_{k+1} := x_k - (α_k / [p_k]_{i_k}) ∇_{i_k} f(x_k)

Algorithm 2 Proposed safe sampling
  (update lower and upper bounds) Update ℓ, u
  (compute safe sampling) Define (p̂_k, α̂_k) as in (7)
  Sample i_k ∼ p̂_k
  Compute ∇_{i_k} f(x_k)
  x_{k+1} := x_k - (α̂_k / [p̂_k]_{i_k}) ∇_{i_k} f(x_k)

Algorithm 3 Fixed sampling
  (define fixed sampling) Define (p_L, α) as in Example 2.2
  Sample i_k ∼ p_L
  Compute ∇_{i_k} f(x_k)
  x_{k+1} := x_k - (α / [p_L]_{i_k}) ∇_{i_k} f(x_k)

Algorithm 4 Computing the Safe Sampling for Gradient Information (ℓ, u)
  1: Input: 0_n ≤ ℓ ≤ u, L. Initialize: c = 0_n, u = 1, ℓ = n, D = ∅.
  2: ℓ_sort := sort_asc(L^{-1} ∘ ℓ), u_sort := sort_asc(L^{-1} ∘ u), m = max(ℓ_sort)
  3: while u ≤ ℓ do
  4:   if [ℓ_sort]_ℓ > m then (largest undecided lower bound is violated)
  5:     Set corresponding [c]_index := [L ∘ ℓ_sort]_ℓ; ℓ := ℓ - 1; D := D ∪ {index}
  6:   else if [u_sort]_u < m then (smallest undecided upper bound is violated)
  7:     Set corresponding [c]_index := [L ∘ u_sort]_u; u := u + 1; D := D ∪ {index}
  8:   else
  9:     break (no constraints are violated)
 10:   end if
 11:   m := ‖c‖_2^2 · ‖L ∘ c‖_1^{-1} (update m as in (9))
 12: end while
 13: Set [c]_i := L_i · m for all i ∉ D and Return c, p = (L ∘ c) · ‖L ∘ c‖_1^{-1}
Open Source Code | No | The paper does not provide an explicit statement about releasing source code for the methodology, nor a link to a code repository.
Open Datasets | Yes | The datasets used in the evaluation are rcv1, real-sim and news20. All data are available at www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/ (a loading sketch is given after the table).
Dataset Splits | No | The paper mentions subsets of data being used (e.g., 'By real-sim and rcv1 we denote a subset of the data chosen by randomly selecting 10,000 features and 10,000 datapoints.'), but does not specify explicit training, validation, or test dataset splits or a cross-validation setup.
Hardware Specification | No | The paper does not describe the hardware used; it only notes that the time units in Figures 6 and 9 are not directly comparable, as the experiments were conducted on different machines.
Software Dependencies | No | The paper does not provide specific software dependencies or versions (e.g., libraries, frameworks, or programming language versions) used for the experiments.
Experiment Setup | No | The paper mentions a regularization parameter 'λ = 0.1 is used for all experiments' and discusses stepsize strategies, but it does not provide comprehensive experimental setup details such as specific learning rates, batch sizes, optimizer settings, or model initialization for the models trained.
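
Below is a minimal NumPy sketch of the safe-sampling computation in Algorithm 4, as reconstructed in the Pseudocode row. The function name, the pointer bookkeeping, the update m = ‖c‖_2^2 / ‖L ∘ c‖_1 and the final normalization p = (L ∘ c) / ‖L ∘ c‖_1 are assumptions based on the partly garbled extraction, not verified code from the paper.

import numpy as np

def safe_sampling(l, u, L):
    """Sketch of Algorithm 4: given safe element-wise bounds l <= |gradient| <= u
    and per-coordinate constants L, return the worst-case vector c and a
    sampling distribution p derived from it (assumed form)."""
    n = len(L)
    c = np.zeros(n)
    decided = np.zeros(n, dtype=bool)
    low_order = np.argsort(l / L)          # ascending ratios l_i / L_i
    up_order = np.argsort(u / L)           # ascending ratios u_i / L_i
    lo_ptr, up_ptr = n - 1, 0              # largest lower / smallest upper ratio
    m = np.max(l / L)
    while up_ptr <= lo_ptr:
        i_lo, i_up = low_order[lo_ptr], up_order[up_ptr]
        if decided[i_lo]:                  # skip coordinates fixed earlier
            lo_ptr -= 1
            continue
        if decided[i_up]:
            up_ptr += 1
            continue
        if l[i_lo] / L[i_lo] > m:          # largest undecided lower bound violated
            c[i_lo], decided[i_lo] = l[i_lo], True
            lo_ptr -= 1
        elif u[i_up] / L[i_up] < m:        # smallest undecided upper bound violated
            c[i_up], decided[i_up] = u[i_up], True
            up_ptr += 1
        else:                              # no constraints violated
            break
        if np.dot(L, c) > 0:               # update m as in step 11 (guard against all-zero c)
            m = np.dot(c, c) / np.dot(L, c)
    c[~decided] = L[~decided] * m          # undecided coordinates sit at L_i * m
    p = L * c / np.dot(L, c)               # assumed normalization to a probability vector
    return c, p

# Usage with small synthetic bounds (values are illustrative only):
rng = np.random.default_rng(0)
L = rng.uniform(0.5, 2.0, size=6)
g = rng.uniform(0.1, 1.0, size=6)          # stand-in for the true |gradient| entries
c, p = safe_sampling(0.5 * g, 2.0 * g, L)  # loose safe bounds around g
print(np.round(p, 3), p.sum())             # p sums to 1

The sorting-plus-two-pointer structure mirrors the pseudocode's while loop; coordinates whose bound ratios violate the running threshold m are clamped to the corresponding bound, and the remaining coordinates are set proportionally to L.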
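
For the Open Datasets row, a minimal sketch of reading one of the listed LIBSVM-format datasets, assuming a copy has been downloaded locally from the URL above; the filename "rcv1_train.binary" and the scikit-learn dependency are assumptions, since the paper does not state how the data were loaded.

from sklearn.datasets import load_svmlight_file

# Load a locally downloaded LIBSVM-format file: sparse feature matrix plus labels.
X, y = load_svmlight_file("rcv1_train.binary")
print(X.shape, y.shape)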