Coordinate Descent with Bandit Sampling

Authors: Farnood Salehi, Patrick Thiran, Elisa Celis

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We test this approach on several standard datasets, using different cost functions (including Lasso, logistic and ridge regression) and for both the adaptive setting (the first approach) and the bandit setting (the second approach). We observe that the bandit coordinate selection approach accelerates the convergence of a variety of CD methods (e.g., Stingy CD [11] for Lasso in Figure 2, dual CD [18] for L1-regularized logistic-regression in Figure 3, and dual CD [13] for ridge-regression in Figure 3)."
Researcher Affiliation | Academia | "School of Computer and Communication Sciences, École Polytechnique Fédérale de Lausanne (EPFL)"
Pseudocode | Yes | "Algorithm 1 B_max_r"
Open Source Code | No | The paper provides no statement or link to open-source code for its method; 'libsvm' is mentioned only as the source of the datasets, not as the authors' implementation.
Open Datasets | Yes | "The datasets we use are found in [5]; we consider usps, aloi and protein for regression, and w8a and a9a for binary classification (see Table 2 in the supplementary materials for statistics about these datasets)."
Dataset Splits | No | The paper does not explicitly describe train/validation/test splits; it only notes that "test and train errors are comparable", without giving percentages or sample counts.
Hardware Specification | No | The paper gives no hardware details (e.g., CPU/GPU models, memory) for the machines used to run its experiments.
Software Dependencies | No | The paper mentions 'libsvm' but provides no version numbers for it or for any other software dependency needed for replication.
Experiment Setup | Yes | "In all experiments, λs are chosen such that the test and train errors are comparable, and all update rules belong to H. In addition, in all experiments, E = d/2 in B_max_r and gap_per_epoch." (A minimal illustrative sketch of this epoch-based setup follows the table.)
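
To give a concrete picture of the selection scheme assessed above, here is a minimal, hedged sketch of epoch-based importance sampling for coordinate descent. It is not the authors' Algorithm 1 (B_max_r): it only mimics the epoch structure quoted in the Experiment Setup row (per-coordinate scores refreshed every E = d/2 coordinate updates), and it uses squared partial derivatives of a ridge objective as a stand-in for the paper's marginal-decrease rewards. The function cd_bandit_ridge and all variable names are invented for this illustration.

import numpy as np

def cd_bandit_ridge(A, y, lam=1.0, n_epochs=20, rng=None):
    """Coordinate descent for min_x 0.5*||Ax - y||^2 + 0.5*lam*||x||^2,
    sampling coordinates in proportion to stale squared partial derivatives."""
    rng = np.random.default_rng(rng)
    n, d = A.shape
    x = np.zeros(d)
    residual = A @ x - y                 # maintained incrementally below
    col_sq = (A ** 2).sum(axis=0)        # ||A_j||^2, for the exact coordinate step
    E = max(1, d // 2)                   # refresh period, matching E = d/2 in the paper
    for _ in range(n_epochs):
        # One full pass per epoch to refresh the per-coordinate scores.
        grad = A.T @ residual + lam * x
        scores = grad ** 2
        total = scores.sum()
        probs = scores / total if total > 0 else np.full(d, 1.0 / d)
        for _ in range(E):
            j = rng.choice(d, p=probs)         # sample a coordinate from stale scores
            g_j = A[:, j] @ residual + lam * x[j]
            step = g_j / (col_sq[j] + lam)     # exact minimizer along coordinate j
            x[j] -= step
            residual -= step * A[:, j]         # O(n) residual update
    return x

if __name__ == "__main__":
    gen = np.random.default_rng(0)
    A = gen.standard_normal((200, 50))
    y = A @ gen.standard_normal(50)
    print(cd_bandit_ridge(A, y, lam=0.5, rng=1)[:5])

The paper's actual rules refine this idea: B_max_r maintains bandit estimates of the per-coordinate reward rather than recomputing scores exactly, and, as its name suggests, gap_per_epoch bases the sampling distribution on per-coordinate duality gaps recomputed once per epoch. The sketch above preserves only the shared property of acting on stale per-coordinate scores between periodic refreshes.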