Efficient Second-Order Online Kernel Learning with Adaptive Embedding

Authors: Daniele Calandriello, Alessandro Lazaric, Michal Valko

NeurIPS 2017

Each entry below gives a reproducibility variable, the assessed result, and the LLM response supporting that assessment.
Research Type: Experimental. "We empirically validate PROS-N-KONS on several regression and binary classification problems, showing that it is competitive with state-of-the-art methods."
Researcher Affiliation: Academia. "SequeL team, INRIA Lille - Nord Europe, France. {daniele.calandriello, alessandro.lazaric, michal.valko}@inria.fr"
Pseudocode: Yes. Figure 1 of the paper gives the PROS-N-KONS pseudocode:

Input: feasible parameter C, step-sizes η_t, regularizer α
 1: Initialize j = 0, ω̃_0 = 0, g̃_0 = 0, P̃_0 = 0, Ã_0 = αI
 2: Start a KORS instance with an empty dictionary I_0
 3: for t = 1, ..., T do
 4:   Receive x_t and feed it to KORS; receive z_t (point added to the dictionary or not)
 5:   if z_t = 1 then {dictionary changed, reset}
 6:     j = j + 1
 7:     Build K_j from I_j and decompose it as K_j = U_j Σ_j Σ_j^T U_j^T
 8:     Set Ã_{t-1} = αI ∈ R^{j×j}
 9:     ω̃_t = 0 ∈ R^j
10:   else {execute a gradient-descent step}
11:     Compute the map φ_t and the approximate map φ̃_t = Σ_j^{-1} U_j^T Φ_j^T φ_t ∈ R^j
12:     Compute υ̃_t = ω̃_{t-1} - Ã_{t-1}^{-1} g̃_{t-1}
13:     Compute ω̃_t = υ̃_t - [h(φ̃_t^T υ̃_t) / (φ̃_t^T Ã_{t-1}^{-1} φ̃_t)] Ã_{t-1}^{-1} φ̃_t, where h(z) = sign(z) max{|z| - C, 0}
14:   end if
15:   Predict ỹ_t = φ̃_t^T ω̃_t
16:   Observe g̃_t = ∇_{ω̃_t} ℓ_t(φ̃_t^T ω̃_t) = ℓ'_t(ỹ_t) φ̃_t
17:   Update Ã_t = Ã_{t-1} + (σ_t / 2) g̃_t g̃_t^T
18: end for
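For concreteness, the pseudocode above maps onto a short implementation. The following is a minimal NumPy sketch of one PROS-N-KONS pass under assumed simplifications: a squared loss, an explicit Sherman-Morrison update of A^{-1}, and a stubbed-out KORS sampler (the caller supplies the dict_changed flag that the paper obtains from KORS's leverage-score test). The names ProsNKons, sigma_t, and dict_changed are illustrative, not the authors' code (none was released).

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=8.0):
    """k(x, y) = exp(-||x - y||^2 / (2 sigma^2)); sigma = 8 as in the paper."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma**2))

class ProsNKons:
    """One-pass sketch of PROS-N-KONS for the squared loss (KORS stubbed)."""

    def __init__(self, C=1.0, alpha=1.0, sigma_t=1.0):
        self.C, self.alpha, self.sigma_t = C, alpha, sigma_t
        self.D = []  # dictionary I_j as a list of points

    def _reset(self, x_t):
        # Lines 6-9: rebuild the embedding from the new dictionary and restart
        self.D.append(x_t)
        K = gaussian_kernel(np.array(self.D), np.array(self.D))
        lam, U = np.linalg.eigh(K)                       # K_j = U_j Sigma_j Sigma_j^T U_j^T
        self.proj = U / np.sqrt(np.maximum(lam, 1e-12))  # columns of U scaled by Sigma^{-1}
        j = len(self.D)
        self.A_inv = np.eye(j) / self.alpha              # (alpha I)^{-1}
        self.w = np.zeros(j)
        self.g = np.zeros(j)

    def step(self, x_t, y_t, dict_changed):
        if dict_changed:                                  # line 5
            self._reset(x_t)
        # Line 11: approximate map phi~_t = Sigma_j^{-1} U_j^T k_j(x_t)
        k = gaussian_kernel(np.array(self.D), x_t[None, :])[:, 0]
        phi = self.proj.T @ k
        # Lines 12-13: Newton step, then generalized projection onto |phi^T w| <= C
        v = self.w - self.A_inv @ self.g
        z = float(phi @ v)
        h = np.sign(z) * max(abs(z) - self.C, 0.0)
        Ainv_phi = self.A_inv @ phi
        self.w = v - (h / (float(phi @ Ainv_phi) + 1e-12)) * Ainv_phi
        y_hat = float(phi @ self.w)                       # line 15: predict
        self.g = (y_hat - y_t) * phi                      # line 16: l'(y~_t) phi~_t, squared loss
        # Line 17 via Sherman-Morrison: A_t = A_{t-1} + (sigma_t / 2) g g^T
        u = self.A_inv @ self.g
        c = self.sigma_t / 2.0
        self.A_inv -= c * np.outer(u, u) / (1.0 + c * float(self.g @ u))
        return y_hat

# Illustrative usage on a toy stream: the first point seeds the dictionary;
# in the real algorithm KORS decides adaptively when the dictionary grows.
learner = ProsNKons(C=1.0, alpha=1.0)
rng = np.random.default_rng(0)
for i in range(100):
    x = rng.random(3)
    learner.step(x, x.sum(), dict_changed=(i == 0))
```

After a reset the weights and gradient are zero, so the Newton and projection steps are no-ops and the prediction is 0, matching the dictionary-changed branch of the pseudocode.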
Open Source Code: No. The paper does not provide any statement or link indicating the release of source code for the described methodology.
Open Datasets: Yes. "We replicate the experimental setting in [13] with 9 datasets for regression and 3 datasets for binary classification. We use the same preprocessing as Lu et al. [13]: each feature of the points x_t is rescaled to fit in [0, 1], for regression the target variable y_t is rescaled in [0, 1], while in binary classification the labels are {-1, 1}."
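As a rough illustration of that preprocessing, here is a small NumPy sketch. The batch min-max scaling and the function name preprocess are assumptions; neither this paper nor [13] states whether the scaling statistics are computed offline or on the fly.

```python
import numpy as np

def preprocess(X, y, task="regression"):
    # Rescale each feature of the points x_t to fit in [0, 1]
    # (assumption: min-max statistics computed over the whole dataset)
    lo, hi = X.min(axis=0), X.max(axis=0)
    X = (X - lo) / np.where(hi > lo, hi - lo, 1.0)
    if task == "regression":
        # Rescale the target variable y_t to [0, 1]
        y = (y - y.min()) / max(y.max() - y.min(), 1e-12)
    else:
        # Map the two class labels to {-1, 1} (assumes exactly two label values)
        y = np.where(y == np.unique(y)[0], -1.0, 1.0)
    return X, y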
Dataset Splits: No. The paper does not provide specific details on how datasets were split into training, validation, and test sets, or whether cross-validation was used.
Hardware Specification: Yes. "All experiments are run on a single machine with 2 Xeon E5-2630 CPUs for a total of 10 cores, and are averaged over 15 runs."
Software Dependencies: No. The paper does not name the software dependencies (e.g., libraries with version numbers) used for the experiments.
Experiment Setup: Yes. "For all datasets, we set β = 1 and ε = 0.5 for all PROS-N-KONS variants and J_max = 100 for B-KONS. For each algorithm and dataset, we report average and standard deviation of the losses. We also do not tune the Gaussian kernel bandwidth, but take the value σ = 8 used by [13]."
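For reference, the quoted settings collapse into a small configuration. The dictionary layout and names below are illustrative, and the kernel uses the common exp(-||x - y||^2 / (2σ^2)) parameterization, which may differ from the exact convention of [13].

```python
import numpy as np

# Settings quoted in the paper's experiment setup (names are illustrative)
CONFIG = {
    "beta": 1.0,     # KORS parameter beta, all PROS-N-KONS variants
    "epsilon": 0.5,  # KORS accuracy parameter epsilon, all PROS-N-KONS variants
    "j_max": 100,    # budget J_max, B-KONS only
    "sigma": 8.0,    # untuned Gaussian kernel bandwidth, taken from [13]
    "n_runs": 15,    # losses reported as mean and standard deviation over 15 runs
}

def gaussian_kernel(x, y, sigma=CONFIG["sigma"]):
    # One common Gaussian-kernel convention; [13]'s parameterization may differ
    return np.exp(-np.linalg.norm(x - y) ** 2 / (2.0 * sigma**2))
```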