Efficient Second-Order Online Kernel Learning with Adaptive Embedding
Authors: Daniele Calandriello, Alessandro Lazaric, Michal Valko
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically validate PROS-N-KONS on several regression and binary classification problems, showing that it is competitive with state-of-the-art methods. |
| Researcher Affiliation | Academia | SequeL team, INRIA Lille – Nord Europe, France. {daniele.calandriello, alessandro.lazaric, michal.valko}@inria.fr |
| Pseudocode | Yes | Input: feasible parameter C, step-sizes η_t, regularizer α. 1: Initialize j = 0, ω̃_0 = 0, g̃_0 = 0, P̃_0 = 0, Ã_0 = αI. 2: Start a KORS instance with an empty dictionary I_0. 3: for t = 1, …, T do 4: Receive x_t and feed it to KORS; receive z_t (point added to dictionary or not). 5: if z_t = 1 then {dictionary changed, reset} 6: j = j + 1. 7: Build K_j from I_j and decompose it as U_j Σ_j Σ_j^T U_j^T. 8: Set Ã_{t−1} = αI ∈ R^{j×j}. 9: ω̃_t = 0 ∈ R^j. 10: else {execute a gradient-descent step} 11: Compute the map φ_t and the approximate map φ̃_t = Σ_j^{−1} U_j^T Φ_j^T φ_t ∈ R^j. 12: Compute ṽ_t = ω̃_{t−1} − Ã_{t−1}^{−1} g̃_{t−1}. 13: Compute ω̃_t = ṽ_t − (h(φ̃_t^T ṽ_t) / (φ̃_t^T Ã_{t−1}^{−1} φ̃_t)) Ã_{t−1}^{−1} φ̃_t, where h(z) = sign(z) max{|z| − C, 0}. 14: end if 15: Predict ỹ_t = φ̃_t^T ω̃_t. 16: Observe g̃_t = ∇ℓ_t(φ̃_t^T ω̃_t) = ℓ′_t(ỹ_t) φ̃_t. 17: Update Ã_t = Ã_{t−1} + (σ_t/2) g̃_t g̃_t^T. 18: end for. Figure 1: PROS-N-KONS (a runnable sketch of this loop appears after the table). |
| Open Source Code | No | The paper does not provide any statement or link indicating the release of source code for the described methodology. |
| Open Datasets | Yes | We replicate the experimental setting in [13] with 9 datasets for regression and 3 datasets for binary classification. We use the same preprocessing as Lu et al. [13]: each feature of the points x_t is rescaled to fit in [0, 1]; for regression the target variable y_t is rescaled to [0, 1], while in binary classification the labels are {−1, 1}. (A preprocessing sketch follows the code sketch after the table.) |
| Dataset Splits | No | The paper does not provide specific details on how datasets were split into training, validation, and test sets, or whether cross-validation was used. |
| Hardware Specification | Yes | All experiments are run on a single machine with 2 Xeon E5-2630 CPUs for a total of 10 cores, and are averaged over 15 runs. |
| Software Dependencies | No | The paper does not provide specific details on software dependencies, such as library names with version numbers, used for the experiments. |
| Experiment Setup | Yes | For all datasets, we set β = 1 and ε = 0.5 for all PROS-N-KONS variants and Jmax = 100 for B-KONS. For each algorithm and dataset, we report average and standard deviation of the losses. We also do not tune the Gaussian kernel bandwidth, but take the value σ = 8 used by [13]. |
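
The pseudocode quoted in the table is enough to reconstruct the per-round update. Below is a minimal Python sketch of the descent, projection, prediction, and matrix-update steps (lines 10–17 of Figure 1), assuming the approximate embedding φ̃_t has already been produced by KORS. The class name `ProsNKonsStep`, the callable `grad_loss`, and the inverse-form Sherman–Morrison update are illustrative choices, not the authors' implementation.

```python
import numpy as np

def h(z, C):
    """Soft threshold from line 13: sign(z) * max(|z| - C, 0)."""
    return np.sign(z) * max(abs(z) - C, 0.0)

class ProsNKonsStep:
    """One PROS-N-KONS learner between dictionary resets (lines 10-17 of Figure 1)."""

    def __init__(self, dim, alpha, C):
        self.C = C
        self.A_inv = np.eye(dim) / alpha  # inverse of A_tilde, which is alpha*I at reset
        self.w = np.zeros(dim)            # omega_tilde
        self.g = np.zeros(dim)            # previous gradient g_tilde

    def step(self, phi, grad_loss, sigma_t):
        # Line 12: descent step v_t = w_{t-1} - A_{t-1}^{-1} g_{t-1}
        v = self.w - self.A_inv @ self.g
        # Line 13: generalized projection enforcing |phi^T w| <= C
        hz = h(phi @ v, self.C)
        if hz != 0.0:
            Ainv_phi = self.A_inv @ phi
            v = v - (hz / (phi @ Ainv_phi)) * Ainv_phi
        self.w = v
        # Line 15: predict y_tilde_t = phi^T w
        y_hat = phi @ self.w
        # Line 16: observe the gradient g_t = l'_t(y_hat) * phi
        self.g = grad_loss(y_hat) * phi
        # Line 17: A_t = A_{t-1} + (sigma_t/2) g g^T, maintained in inverse
        # form via the Sherman-Morrison rank-one update
        c = sigma_t / 2.0
        Ainv_g = self.A_inv @ self.g
        self.A_inv -= c * np.outer(Ainv_g, Ainv_g) / (1.0 + c * (self.g @ Ainv_g))
        return y_hat
```

For squared loss ℓ_t(y) = (y − y_t)², `grad_loss` would be `lambda y: 2 * (y - y_t)`; on a dictionary reset (lines 5–9) one would simply re-instantiate the class with the new embedding dimension j.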
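
The preprocessing quoted in the Open Datasets row is likewise straightforward to reproduce. The sketch below assumes plain per-feature min-max rescaling (the paper states only the target ranges, not the rescaling rule); `rescale_01` and the synthetic data are hypothetical stand-ins.

```python
import numpy as np

def rescale_01(M):
    """Rescale each column of M to [0, 1]; constant columns map to 0."""
    lo, hi = M.min(axis=0), M.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    return (M - lo) / span

rng = np.random.default_rng(0)
X_raw = rng.normal(size=(100, 5))                        # stand-in for a dataset's features
X = rescale_01(X_raw)                                    # each feature in [0, 1]

y_reg = rescale_01(rng.normal(size=(100, 1))).ravel()    # regression target in [0, 1]
y_cls = np.where(rng.normal(size=100) > 0, 1.0, -1.0)    # classification labels in {-1, 1}
```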