Generalized Implicit Follow-The-Regularized-Leader
Authors: Keyi Chen, Francesco Orabona
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct linear prediction experiments on datasets from LibSVM (Chang & Lin, 2011). We show here experiments on classification tasks using the hinge loss and the logistic loss, and regression tasks with the absolute loss. We normalize the datasets and add a constant bias term to the features. Given that in the online learning setting we do not have training and validation data to tune β, we plot the averaged loss, $\frac{1}{t}\sum_{i=1}^{t}\ell_i(x_i)$, versus different choices of β, which at the same time shows the algorithms' sensitivity to the hyperparameter β and their best achievable performance. |
| Researcher Affiliation | Academia | Boston University, Boston, MA, USA. Correspondence to: Keyi Chen <keyichen@bu.edu>, Francesco Orabona <francesco@orabona.com>. |
| Pseudocode | Yes | Algorithm 1 Generalized Implicit FTRL |
| Open Source Code | No | The paper does not provide any statement or link indicating the availability of open-source code for the described methodology. |
| Open Datasets | Yes | We conduct linear prediction experiments on datasets from LibSVM (Chang & Lin, 2011). |
| Dataset Splits | No | The paper mentions tuning the hyperparameter β and plotting averaged loss, but it does not specify exact training, validation, or test dataset splits (e.g., percentages or sample counts). |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions "LibSVM" as a dataset source but does not provide specific version numbers for any software dependencies or libraries used in the implementation (e.g., Python, PyTorch, TensorFlow, etc.). |
| Experiment Setup | Yes | We consider β ∈ [10⁻³, 10³] for the hinge loss and logistic loss, and β ∈ [10⁻³, 10⁸] for the absolute loss. Each algorithm is run 15 times; we plot the average of the averaged losses and the 95% confidence interval. (See the protocol sketch below.) |
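
As a reading aid, here is a minimal sketch of the evaluation protocol described by the "Research Type" and "Experiment Setup" rows above: online linear prediction with the hinge loss, a sweep over the hyperparameter β, 15 runs per β, and the running averaged loss with a 95% confidence interval. The update rule below is a plain FTRL step on linearized losses with a quadratic regularizer scaled by β; it is a stand-in for illustration only, not the paper's Algorithm 1 (Generalized Implicit FTRL). The interpretation of β as the regularizer scale, the synthetic data, and the normal-approximation confidence interval are assumptions, not details from the paper.

```python
"""Hedged sketch of the experimental protocol: online linear prediction with
the hinge loss, sweeping beta, averaging 15 runs, and reporting the running
averaged loss (1/t) * sum_{i<=t} loss_i(x_i).

The update is a generic FTRL stand-in, NOT the paper's Generalized Implicit
FTRL; beta's role, the data, and the CI formula are illustrative assumptions."""
import numpy as np

def hinge_loss(w, x, y):
    # Hinge loss of the linear predictor w on example (x, y), with y in {-1, +1}.
    return max(0.0, 1.0 - y * np.dot(w, x))

def hinge_subgradient(w, x, y):
    # A subgradient of the hinge loss with respect to w.
    return -y * x if y * np.dot(w, x) < 1.0 else np.zeros_like(x)

def run_online(X, y, beta, seed):
    # One pass over the data in random order; the loss is charged before each
    # update, as in online learning, and the averaged loss is returned.
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    w = np.zeros(X.shape[1])
    grad_sum = np.zeros(X.shape[1])
    total_loss = 0.0
    for t, i in enumerate(order, start=1):
        x_t, y_t = X[i], y[i]
        total_loss += hinge_loss(w, x_t, y_t)
        grad_sum += hinge_subgradient(w, x_t, y_t)
        # FTRL stand-in: argmin_w <grad_sum, w> + (beta * sqrt(t) / 2) * ||w||^2.
        w = -grad_sum / (beta * np.sqrt(t))
    return total_loss / len(y)

def sweep(X, y, betas, n_runs=15):
    # For each beta, average the averaged loss over n_runs shuffles and report
    # a normal-approximation 95% confidence interval.
    results = []
    for beta in betas:
        losses = np.array([run_online(X, y, beta, seed) for seed in range(n_runs)])
        ci = 1.96 * losses.std(ddof=1) / np.sqrt(n_runs)
        results.append((beta, losses.mean(), ci))
    return results

if __name__ == "__main__":
    # Synthetic placeholder data; the paper uses normalized LibSVM datasets
    # with an appended constant bias feature.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((1000, 20))
    X = np.hstack([X / np.linalg.norm(X, axis=1, keepdims=True), np.ones((1000, 1))])
    y = np.sign(X @ rng.standard_normal(21))
    for beta, mean, ci in sweep(X, y, betas=np.logspace(-3, 3, 7)):
        print(f"beta={beta:g}: averaged loss {mean:.4f} ± {ci:.4f}")
```

The log-spaced β grid mirrors the reported ranges (e.g., [10⁻³, 10³] for the hinge and logistic losses); plotting the mean averaged loss with its confidence band against β reproduces the kind of sensitivity curve the paper describes.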