Efficient Projection-Free Online Methods with Stochastic Recursive Gradient
Authors: Jiahao Xie, Zebang Shen, Chao Zhang, Boyu Wang, Hui Qian
Pages: 6446-6453
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate the efficiency of the proposed methods compared to state-of-the-art methods. To validate the theoretical results in the previous section, we first conduct numerical experiments on an OCO problem, i.e., online multiclass logistic regression. |
| Researcher Affiliation | Collaboration | Jiahao Xie¹, Zebang Shen², Chao Zhang¹,³, Boyu Wang⁴, Hui Qian¹. ¹College of Computer Science and Technology, Zhejiang University; ²University of Pennsylvania; ³Tencent AI Lab; ⁴University of Western Ontario |
| Pseudocode | Yes | Algorithm 1: ORGFW |
| Open Source Code | Yes | Source code: https://github.com/xjiajiahao/ORGFW |
| Open Datasets | Yes | We use two well-known multiclass datasets: MNIST (http://yann.lecun.com/exdb/mnist/) and CIFAR10 (https://www.cs.toronto.edu/~kriz/cifar.html). |
| Dataset Splits | No | The paper uses MNIST and CIFAR10 datasets and mentions training, but does not provide specific details on how the data was split into training, validation, and test sets. It only provides total instance counts. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment. |
| Experiment Setup | Yes | For all compared methods, we choose hyperparameters via grid search and simply set the initial point to 0. Besides, we repeat the random methods for 6 trials and report the average result. For the MNIST dataset, we set \|B_t\| = 600 and r = 8. For CIFAR10, we set \|B_t\| = 500 and r = 32. We set the number of rounds to T = 100 and set the parameter K in MORGFW and Meta-Frank-Wolfe to K = T and K = T^(3/2), respectively, as suggested by the theory. For the online methods, diminishing step sizes are used and a mini-batch of 16 data points is revealed to them in each round. For both datasets, we set m = 10. In addition, we set the ℓ1 ball radii r_w = r_b = 10. |
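
The settings quoted in the Experiment Setup row can be collected into a small configuration object, which may help when attempting a reproduction. The sketch below is illustrative only and is not taken from the authors' repository: the class `ExperimentConfig`, its field names, and the helper `l1_ball_lmo` are hypothetical. The summary does not say what `r` and `m` denote, nor which experiment the ℓ1 ball radii belong to, so the values are stored as quoted without interpretation. The helper shows the standard linear minimization step over an ℓ1 ball that Frank-Wolfe-type methods use in place of a projection; whether the paper's code computes it this way is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    """Hypothetical container for the settings quoted in the table above."""
    dataset: str
    batch_size: int             # |B_t| as quoted
    r: int                      # "r" as quoted; its meaning is not given in this summary
    num_rounds: int = 100       # T
    online_minibatch: int = 16  # data points revealed per round to the online methods
    m: int = 10                 # "m" as quoted; its meaning is not given in this summary
    l1_radius_w: float = 10.0   # r_w
    l1_radius_b: float = 10.0   # r_b
    num_trials: int = 6         # repetitions of the randomized methods

    @property
    def k_morgfw(self) -> int:
        # K = T for MORGFW
        return self.num_rounds

    @property
    def k_meta_fw(self) -> int:
        # K = T^(3/2) for Meta-Frank-Wolfe
        return round(self.num_rounds ** 1.5)


def l1_ball_lmo(grad, radius):
    """Return argmin over ||v||_1 <= radius of <grad, v> for a flat list `grad`.

    The minimizer puts all mass on the coordinate with the largest absolute
    gradient entry, with sign opposite to that entry.
    """
    i = max(range(len(grad)), key=lambda j: abs(grad[j]))
    v = [0.0] * len(grad)
    if grad[i] != 0:
        v[i] = -radius if grad[i] > 0 else radius
    return v


MNIST = ExperimentConfig(dataset="MNIST", batch_size=600, r=8)
CIFAR10 = ExperimentConfig(dataset="CIFAR10", batch_size=500, r=32)
```

Hyperparameters chosen by grid search and the exact diminishing step-size schedule are not reported in the summary, so they are deliberately left out of the sketch.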