Online Composite Optimization Between Stochastic and Adversarial Environments
Authors: Yibo Wang, Sijia Chen, Wei Jiang, Wenhao Yang, Yuanyu Wan, Lijun Zhang
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Additionally, we also conduct empirical studies in Appendix A to verify our theoretical results. |
| Researcher Affiliation | Academia | 1National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China 2School of Artificial Intelligence, Nanjing University, Nanjing, China 3School of Software Technology, Zhejiang University, Ningbo, China |
| Pseudocode | Yes | Algorithm 1 Optimistic Composite Mirror Descent (Opt-CMD) (a hedged sketch follows the table) |
| Open Source Code | No | While the code is not included in the submission, the complete detailed descriptions of the experiments are provided in Section A. |
| Open Datasets | Yes | To verify our theoretical findings, we conduct experiments on the mushrooms dataset from the LIBSVM repository [Chang and Lin, 2011] |
| Dataset Splits | No | The paper mentions sampling a data point at each round but does not provide specific training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments in the main text or Appendix A. |
| Software Dependencies | No | The paper mentions using 'LIBSVM' and several algorithmic frameworks (OGD, COMID, Optimistic-OMD, ONS, Prox ONS) but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | Let $T$ denote the number of total rounds. At each round $t \in [T]$, the learner receives a sampled data point $(x_t, y_t) \in \mathbb{R}^d \times \{-1, +1\}$ with $d = 112$. Then, the learner plays the decision $w_t$ from the ball $\mathcal{X}$ with diameter $D = 20$, and suffers a composite loss $\phi_t(w_t; x_t, y_t) = f_t(w_t; x_t, y_t) + \lambda r(w_t)$, where we set the hyper-parameter $\lambda = 0.001$. ... All parameters of each method are set according to their theoretical suggestions. For instance, in the general convex case, the learning rate is set as $\eta = c\,t^{-1/2}$ in OGD and $\eta = c\,T^{-1/2}$ in COMID, and $\eta_t = D(c + V_{t-1})^{-1/2}$ in Optimistic-OMD, where $c$ denotes the hyper-parameter selected from $\{10^{-3}, 10^{-2}, \ldots, 10^4\}$. (A hedged code sketch of this setup follows the table.) |
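The table quotes only the name of Algorithm 1, so the following is a minimal sketch of one standard form of optimistic composite mirror descent, not the paper's exact procedure. It assumes a Euclidean distance-generating function, an $\ell_1$ regularizer $r(w) = \|w\|_1$ handled by its proximal operator (soft-thresholding), a centered ball as the decision set, and the previous gradient as the optimistic hint $M_t$; the callback `grad_fn` is a hypothetical stand-in for the per-round gradient oracle.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def project_ball(w, radius):
    """Euclidean projection onto the centered ball of the given radius."""
    norm = np.linalg.norm(w)
    return w if norm <= radius else w * (radius / norm)

def opt_cmd(grad_fn, T, d, eta, lam, radius):
    """Sketch of an optimistic composite mirror descent loop.

    grad_fn(t, w) is assumed to return the gradient of f_t at w;
    the composite term lam * ||w||_1 enters only through its prox.
    """
    w_aux = np.zeros(d)  # auxiliary iterate \hat{w}_{t-1}
    hint = np.zeros(d)   # optimistic prediction M_t (here: last gradient)
    for t in range(1, T + 1):
        # Play w_t: a prox step from the auxiliary point against the hint.
        w_t = project_ball(soft_threshold(w_aux - eta * hint, eta * lam), radius)
        g_t = grad_fn(t, w_t)  # observe the gradient at the played decision
        # Update the auxiliary point with the observed gradient.
        w_aux = project_ball(soft_threshold(w_aux - eta * g_t, eta * lam), radius)
        hint = g_t  # reuse g_t as next round's optimistic hint
    return w_aux
```

Applying soft-thresholding before the ball projection keeps each round at O(d) cost; for a centered $\ell_2$ ball this composition coincides with the combined proximal step.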
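Similarly, here is a hedged sketch of the quoted experimental setup and learning-rate schedules. The quote fixes only $d = 112$, $D = 20$, $\lambda = 0.001$, the three schedules, and the grid for $c$; the logistic loss for $f_t$, the $\ell_1$ choice for $r$, and the reading of $V_{t-1}$ as accumulated gradient variation are assumptions, not statements from the paper.

```python
import numpy as np

# Quoted constants: feature dimension, ball diameter, regularization weight.
d, D, lam = 112, 20.0, 1e-3

def logistic_grad(w, x, y):
    """Gradient of f_t(w) = log(1 + exp(-y * <w, x>)) (assumed loss)."""
    z = -y * np.dot(w, x)
    return (-y * x) / (1.0 + np.exp(-z))

def eta_ogd(c, t):
    """OGD: time-varying eta = c * t^{-1/2}."""
    return c / np.sqrt(t)

def eta_comid(c, T):
    """COMID: fixed, horizon-dependent eta = c * T^{-1/2}."""
    return c / np.sqrt(T)

def eta_opt_omd(c, V_prev):
    """Optimistic-OMD: eta_t = D * (c + V_{t-1})^{-1/2}, with V_{t-1}
    taken here to be accumulated gradient variation (an assumption)."""
    return D / np.sqrt(c + V_prev)

# The hyper-parameter c is selected from {10^-3, 10^-2, ..., 10^4}.
c_grid = [10.0 ** k for k in range(-3, 5)]
```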