Online Composite Optimization Between Stochastic and Adversarial Environments

Authors: Yibo Wang, Sijia Chen, Wei Jiang, Wenhao Yang, Yuanyu Wan, Lijun Zhang

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Additionally, we also conduct empirical studies in Appendix A to verify our theoretical results."
Researcher Affiliation | Academia | (1) National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China; (2) School of Artificial Intelligence, Nanjing University, Nanjing, China; (3) School of Software Technology, Zhejiang University, Ningbo, China
Pseudocode | Yes | "Algorithm 1: Optimistic Composite Mirror Descent (OptCMD)" (a sketch of one such update appears after this table)
Open Source Code | No | While the code is not included in the submission, complete and detailed descriptions of the experiments are provided in Appendix A.
Open Datasets | Yes | "To verify our theoretical findings, we conduct experiments on the mushroom dataset from the LIBSVM repository [Chang and Lin, 2011]."
Dataset Splits | No | The paper mentions sampling data at each round but does not provide specific training, validation, or test splits.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments in the main text or Appendix A.
Software Dependencies | No | The paper mentions using LIBSVM and several algorithmic frameworks (OGD, COMID, Optimistic-OMD, ONS, Prox-ONS) but does not provide version numbers for any software dependencies.
Experiment Setup | Yes | "Let $T$ denote the number of total rounds. At each round $t \in [T]$, the learner receives a sampled data point $(x_t, y_t) \in \mathbb{R}^d \times \{-1, 1\}$ with $d = 112$. Then, the learner plays the decision $w_t$ from the ball $\mathcal{X}$ with diameter $D = 20$, and suffers a composite loss $\varphi_t(w_t; x_t, y_t) = f_t(w_t; x_t, y_t) + \lambda r(w_t)$, where we set the hyper-parameter $\lambda = 0.001$. ... All parameters of each method are set according to their theoretical suggestions. For instance, in the general convex case, the learning rate is set as $\eta_t = c\,t^{-1/2}$ in OGD and $\eta = c\,T^{-1/2}$ in COMID, and as $\eta_t = D(c + V_{t-1})^{-1/2}$ in Optimistic-OMD, where $c$ denotes the hyper-parameter selected from $\{10^{-3}, 10^{-2}, \dots, 10^{4}\}$."
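
To make the setup above concrete, here is a minimal sketch of one round of an optimistic composite mirror descent update in the Euclidean setting, using the experiment's constants ($d = 112$, $D = 20$, $\lambda = 0.001$). The choices of $r(w) = \lambda \|w\|_1$ (giving a soft-thresholding prox), a logistic loss for $f_t$, and the previous gradient as the optimistic hint are illustrative assumptions, not details confirmed by the paper, and all helper names are hypothetical; Algorithm 1 (OptCMD) may differ in its mirror map and hint.

```python
import numpy as np

# Hypothetical sketch of one round of optimistic composite mirror descent
# with a Euclidean mirror map (i.e., an optimistic proximal-gradient step).
# Assumptions not taken from the paper: r(w) = lam * ||w||_1, the hint is
# the previously observed gradient, and the feasible set X is the centered
# Euclidean ball of diameter D (radius D / 2).

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (elementwise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def project_ball(w, radius):
    """Euclidean projection onto the centered ball of the given radius."""
    norm = np.linalg.norm(w)
    return w if norm <= radius else w * (radius / norm)

def comd_map(w_hat, g, eta, lam, radius):
    """Composite MD map: gradient step, prox of lam*||.||_1, then project."""
    return project_ball(soft_threshold(w_hat - eta * g, eta * lam), radius)

# One round t of the optimistic scheme:
d, D, lam, c, t = 112, 20.0, 0.001, 1.0, 1
eta_t = c / np.sqrt(t)              # e.g., the OGD-style schedule eta_t = c * t^{-1/2}
w_hat = np.zeros(d)                 # auxiliary iterate kept between rounds
g_hint = np.zeros(d)                # optimistic hint (here: last round's gradient)

w_t = comd_map(w_hat, g_hint, eta_t, lam, D / 2)  # play the decision using the hint
x_t, y_t = np.random.randn(d), 1                  # sampled data (placeholder)
g_t = -y_t * x_t / (1.0 + np.exp(y_t * np.dot(w_t, x_t)))  # logistic-loss gradient
w_hat = comd_map(w_hat, g_t, eta_t, lam, D / 2)   # auxiliary update, observed gradient
g_hint = g_t                                      # carry the hint to the next round
```

Under the paper's theoretical tuning, `eta_t` would instead follow the schedules quoted in the table, e.g., $\eta_t = D(c + V_{t-1})^{-1/2}$ for Optimistic-OMD, with $V_{t-1}$ the gradient-variation quantity from the analysis.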