Optimal Regret Algorithm for Pseudo-1d Bandit Convex Optimization

Authors: Aadirupa Saha, Nagarajan Natarajan, Praneeth Netrapalli, Prateek Jain

ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present simulations in Section 5 that demonstrate the regret bounds on simple synthetic problems.
Researcher Affiliation | Industry | 1. Microsoft Research, New York City; 2. Microsoft Research, India; 3. The authors are currently at Google Research, India.
Pseudocode | Yes | Algorithm 1: OPTPBCO
Open Source Code | No | The paper does not provide any concrete access to source code for the methodology described.
Open Datasets | No | We present synthetic experiments that showcase the regret bounds established in Section 4. We work with a linear $g_t$ for all the experiments. We fix $\mathcal{W} = B_d(1)$, context vectors from $\{x_t : \|x_t\|_2 \le 1\}$, and the two loss functions (a) $f_t(w) = (\langle w, x_t\rangle - y^*_t)^2$ where $y^*_t = \langle w^*, x_t\rangle$, for a fixed $w^* \in B_d(1)$, and (b) $f_t(w) = |\langle w, x_t\rangle - y^*_t|$. (A hedged data-generation sketch appears after this table.)
Dataset Splits | No | The paper describes generating synthetic data and averaging over '50 problem instances' but does not specify training, validation, or test dataset splits.
Hardware Specification | No | The paper mentions running simulations but does not provide specific hardware details such as GPU/CPU models or memory.
Software Dependencies | No | The paper does not list specific software dependencies with version numbers used for the experiments.
Experiment Setup | Yes | We work with a linear $g_t$ for all the experiments. We fix $\mathcal{W} = B_d(1)$, context vectors from $\{x_t : \|x_t\|_2 \le 1\}$, and the two loss functions (a) $f_t(w) = (\langle w, x_t\rangle - y^*_t)^2$ where $y^*_t = \langle w^*, x_t\rangle$, for a fixed $w^* \in B_d(1)$, and (b) $f_t(w) = |\langle w, x_t\rangle - y^*_t|$. Additionally, Algorithm 1 specifies parameters: 'Run Kernelized Exponential Weights for PBCO (Algorithm 2) with $\eta = \sqrt{d \log(LT)}/(B\sqrt{T})$' and 'Run Online Gradient Descent for PBCO (Algorithm 3) with $\eta = W\delta/(DC\sqrt{T})$, $\delta = (WDC/(3L))^{1/2}\, T^{-1/4}$, $\alpha = \delta$, and $T$'. (A hedged parameter-setting sketch appears after this table.)
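
To make the quoted experiment setup concrete, here is a minimal data-generation sketch, assuming only what the table above states (decision set $B_d(1)$, unit-norm contexts, linear $g_t$, squared or absolute loss around $y^*_t$). The function name make_instance, the sampling scheme, and the default arguments are our own illustrative choices, not taken from the paper.

```python
import numpy as np

def make_instance(d=10, T=1000, loss="squared", seed=None):
    """Generate one synthetic pseudo-1d instance: f_t(w) = g_t(<w, x_t>).

    Assumed setup, following the quoted experiment description:
      - a fixed w* drawn from the unit ball B_d(1),
      - context vectors x_t with ||x_t||_2 <= 1,
      - targets y*_t = <w*, x_t>,
      - g_t the squared loss (a) or absolute loss (b) around y*_t.
    """
    rng = np.random.default_rng(seed)

    def unit_ball(n):
        # Uniform sampling from B_d(1): uniform direction, radius ~ U^(1/d).
        v = rng.normal(size=(n, d))
        v /= np.linalg.norm(v, axis=1, keepdims=True)
        return v * rng.uniform(size=(n, 1)) ** (1.0 / d)

    w_star = unit_ball(1)[0]
    X = unit_ball(T)              # rows are x_1, ..., x_T
    y_star = X @ w_star           # y*_t = <w*, x_t>

    if loss == "squared":         # (a) f_t(w) = (<w, x_t> - y*_t)^2
        f = lambda t, w: (X[t] @ w - y_star[t]) ** 2
    else:                         # (b) f_t(w) = |<w, x_t> - y*_t|
        f = lambda t, w: abs(X[t] @ w - y_star[t])
    return X, y_star, f
```

On such an instance, regret is measured against the best fixed $w \in B_d(1)$ (here $w^*$ itself, at which every $f_t$ is zero), and averaging over independently drawn instances matches the '50 problem instances' protocol quoted in the Dataset Splits row.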
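
The parameter settings quoted in the last row can likewise be turned into a small sketch. The function optpbco_params below is hypothetical, and the formulas implement our reconstruction of the garbled excerpt (the radicals and the $T^{-1/4}$ factor are inferred from the standard analysis, not verbatim); the excerpt does not say how Algorithm 1 chooses between its two subroutines, so only the parameter computations are shown.

```python
import numpy as np

def optpbco_params(T, d, L, B, W, D, C):
    """Step sizes in the spirit of Algorithm 1 (OPTPBCO).

    Symbols as in the quoted excerpt: T = horizon, d = dimension,
    L = Lipschitz constant of the losses, B = loss bound, and W, D, C =
    problem constants appearing in the OGD analysis. The formulas follow
    our reading of the garbled quote and should be verified against the
    paper before use.
    """
    # Kernelized Exponential Weights for PBCO (Algorithm 2):
    eta_kew = np.sqrt(d * np.log(L * T)) / (B * np.sqrt(T))

    # Online Gradient Descent for PBCO (Algorithm 3):
    delta = (W * D * C / (3.0 * L)) ** 0.5 * T ** -0.25  # smoothing radius
    eta_ogd = W * delta / (D * C * np.sqrt(T))           # OGD step size
    alpha = delta                                        # alpha = delta per the quote
    return {"kew": {"eta": eta_kew},
            "ogd": {"eta": eta_ogd, "delta": delta, "alpha": alpha}}
```

For example, optpbco_params(T=10_000, d=10, L=1.0, B=1.0, W=1.0, D=1.0, C=1.0) returns the step sizes one would pass to the two subroutines on a unit-scale instance.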