
Online Stochastic Linear Optimization under One-bit Feedback

Authors: Lijun Zhang, Tianbao Yang, Rong Jin, Yichi Xiao, Zhi-Hua Zhou

ICML 2016

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this section, we present experimental results to demonstrate the effectiveness of the proposed algorithm.
Researcher Affiliation	Collaboration	Lijun Zhang (ZHANGLJ@LAMDA.NJU.EDU.CN), National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China; Tianbao Yang (TIANBAO-YANG@UIOWA.EDU), Department of Computer Science, The University of Iowa, Iowa City, IA 52242, USA; Rong Jin (JINRONG.JR@ALIBABA-INC.COM), Alibaba Group, Seattle, USA; Yichi Xiao (XIAOYC@LAMDA.NJU.EDU.CN) and Zhi-Hua Zhou (ZHOUZH@LAMDA.NJU.EDU.CN), National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China
Pseudocode	Yes	Algorithm 1 Online Learning for Logit Model (OL2M)
Open Source Code	No	The paper does not provide any concrete access to source code for the methodology described, nor does it explicitly state that the code will be released.
Open Datasets	No	We sample a point uniformly at random from the (d-1)-sphere as w, and each time the learner submits an action x_t, a one-bit feedback y_t ∈ {±1} is generated according to the logit model in (3). [...] The decision set D ⊆ R^d is constructed by sampling 10d points uniformly at random from the (d-1)-sphere.
Dataset Splits	No	The paper describes an online learning setting where data is revealed sequentially. It does not explicitly define or provide details for traditional training, validation, and test dataset splits.
Hardware Specification	No	The paper does not specify any particular hardware (e.g., GPU models, CPU models, or memory) used for running the experiments.
Software Dependencies	No	The paper mentions the use of 'CVX package' but does not specify a version number for it, nor does it list other software dependencies with specific versions.
Experiment Setup	Yes	To apply our algorithm, we need to determine the values of two parameters: λ and γ_t. λ is introduced to make Z_t invertible, and the performance of our algorithm is insensitive to its value. Thus, we simply choose λ = 1 in the following. γ_t is an essential parameter which is the width of the confidence region, and its value is tuned as c log(det(Z_t)/det(Z_1)) according to (12), where c is searched in the range of [1e-3, 1].
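
Because the paper releases no code, the synthetic setup quoted in the Open Datasets and Experiment Setup rows can only be approximated. The following is a minimal sketch, not the authors' implementation. It assumes the logit model in (3) takes the standard logistic form Pr[y = +1 | x] = 1 / (1 + exp(-x·w)), which is not reproduced on this page. All function names are hypothetical, and the log-spaced grid for c is an illustrative choice, since the source gives only the range [1e-3, 1].

```python
import math
import random


def sample_unit_sphere(d, rng):
    """Draw one point uniformly from the (d-1)-sphere by normalizing a Gaussian vector."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 0.0:
            return [a / norm for a in v]


def make_decision_set(d, rng):
    """Build a decision set of 10*d points on the sphere, as in the quoted setup."""
    return [sample_unit_sphere(d, rng) for _ in range(10 * d)]


def logit_feedback(x, w, rng):
    """Return +1 with probability 1 / (1 + exp(-x.w)), else -1.

    Assumption: this is the standard logistic form; equation (3) is not shown here.
    """
    margin = sum(a * b for a, b in zip(x, w))
    p_plus = 1.0 / (1.0 + math.exp(-margin))
    return 1 if rng.random() < p_plus else -1


def confidence_width(c, logdet_zt, logdet_z1):
    """gamma_t = c * log(det(Z_t) / det(Z_1)), taken from log-determinants for stability."""
    return c * (logdet_zt - logdet_z1)


rng = random.Random(0)
d = 5
w = sample_unit_sphere(d, rng)           # hidden parameter vector
decision_set = make_decision_set(d, rng)  # 10*d candidate actions
y = logit_feedback(decision_set[0], w, rng)

# Hypothetical log-spaced candidates for c inside [1e-3, 1]; the grid is an assumption.
c_grid = [10.0 ** e for e in (-3, -2, -1, 0)]
```

Passing log-determinants rather than raw determinants avoids overflow when Z_t grows over many rounds; how Z_t is updated each round is defined by the paper's Algorithm 1 and is deliberately left out of this sketch.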