reproducibilityindex.ai

Online Learning and Pricing with Reusable Resources: Linear Bandits with Sub-Exponential Rewards

Authors: Huiwen Jia, Cong Shi, Siqian Shen

ICML 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We implement four pricing algorithms: BLinUCB and three benchmark policies with epsilon = 0.3, 0.2, and 0.1, i.e., the probability for conducting exploration. We compare the results of the above four pricing policies with state-independent optimal price (OPT). We present two figures for the results of each instance (see Figure 2): the first row shows the offered price over periods of each algorithm and the second row depicts the cumulative time-average relaxed regret, i.e., $(\sum_{t'=1}^{t} J^{LP}_{t'} - \sum_{t'=1}^{t} J^{\pi}_{t'})/t$.
Researcher Affiliation	Academia	Department of Industrial and Operations Engineering, University of Michigan, Ann Arbor, MI 48109.
Pseudocode	Yes	Algorithm 1: Online Batch LinUCB Algorithm (BLinUCB). Algorithm 2: epsilon-greedy Benchmark.
Open Source Code	No	The paper does not provide any explicit statements about making the source code available or links to a code repository.
Open Datasets	No	The paper describes generating data for its experiments ('The total operation time horizon is 8000 periods and the capacity of the reusable resource is c = 100. We choose the price from a fixed range of [10, 18]... We consider three scenarios...'), but it does not specify the use of any publicly available datasets nor does it provide access information (e.g., links, DOIs, citations) for the generated data.
Dataset Splits	No	The paper describes its numerical experiments over a 'total operation time horizon' but does not specify explicit training, validation, or test dataset splits.
Hardware Specification	No	The paper describes the parameters of its numerical experiments (e.g., 'total operation time horizon is 8000 periods', 'capacity of the reusable resource is c = 100'), but it does not provide any specific details about the hardware (e.g., CPU, GPU, memory, cloud instances) used to run these simulations.
Software Dependencies	No	The paper does not specify any software dependencies with version numbers (e.g., specific programming languages, libraries, or solvers and their versions) used for implementation or experimentation.
Experiment Setup	Yes	The total operation time horizon is 8000 periods and the capacity of the reusable resource is c = 100. We choose the price from a fixed range of [10, 18]... We consider a three-dimensional feature vector (p, φ(p), 1)... We consider three scenarios of the arrival rates associated with candidate prices and thus the corresponding system dynamics (three instances correspondingly)... For each instance, we implement four pricing algorithms: BLinUCB and three benchmark policies with epsilon = 0.3, 0.2, and 0.1...
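
Since the paper releases no code, the following is only a minimal Python sketch of two quantities named in the table: the cumulative time-average relaxed regret (the benchmark's cumulative value minus the policy's cumulative reward, divided by the number of elapsed periods) and a generic epsilon-greedy price choice. It is not the authors' implementation. The function names, the per-period input arrays, and the evenly spaced price grid over [10, 18] are assumptions made for illustration; the paper's Algorithm 2 may differ in its details.

```python
import numpy as np


def time_average_regret(j_lp, j_pi):
    """Cumulative time-average regret after each period t = 1, ..., T.

    j_lp: per-period value of the relaxed benchmark (hypothetical input).
    j_pi: per-period reward collected by the pricing policy (hypothetical input).
    """
    j_lp = np.asarray(j_lp, dtype=float)
    j_pi = np.asarray(j_pi, dtype=float)
    if j_lp.shape != j_pi.shape:
        raise ValueError("j_lp and j_pi must have the same length")
    periods = np.arange(1, len(j_lp) + 1)
    # (sum of benchmark values - sum of policy rewards) / t, for every t
    return (np.cumsum(j_lp) - np.cumsum(j_pi)) / periods


def epsilon_greedy_price(prices, estimated_rewards, epsilon, rng):
    """Generic epsilon-greedy choice over a finite set of candidate prices.

    With probability epsilon, explore a uniformly random candidate price;
    otherwise exploit the price with the highest current reward estimate.
    """
    prices = np.asarray(prices, dtype=float)
    if rng.random() < epsilon:
        return float(rng.choice(prices))
    return float(prices[int(np.argmax(estimated_rewards))])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Assumed grid: nine evenly spaced prices in the reported range [10, 18].
    candidate_prices = np.linspace(10.0, 18.0, 9)
    # Placeholder estimates, purely illustrative (not results from the paper).
    estimates = -(candidate_prices - 14.0) ** 2
    for eps in (0.3, 0.2, 0.1):
        price = epsilon_greedy_price(candidate_prices, estimates, eps, rng)
        print(f"epsilon={eps}: offered price {price}")
    print(time_average_regret([2.0, 2.0, 2.0], [1.0, 2.0, 0.0]))
```

Setting epsilon to 0 makes the choice purely greedy, while larger values such as the reported 0.3 spend more periods on exploration.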