Contextual Dynamic Pricing with Unknown Noise: Explore-then-UCB Strategy and Improved Regrets

Authors: Yiyun Luo, Will Wei Sun, Yufeng Liu

NeurIPS 2022

Reproducibility Variable Result LLM Response
Research Type: Experimental — We conduct numerical experiments to support our theoretical regret bounds of ExUCB under both Case (A) and Case (B). ... We plot the average cumulative regrets versus the time periods on a log-log scale in Figure 2, along with 95% confidence intervals. The linear fits extract a slope of 0.670 for Case (A) and a slope of 0.724 for Case (B), which indicates that our proved regrets of O(T^(2/3)) for Case (A) and O(T^(3/4)) for Case (B) are sharp.
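The slope extraction described above is a least-squares linear fit in log-log space: if average regret grows like T^a, then log(regret) is roughly linear in log(T) with slope a. A minimal sketch follows, using synthetic regret data with a known T^(2/3) trend as a stand-in (the paper's actual regret trajectories are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in: average cumulative regret growing like T^(2/3),
# perturbed by small multiplicative noise.
T = np.arange(100, 10_001, 100)
regret = T ** (2.0 / 3.0) * (1.0 + 0.05 * rng.standard_normal(T.size))

# Linear fit on the log-log scale; the slope estimates the growth exponent.
slope, intercept = np.polyfit(np.log(T), np.log(regret), 1)
print(f"estimated exponent: {slope:.3f}")  # close to 2/3
```

A slope near 2/3 (resp. 3/4) on real ExUCB trajectories is what the paper reads as evidence that the proved O(T^(2/3)) and O(T^(3/4)) bounds are sharp.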
Researcher Affiliation: Academia — Yiyun Luo, Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, yiyun851@ad.unc.edu; Will Wei Sun, Krannert School of Management, Purdue University, West Lafayette, IN 47907, sun244@purdue.edu; Yufeng Liu, Department of Statistics and Operations Research, Department of Genetics, Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, yfliu@email.unc.edu
Pseudocode: Yes — Algorithm 1: Explore-then-UCB (ExUCB)... Algorithm 2: Inner UCB Algorithm
Open Source Code: Yes — We included all code in the supplementary material, accompanied by inline explanations, along with a Readme document with instructions for running the code to reproduce our results.
Open Datasets: No — Our numerical experiments used only simulated data, which has no relation to natural sciences or human subjects; the simulation serves solely to validate our theoretical results.
Dataset Splits: No — As we consider online learning problems, we did not pre-train the algorithms; however, we specify the hyperparameters and how they were chosen in Section 6. The paper uses simulated data generated sequentially and does not describe the explicit train/validation/test splits typical of offline supervised learning.
Hardware Specification: No — We ran all numerical experiments on a laptop and report the time consumed for each replication in our code. No specific hardware models (e.g., CPU, GPU) or detailed specifications are provided.
Software Dependencies: No — The paper does not mention specific software dependencies with version numbers (e.g., programming languages, libraries, or frameworks).
Experiment Setup: Yes — For both cases, we set the constants in Algorithm 1 as p_max = 50, B = 50, C1 = 1, C2 = 20, λ = 0.1. ... We specify the linear parameter θ0 = 30 and sample the i.i.d. covariates as x_t ~ Unif(1/2, 1). For Case (A), the noise distribution is set as the uniform mixture (3/4)Unif(−15, 0) + (1/4)Unif(0, 15); while for Case (B), we adopt another uniform mixture, (1/4)Unif(−15, 0) + (3/4)Unif(0, 15).
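The Case (A) data-generating process above can be sketched directly: covariates are Unif(1/2, 1), and noise is drawn from the stated two-component uniform mixture. The linear valuation form `theta0 * x + noise` is an assumption about how the paper combines these pieces (standard in linear contextual pricing models), not a quote from the text:

```python
import numpy as np

def simulate_case_a(rng, n, theta0=30.0):
    """Draw n (covariate, noise, valuation) triples under Case (A)'s setup."""
    x = rng.uniform(0.5, 1.0, size=n)          # x_t ~ Unif(1/2, 1), i.i.d.
    # Noise mixture: (3/4) Unif(-15, 0) + (1/4) Unif(0, 15).
    from_left = rng.random(n) < 0.75
    noise = np.where(from_left,
                     rng.uniform(-15.0, 0.0, size=n),
                     rng.uniform(0.0, 15.0, size=n))
    # Assumed linear valuation model (hypothetical combination of the pieces).
    valuation = theta0 * x + noise
    return x, noise, valuation

rng = np.random.default_rng(0)
x, noise, v = simulate_case_a(rng, 100_000)
print(f"mean noise: {noise.mean():.2f}")  # mixture mean is -3.75
```

Case (B) only swaps the mixture weights (1/4 on the negative component, 3/4 on the positive one), flipping the sign of the mean noise.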