Contextual Dynamic Pricing with Unknown Noise: Explore-then-UCB Strategy and Improved Regrets
Authors: Yiyun Luo, Will Wei Sun, Yufeng Liu
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct numerical experiments to support our theoretical regret bounds of ExUCB under both Case (A) and Case (B). ... We plot the log-log scale of average accumulative regrets versus the time periods in Figure 2 along with the 95% confidence intervals. The linear fits extract a slope of 0.670 for Case (A) and a slope of 0.724 for Case (B), which indicates that our proved regrets of O(T^{2/3}) for Case (A) and O(T^{3/4}) for Case (B) are sharp. (A sketch of this log-log slope fit appears below the table.) |
| Researcher Affiliation | Academia | Yiyun Luo, Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, yiyun851@ad.unc.edu; Will Wei Sun, Krannert School of Management, Purdue University, West Lafayette, IN 47907, sun244@purdue.edu; Yufeng Liu, Department of Statistics and Operations Research, Department of Genetics, and Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, yfliu@email.unc.edu |
| Pseudocode | Yes | Algorithm 1: Explore-then-UCB (ExUCB) ... Algorithm 2: Inner UCB Algorithm |
| Open Source Code | Yes | We included all code in the supplementary material. The code is accompanied by in-line explanations. We also provided a Readme document with instructions on running the code to reproduce our results. |
| Open Datasets | No | Our numerical experiments used only simulated data, which has no relation to the natural sciences or human subjects. The numerical simulation serves solely to validate our theoretical results. |
| Dataset Splits | No | As we considered online learning problems, we did not pre-train the algorithms. However, we did specify the hyperparameters and the way they were chosen in Section 6. The paper uses simulated data generated sequentially and does not describe explicit train/validation/test splits typical for offline supervised learning. |
| Hardware Specification | No | We ran all numerical experiments on a laptop. We reported the time consumed for each replication in our code. No specific hardware models (e.g., CPU, GPU) or detailed specifications are provided. |
| Software Dependencies | No | The paper does not mention any specific software dependencies with version numbers (e.g., programming languages, libraries, or frameworks). |
| Experiment Setup | Yes | For both cases, we set the constants in Algorithm 1 as p_max = 50, B = 50, C_1 = 1, C_2 = 20, λ = 0.1. ... we specify the linear parameter θ_0 = 30 and sample the i.i.d. covariates as x_t ~ Unif(1/2, 1). For Case (A), the noise distribution is set as the Uniform mixture (3/4)·Unif(−15, 0) + (1/4)·Unif(0, 15); while for Case (B), we adopt another Uniform mixture (1/4)·Unif(−15, 0) + (3/4)·Unif(0, 15). (A simulation sketch of this setup appears below the table.) |
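To make the Experiment Setup row concrete, here is a minimal Python sketch of the described data-generating process. The constants and distributions match the quoted setup; the valuation model v_t = θ_0·x_t + noise is an assumption reconstructed from the paper's linear-valuation setting, and the function names are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

T = 10_000          # number of time periods (illustrative choice, not from the paper)
theta0 = 30.0       # linear parameter θ_0 from the quoted setup

# i.i.d. covariates x_t ~ Unif(1/2, 1)
x = rng.uniform(0.5, 1.0, size=T)

def mixture_noise(n, p_neg):
    """Draw n samples from p_neg * Unif(-15, 0) + (1 - p_neg) * Unif(0, 15)."""
    neg = rng.random(n) < p_neg
    return np.where(neg, rng.uniform(-15.0, 0.0, n), rng.uniform(0.0, 15.0, n))

noise_A = mixture_noise(T, 3 / 4)   # Case (A): (3/4)Unif(-15, 0) + (1/4)Unif(0, 15)
noise_B = mixture_noise(T, 1 / 4)   # Case (B): (1/4)Unif(-15, 0) + (3/4)Unif(0, 15)

# Assumed linear valuation model: customer t buys iff the posted price <= v_t.
v_A = theta0 * x + noise_A
v_B = theta0 * x + noise_B
```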
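Similarly, the slopes of 0.670 and 0.724 quoted in the Research Type row come from a linear fit on the log-log scale of average cumulative regret versus time. The sketch below shows that fit, using a synthetic T^{2/3}-shaped curve in place of an actual ExUCB regret trajectory (which this page does not contain).

```python
import numpy as np

def loglog_slope(avg_cum_regret):
    """Slope of log(regret) vs. log(t), estimated by least squares."""
    t = np.arange(1, len(avg_cum_regret) + 1)
    slope, _intercept = np.polyfit(np.log(t), np.log(avg_cum_regret), 1)
    return slope

# Placeholder curve growing like T^{2/3}; a real run would feed ExUCB's
# average cumulative regret from the simulation above.
fake_regret = np.arange(1.0, 10_001.0) ** (2 / 3)
print(loglog_slope(fake_regret))  # ≈ 0.667, consistent with an O(T^{2/3}) rate
```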