Lipschitz Bandits with Batched Feedback
Authors: Yasong Feng, Zengfeng Huang, Tianyu Wang
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present numerical studies of A-BLiN. In the experiments, we use the arm space $A = [0, 1]^2$ and the expected reward function $\mu(x) = 1 - \frac{1}{2} \lVert x - x_1 \rVert_2 - \frac{3}{10} \lVert x - x_2 \rVert_2$, where $x_1 = (0.8, 0.7)$ and $x_2 = (0.1, 0.1)$. The landscape of $\mu$ and the resulting partition is shown in Figure 2(a). |
| Researcher Affiliation | Academia | Yasong Feng, Shanghai Center for Mathematical Sciences, Fudan University, ysfeng20@fudan.edu.cn; Zengfeng Huang, School of Data Science, Fudan University, huangzf@fudan.edu.cn; Tianyu Wang, Shanghai Center for Mathematical Sciences, Fudan University, wangtianyu@fudan.edu.cn |
| Pseudocode | Yes | Algorithm 1 Batched Lipschitz Narrowing (BLiN) |
| Open Source Code | Yes | Our code is available at https://github.com/FengYasong-fifol/Batched-Lipschitz-Narrowing. |
| Open Datasets | No | The paper defines a synthetic 'arm space $A = [0, 1]^2$ and the expected reward function $\mu(x)$' for its experiments rather than using a publicly available dataset. No dataset access information is therefore provided or applicable. |
| Dataset Splits | No | The paper describes a simulated bandit problem rather than experiments on a dataset with traditional train/validation/test splits. Therefore, no specific dataset split information is provided. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. It mentions theoretical time complexity but no actual hardware used. |
| Software Dependencies | No | The paper states 'Our code is available at https://github.com/FengYasong-fifol/Batched-Lipschitz-Narrowing', and mentions in the ethics checklist that training details were specified, but it does not list specific software dependencies with version numbers within the main text or any directly quoted section. |
| Experiment Setup | Yes | In the experiments, we use the arm space $A = [0, 1]^2$ and the expected reward function $\mu(x) = 1 - \frac{1}{2} \lVert x - x_1 \rVert_2 - \frac{3}{10} \lVert x - x_2 \rVert_2$, where $x_1 = (0.8, 0.7)$ and $x_2 = (0.1, 0.1)$. ... We let the time horizon $T = 80000$... For this experiment, $r_1 = \frac{1}{8}$, $r_3 = \frac{1}{16}$, $r_4 = \frac{1}{32}$, which is the ACE sequence (rounded as in Remark 2) for $d = 2$ and $d_z = 0$. |
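
The synthetic environment quoted in the Experiment Setup row can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code. It assumes the reward function reads µ(x) = 1 − (1/2)‖x − x1‖₂ − (3/10)‖x − x2‖₂; the minus signs and norms are an assumed reading of the extracted text. The names `mu`, `cube_centers`, `X1`, and `X2` are hypothetical, and the edge length 1/8 is taken from the quoted setup.

```python
import math

# Reference points quoted in the experiment setup.
X1 = (0.8, 0.7)
X2 = (0.1, 0.1)


def mu(x):
    """Expected reward under the ASSUMED reading of the formula:
    mu(x) = 1 - (1/2)*||x - X1||_2 - (3/10)*||x - X2||_2."""
    d1 = math.hypot(x[0] - X1[0], x[1] - X1[1])
    d2 = math.hypot(x[0] - X2[0], x[1] - X2[1])
    return 1.0 - 0.5 * d1 - 0.3 * d2


def cube_centers(edge):
    """Centers of the axis-aligned cubes of the given edge length that tile
    [0, 1]^2. Hypothetical helper; assumes 1/edge is an integer."""
    n = round(1.0 / edge)
    return [((i + 0.5) * edge, (j + 0.5) * edge)
            for i in range(n) for j in range(n)]


# Tile the arm space with cubes of edge length 1/8 and find the cube
# whose center has the highest expected reward.
centers = cube_centers(1 / 8)
best = max(centers, key=mu)
print(len(centers), best, round(mu(best), 4))
```

Under this reading µ is concave and peaks at x1, so the best-scoring cube center is the one closest to (0.8, 0.7). A batched elimination scheme would keep cubes near that point and discard the rest; this sketch only evaluates the noiseless mean and does not implement elimination or batching.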