Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Lipschitz Bandits in Optimal Space
Authors: Xiaoyi Zhu, Zengfeng Huang
ICLR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also conduct numerical simulations, and the results show that our new algorithm achieves regret comparable to the state-of-the-art while reducing memory usage by orders of magnitude. In this section, we evaluate the Log-Li algorithm. |
| Researcher Affiliation | Academia | Xiaoyi Zhu, School of Data Science, Fudan University, Shanghai, China; Zengfeng Huang, School of Data Science, Fudan University, Shanghai, China |
| Pseudocode | Yes | The learning process is summarized in Algorithm 1 and 2. Algorithm 1: Logarithmic Space Lipschitz for Each round (Round Func) Input: Time horizon $T$; current time $t$; maximum depth $m$; current depth $h$; current cube $C$; comparison arm reward $\mu_{m-1}$; current round max reward $\mu_m$. Algorithm 2: Logarithmic Space Lipschitz (Log-Li) Input: Arm set $A = [0, 1]^d$; time horizon $T$. |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that the source code for the described methodology is publicly available. |
| Open Datasets | No | In the experiment, the time horizon $T = 100{,}000$, the arm space is $[0, 1]^2$ and the expected reward function is $\mu(x) = 1 - \lVert x - x_1 \rVert_2 - 0.5 \lVert x - x_2 \rVert_2$ for different values of $x_1$ and $x_2$. The paper uses a synthetic reward function for its experiments rather than an external, publicly available dataset. |
| Dataset Splits | No | The paper uses a synthetic reward function for its experiments rather than an external dataset, so no dataset splits (training/test/validation) are applicable or mentioned. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used in the implementation of the algorithms or experiments. |
| Experiment Setup | Yes | In the experiment, the time horizon $T = 100{,}000$, the arm space is $[0, 1]^2$ and the expected reward function is $\mu(x) = 1 - \lVert x - x_1 \rVert_2 - 0.5 \lVert x - x_2 \rVert_2$ for different values of $x_1$ and $x_2$. |
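
Because no source code is released, the synthetic setup in the Experiment Setup row has to be re-implemented from the description. Below is a minimal sketch of the expected reward function, assuming it reads as 1 minus the Euclidean distance to x1 minus 0.5 times the Euclidean distance to x2 (the signs are a reconstruction from garbled text). The function names and the anchor points `X1` and `X2` are hypothetical; the summary does not state which values of x1 and x2 were used, and the reward noise model is not described, so only the mean reward is shown.

```python
import math

# Hypothetical anchor points in [0, 1]^2; the summary only says that
# "different values of x1 and x2" were used, without listing them.
X1 = (0.8, 0.7)
X2 = (0.1, 0.2)


def expected_reward(x, x1=X1, x2=X2):
    """Assumed mean reward: 1 - ||x - x1||_2 - 0.5 * ||x - x2||_2."""
    return 1.0 - math.dist(x, x1) - 0.5 * math.dist(x, x2)


def grid_best_arm(n=101, x1=X1, x2=X2):
    """Brute-force the best arm on an n-by-n grid over [0, 1]^2.

    Only a sanity check for regret computations; it is not part of
    the Log-Li algorithm and uses far more memory than that algorithm.
    """
    best_x, best_mu = None, -math.inf
    for i in range(n):
        for j in range(n):
            x = (i / (n - 1), j / (n - 1))
            mu = expected_reward(x, x1, x2)
            if mu > best_mu:
                best_x, best_mu = x, mu
    return best_x, best_mu
```

Under this assumed form the function is 1.5-Lipschitz in the Euclidean norm, since each distance term is 1-Lipschitz. The grid maximum can serve as the reference value when computing regret over the time horizon T = 100,000.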