Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Lasso Bandit with Compatibility Condition on Optimal Arm
Authors: Harin Lee, Taehyun Hwang, Min-hwan Oh
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We propose an algorithm that adapts the forced-sampling technique and prove that the proposed algorithm achieves O(poly log d T) regret under the margin condition. Through numerical experiments, we confirm the superior performance of our proposed algorithm. |
| Researcher Affiliation | Academia | Harin Lee, Seoul National University; Taehyun Hwang, Seoul National University; Min-hwan Oh, Seoul National University |
| Pseudocode | Yes | Algorithm 1 FS-WLasso (Forced-Sampling then Weighted Loss Lasso) ... Algorithm 2 FS-Lasso (Forced Sampling with Lasso) |
| Open Source Code | Yes | We have also included the data and code, along with instructions to reproduce the main experimental results, in the supplementary material. |
| Open Datasets | No | We perform numerical evaluations on synthetic datasets. We set d = 100, T = 2000, and η_t ~ N(0, 0.25). For given s_0, we sample S_0 uniformly from all subsets of [d] with size s_0, then sample β_{S_0} uniformly from an s_0-dimensional unit sphere. |
| Dataset Splits | No | The paper uses synthetic datasets and describes how the data is generated for numerical evaluations, specifying parameters such as d = 100, T = 2000, and the noise distribution. It does not mention train/test/validation splits; this is expected for sequential decision-making problems such as bandits, where conventional supervised-learning splits do not apply. |
| Hardware Specification | Yes | All experiments were held in a computing cluster with twenty Intel(R) Xeon(R) Silver 4210R CPUs and 187 GB of RAM. |
| Software Dependencies | No | The paper does not mention any specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, or specific libraries). |
| Experiment Setup | Yes | For both experiments, we set d = 100, T = 2000, and η_t ~ N(0, 0.25). For given s_0, we sample S_0 uniformly from all subsets of [d] with size s_0, then sample β_{S_0} uniformly from an s_0-dimensional unit sphere. We tune the hyper-parameters of each algorithm to achieve their best performance. ... In Experiment 1, we set M_0 = 10 and w = 1. ... In Experiment 2, we set the fixed sub-optimal arms to have expected rewards of 0.1, 0.2, ..., 0.9, and sample the expected reward of the optimal arm from Unif(0.9, 1). |
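The synthetic data generation quoted in the table (sparse β supported on a random size-s_0 subset, unit-norm on its support, Gaussian noise) can be sketched as below. This is a minimal illustration, not the authors' released code: the choice s_0 = 5, the standard-normal arm features, and reading N(0, 0.25) as variance 0.25 (i.e., standard deviation 0.5) are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

d, T = 100, 2000                  # dimensions from the paper's setup
s0 = 5                            # sparsity level (assumed example value)
noise_sd = np.sqrt(0.25)          # assuming N(0, 0.25) specifies the variance

# Sample the support S0 uniformly from all size-s0 subsets of [d].
S0 = rng.choice(d, size=s0, replace=False)

# Sample beta restricted to S0 uniformly from the s0-dimensional unit sphere
# (a normalized Gaussian vector is uniform on the sphere).
beta = np.zeros(d)
v = rng.standard_normal(s0)
beta[S0] = v / np.linalg.norm(v)

# One round of reward generation for an arm with (assumed Gaussian) feature x_t.
x_t = rng.standard_normal(d)
reward = x_t @ beta + rng.normal(0.0, noise_sd)
```

A bandit algorithm such as FS-WLasso would observe only the chosen arm's reward each round and fit a Lasso estimate of β from the accumulated (x_t, reward) pairs.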