Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Lasso Bandit with Compatibility Condition on Optimal Arm

Authors: Harin Lee, Taehyun Hwang, Min-hwan Oh

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We propose an algorithm that adapts the forced-sampling technique and prove that the proposed algorithm achieves O(poly log dT) regret under the margin condition. Through numerical experiments, we confirm the superior performance of our proposed algorithm."
Researcher Affiliation | Academia | "Harin Lee, Seoul National University; Taehyun Hwang, Seoul National University; Min-hwan Oh, Seoul National University"
Pseudocode | Yes | "Algorithm 1: FS-WLasso (Forced-Sampling then Weighted Loss Lasso) ... Algorithm 2: FS-Lasso (Forced Sampling with Lasso)"
Open Source Code | Yes | "We have also included the data and code, along with instructions to reproduce the main experimental results, in the supplementary material."
Open Datasets | No | "We perform numerical evaluations on synthetic datasets. We set d = 100, T = 2000, and ηt ∼ N(0, 0.25). For a given s0, we sample S0 uniformly from all subsets of [d] with size s0, then sample β_{S0} uniformly from an s0-dimensional unit sphere."
Dataset Splits | No | The paper uses synthetic datasets and describes how the data are generated for its numerical evaluations, specifying parameters such as d = 100, T = 2000, and the noise distribution. It does not mention train/validation/test splits; such splits are typical of conventional supervised learning rather than of sequential decision-making settings like bandits.
Hardware Specification | Yes | "All experiments were held in a computing cluster with twenty Intel(R) Xeon(R) Silver 4210R CPUs and 187 GB of RAM."
Software Dependencies | No | The paper does not mention any specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, or specific libraries).
Experiment Setup | Yes | "For both experiments, we set d = 100, T = 2000, and ηt ∼ N(0, 0.25). For a given s0, we sample S0 uniformly from all subsets of [d] with size s0, then sample β_{S0} uniformly from an s0-dimensional unit sphere. We tune the hyper-parameters of each algorithm to achieve their best performance. ... In Experiment 1, we set M0 = 10 and w = 1. ... In Experiment 2, we set the fixed sub-optimal arms to have expected rewards of 0.1, 0.2, ..., 0.9, and sample the expected reward of the optimal arm from Unif(0.9, 1)."
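As a rough illustration of the synthetic setup quoted above, the sketch below generates the sparse parameter as described (support S0 drawn uniformly from size-s0 subsets of [d], β_{S0} uniform on the unit sphere) and recovers it with a plain Lasso fit. This is not the authors' code: the sparsity level s0, the regularization weight lam, the Gaussian contexts, and the ISTA (proximal-gradient) solver are all illustrative choices, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

d, T = 100, 2000        # dimension and horizon, as stated in the paper
s0 = 5                  # sparsity level (illustrative; the paper varies s0)
noise_sd = 0.5          # eta_t ~ N(0, 0.25)  =>  standard deviation 0.5

# Sample the support S0 uniformly from all size-s0 subsets of [d],
# then draw beta_{S0} uniformly from the s0-dimensional unit sphere
# (a normalized Gaussian vector is uniform on the sphere).
S0 = rng.choice(d, size=s0, replace=False)
v = rng.standard_normal(s0)
beta = np.zeros(d)
beta[S0] = v / np.linalg.norm(v)

def lasso_ista(X, y, lam, n_iter=500):
    """Lasso via ISTA (proximal gradient); a generic solver, not the paper's."""
    n, p = X.shape
    b = np.zeros(p)
    L = np.linalg.norm(X, 2) ** 2 / n        # Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) / n         # gradient of (1/2n)||y - Xb||^2
        z = b - grad / L
        b = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold
    return b

# Simulated rewards from random contexts under the linear model above.
X = rng.standard_normal((T, d))
y = X @ beta + rng.normal(0.0, noise_sd, size=T)
beta_hat = lasso_ista(X, y, lam=0.05)
```

With T = 2000 observations and only s0 = 5 active coordinates, the Lasso estimate should land close to β; the bandit algorithms in the paper (FS-WLasso, FS-Lasso) additionally decide which arm generates each row of X, which this sketch does not model.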