Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Principal-Agent Bandit Games with Self-Interested and Exploratory Learning Agents
Authors: Junyan Liu, Lillian J. Ratliff
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | This paper studies the repeated principal-agent bandit game, where the principal indirectly explores an unknown environment by incentivizing an agent to play arms. We propose algorithms for both i.i.d. and linear reward settings with bandit feedback over a finite horizon T, achieving regret bounds of Õ(√T) and Õ(T^{2/3}), respectively. |
| Researcher Affiliation | Academia | 1Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA USA 2Electrical & Computer Engineering, University of Washington, Seattle, WA USA. |
| Pseudocode | Yes | "Algorithm 1: Proposed algorithm for i.i.d. reward" |
| Open Source Code | No | The paper does not contain any statement about code availability or links to code repositories. |
| Open Datasets | No | The paper describes a theoretical framework for principal-agent bandit games and does not use any specific dataset for empirical evaluation. |
| Dataset Splits | No | The paper focuses on theoretical algorithm design and regret bounds for bandit games, and does not involve empirical evaluation on datasets, thus no dataset splits are provided. |
| Hardware Specification | No | The paper presents theoretical algorithms and regret analysis for bandit games, without conducting empirical experiments that would require specific hardware specifications. |
| Software Dependencies | No | The paper focuses on theoretical algorithm design and provides mathematical analysis, without detailing specific software or library versions used for implementation or simulation. |
| Experiment Setup | No | The paper presents a theoretical study of principal-agent bandit games, focusing on algorithm design and regret bounds, and therefore does not describe any experimental setup or hyperparameters. |
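The interaction protocol summarized in the Research Type row can be illustrated with a minimal toy simulation. This is not the paper's algorithm; it is a hedged sketch of one round of the game, assuming a self-interested agent that greedily maximizes its own mean reward plus the offered incentive, and a principal that receives bandit feedback on the chosen arm. All names, values, and the uniform reward model are illustrative assumptions.

```python
import random

def simulate_round(agent_means, incentives, rng):
    """One round of a toy principal-agent bandit game.

    The agent greedily picks the arm maximizing its own expected payoff
    (private mean reward plus the principal's offered incentive). The
    principal observes only the chosen arm and a noisy reward for it
    (bandit feedback), and pays the incentive on that arm.
    """
    chosen = max(range(len(agent_means)),
                 key=lambda a: agent_means[a] + incentives[a])
    reward = rng.random()  # illustrative noisy reward in [0, 1]
    return chosen, reward - incentives[chosen]

rng = random.Random(0)
agent_means = [0.2, 0.5, 0.1]   # agent's private arm values (assumed)
incentives = [0.4, 0.0, 0.0]    # principal subsidizes arm 0
arm, net = simulate_round(agent_means, incentives, rng)
print(arm)  # arm 0: payoff 0.2 + 0.4 = 0.6 beats arm 1's 0.5
```

The principal's challenge, per the abstract, is that it never observes the agent's private values directly; it can only steer exploration indirectly through the incentives, which is what drives the regret analysis.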