Contracting with a Learning Agent
Authors: Guru Guruganesh, Yoav Kolumbus, Jon Schneider, Inbal Talgam-Cohen, Emmanouil-Vasileios Vlatakis-Gkaragkounis, Joshua Wang, S. Matthew Weinberg
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we initiate the study of repeated contracts with learning agents, focusing on those achieving no-regret outcomes. For the canonical setting where the agent's actions result in success or failure, we present a simple, optimal solution for the principal: Initially provide a linear contract with scalar α > 0, then switch to a zero-scalar contract. This shift causes the agent to free-fall through their action space, yielding non-zero rewards for the principal at zero cost. Interestingly, despite the apparent exploitation, there are instances where our dynamic contract can make both players better off compared to the best static contract. We then broaden the scope of our results to general linearly-scaled contracts, and, finally, to the best of our knowledge, we provide the first analysis of optimization against learning agents with uncertainty about the time horizon. *(An illustrative simulation of this dynamic contract follows the table.)* |
| Researcher Affiliation | Collaboration | Google Research (gurug@google.com, jschnei@google.com, joshuawang@google.com); Cornell University (yoav.kolumbus@cornell.edu); Tel Aviv University (inbaltalgam@gmail.com); University of California, Berkeley (emvlatakis@berkeley.edu); Princeton University (smweinberg@princeton.edu) |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper is theoretical and does not release source code for its methodology; the only code artifact mentioned is a Colab link in Appendix G.1 that verifies a specific theoretical example. |
| Open Datasets | No | The paper is purely theoretical and does not use datasets, thus no information about training data availability is provided. |
| Dataset Splits | No | The paper is purely theoretical and does not involve dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is purely theoretical and does not conduct experiments requiring specific hardware specifications. |
| Software Dependencies | No | The paper is purely theoretical and does not conduct experiments that would require detailing specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is purely theoretical and does not describe any experimental setup details such as hyperparameters or training configurations. |
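As a complement to the abstract's description of the free-fall dynamic, here is a minimal simulation sketch. This is not the authors' code: the instance below (the costs, success probabilities, reward, scalar α, horizon, and switch round are all hypothetical), and the agent is modeled as a standard Hedge (multiplicative-weights) learner, one common mean-based no-regret algorithm.

```python
import numpy as np

# Illustrative sketch (not the paper's code): a Hedge agent facing a linear
# contract with scalar alpha > 0 that the principal later drops to zero.
# All instance parameters below are hypothetical, chosen for illustration.

rng = np.random.default_rng(0)

r = 1.0                               # principal's reward on success
costs = np.array([0.0, 0.1, 0.3])     # agent's cost per action
probs = np.array([0.1, 0.5, 0.9])     # success probability per action
T = 10_000                            # time horizon
switch = 6_000                        # round at which alpha drops to 0
alpha_hi = 0.6                        # scalar of the initial linear contract
eta = np.sqrt(np.log(len(costs)) / T) # Hedge learning rate

weights = np.ones(len(costs))
principal_total = 0.0
agent_total = 0.0

for t in range(T):
    alpha = alpha_hi if t < switch else 0.0
    # Agent's expected utility per action under the current contract:
    # alpha * r * p_i (expected payment) minus the action's cost c_i.
    agent_util = alpha * r * probs - costs
    # Sample an action from the Hedge distribution.
    dist = weights / weights.sum()
    a = rng.choice(len(costs), p=dist)
    success = rng.random() < probs[a]
    principal_total += (1 - alpha) * r * success
    agent_total += alpha * r * success - costs[a]
    # Mean-based multiplicative-weights update on expected utilities.
    weights *= np.exp(eta * agent_util)

print(f"principal avg utility: {principal_total / T:.3f}")
print(f"agent avg utility:     {agent_total / T:.3f}")
```

Under these assumptions, after the switch the zero-scalar contract makes every costly action strictly unprofitable for the learner, so its weight mass gradually drifts down through the action space while the principal still collects the full reward on each success at zero payment.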