Contracting with a Learning Agent

Authors: Guru Guruganesh, Yoav Kolumbus, Jon Schneider, Inbal Talgam-Cohen, Emmanouil-Vasileios Vlatakis-Gkaragkounis, Joshua Wang, S. Matthew Weinberg

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this paper, we initiate the study of repeated contracts with learning agents, focusing on those achieving no-regret outcomes. For the canonical setting where the agent's actions result in success or failure, we present a simple, optimal solution for the principal: initially provide a linear contract with scalar α > 0, then switch to a zero-scalar contract. This shift causes the agent to free-fall through their action space, yielding non-zero rewards for the principal at zero cost. Interestingly, despite the apparent exploitation, there are instances where our dynamic contract can make both players better off compared to the best static contract. We then broaden the scope of our results to general linearly-scaled contracts and, finally, to the best of our knowledge, provide the first analysis of optimization against learning agents with uncertainty about the time horizon. (A toy simulation of the free-fall dynamic appears after this table.)
Researcher Affiliation | Collaboration | Google Research (gurug@google.com, jschnei@google.com, joshuawang@google.com); Cornell University (yoav.kolumbus@cornell.edu); Tel Aviv University (inbaltalgam@gmail.com); UC Berkeley (emvlatakis@berkeley.edu); Princeton University (smweinberg@princeton.edu)
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper is theoretical and does not release source code for its methodology; the only code artifact is a Colab link in Appendix G.1 that verifies a specific theoretical example, not the general methodology.
Open Datasets | No | The paper is purely theoretical and does not use datasets, so no information about training data availability is provided.
Dataset Splits | No | The paper is purely theoretical and does not involve dataset splits for training, validation, or testing.
Hardware Specification | No | The paper is purely theoretical and does not conduct experiments requiring specific hardware.
Software Dependencies | No | The paper is purely theoretical and does not conduct experiments that would require detailing software dependencies with version numbers.
Experiment Setup | No | The paper is purely theoretical and does not describe experimental setup details such as hyperparameters or training configurations.
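
Since the paper provides no pseudocode, the following minimal sketch may help make the free-fall dynamic from the abstract concrete. It is illustrative only and not taken from the paper: the three-action instance (costs c, success probabilities p), the contract scalar α = 0.6, the switch time, and the choice of a Hedge (multiplicative-weights) learner with its learning rate are all assumptions; the principal's reward on success is normalized to 1.

```python
import numpy as np

# Hypothetical success/failure instance (illustrative, not from the paper):
# action i has cost c[i] and success probability p[i];
# the principal's reward on a success is normalized to 1.
c = np.array([0.0, 0.1, 0.3])
p = np.array([0.1, 0.5, 0.9])

T = 10_000             # time horizon
alpha = 0.6            # linear contract scalar offered in phase one
switch = int(0.7 * T)  # round at which the principal drops to the zero contract
eta = np.sqrt(np.log(len(c)) / T)  # Hedge learning rate

weights = np.ones(len(c))
principal_total = 0.0
agent_total = 0.0

for t in range(T):
    a = alpha if t < switch else 0.0  # the dynamic contract: alpha, then zero
    mix = weights / weights.sum()     # agent's current mixed action
    agent_u = a * p - c               # expected agent utility per action
    principal_total += mix @ ((1.0 - a) * p)  # principal keeps (1 - a) of reward
    agent_total += mix @ agent_u
    weights *= np.exp(eta * agent_u)  # multiplicative-weights (Hedge) update

print(f"principal avg utility per round: {principal_total / T:.3f}")
print(f"agent avg utility per round:     {agent_total / T:.3f}")
```

In the second phase the agent's utilities become -c, so the weights drift only gradually toward the zero-cost action; for many rounds the agent keeps playing high-success actions while the principal pays nothing, which is the free-fall effect the abstract describes.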