Understanding Model Selection for Learning in Strategic Environments

Authors: Tinashe Handina, Eric Mazumdar

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show through simple theoretical models, illustrative examples, and experiments that strategic interactions can yield a non-trivial relationship between model class expressivity and equilibrium performance. In particular, we show how, even in highly structured regimes in which one has full access to the underlying data distribution, strategic interactions can result in a Braess paradox-like phenomenon: the larger and more expressive the model class a learner optimizes over, the lower their performance at equilibrium. To make this result concrete, we give examples of strategic regression, strategic classification, and MARL in which reverse scaling occurs.
Researcher Affiliation | Academia | Tinashe Handina, Computing + Mathematical Sciences, California Institute of Technology, Pasadena, CA 91125, thandina@caltech.edu; Eric Mazumdar, Computing + Mathematical Sciences, California Institute of Technology, Pasadena, CA 91125, mazumdar@caltech.edu
Pseudocode | Yes | Algorithm 1: Stochastic gradient descent to find a Nash equilibrium in a strongly monotone game; Algorithm 2: Successive Elimination for Best Design Action Identification (illustrative sketches of both appear after this table).
Open Source Code | No | Our contributions are primarily theoretical in nature. The experimental evaluations provided in this paper do not rely on private datasets and can be easily reproduced with the provided settings and parameters. The answer NA means that the paper does not include experiments requiring code.
Open Datasets | No | Example 1: Multi-Agent Reinforcement Learning. We first demonstrate an extreme form of the reverse scaling predicted by Theorem 3.4 in the context of multi-agent reinforcement learning. To do so, we construct a Markov game in which the more the learner restricts their policy class, the more their expected payoff increases. ... Example 2: Participation Dynamics. In this model, there is a base distribution P0 over the input-output space X × Y, where X is the feature space and Y is the output space.
Dataset Splits | No | The paper does not specify exact training/test/validation dataset splits or refer to predefined splits with citations for reproducibility.
Hardware Specification | No | We focus on theoretical contributions. The compute involved is neither sophisticated nor intensive.
Software Dependencies | No | The paper mentions 'Nash Q learning [50]' as a method, but does not provide specific version numbers for any software libraries, frameworks, or environments used in the experiments.
Experiment Setup | Yes | Example 1: Multi-Agent Reinforcement Learning... as their policy class is restricted to take the form π_l(s) = [p, 1 − p] in all states s for p ∈ [1 − p̄, p̄], for different discount factors (assumed to be the same for both players). ... Example 3: Strategic Linear Regression... For this example, the set C = {e ∈ R^d : e = k β/∥β∥ for k ∈ [−10, 10]}. (A sketch of this setup follows the table.)
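Since no code is released, the following is a minimal, self-contained sketch of the idea behind Algorithm 1: simultaneous stochastic gradient descent on each player's own loss in a strongly monotone game. The two-player quadratic game, the 1/t step-size schedule, and the Gaussian gradient noise below are illustrative assumptions rather than the paper's construction; the quadratic form is chosen so the unique Nash equilibrium can be checked in closed form.

```python
# Sketch of Algorithm 1's idea (not the authors' code): simultaneous stochastic
# gradient descent on each player's own loss in a strongly monotone game.
import numpy as np

rng = np.random.default_rng(0)
a, b = 2.0, 1.0              # a > 0 makes the game gradient strongly monotone
c = np.array([1.0, -0.5])    # linear terms in each player's loss

def game_gradient(x):
    """Stacked gradients of each player's loss w.r.t. their own action."""
    x1, x2 = x
    return np.array([a * x1 + b * x2 - c[0],
                     a * x2 - b * x1 - c[1]])

x = np.zeros(2)
for t in range(1, 20001):
    eta = 1.0 / (a * t)                                 # standard 1/t step size
    noisy_grad = game_gradient(x) + rng.normal(scale=0.1, size=2)
    x = x - eta * noisy_grad                            # simultaneous gradient step

# Closed-form Nash equilibrium: solve game_gradient(x*) = 0.
J = np.array([[a, b], [-b, a]])
x_star = np.linalg.solve(J, c)
print("SGD iterate:", x, " Nash equilibrium:", x_star)
```

Because the game Jacobian's symmetric part is a·I with a > 0, the stacked gradient map is strongly monotone, and the noisy simultaneous gradient iterates converge to the unique Nash equilibrium.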
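Algorithm 2 applies successive elimination, the standard best-arm-identification routine, to selecting the best "design action" (e.g., model class). Below is a generic sketch of that routine; the Bernoulli payoffs, the Hoeffding-style confidence radius, and the example payoff values are assumptions for illustration, not the paper's exact specification.

```python
# Generic successive-elimination sketch for best design action identification.
import numpy as np

def successive_elimination(pull, n_actions, delta=0.05, max_rounds=2000):
    """Sample every surviving action each round and eliminate any action whose
    upper confidence bound falls below the best lower confidence bound."""
    rng = np.random.default_rng(0)
    active = list(range(n_actions))
    means = np.zeros(n_actions)
    counts = np.zeros(n_actions)
    for t in range(1, max_rounds + 1):
        for a in active:                       # one fresh sample per surviving action
            r = pull(a, rng)
            counts[a] += 1
            means[a] += (r - means[a]) / counts[a]
        # Hoeffding-style confidence radius (an assumed form).
        rad = np.sqrt(np.log(4 * n_actions * t**2 / delta) / (2 * counts[active]))
        ucb = means[active] + rad
        lcb = means[active] - rad
        best_lcb = lcb.max()
        active = [a for a, u in zip(active, ucb) if u >= best_lcb]
        if len(active) == 1:
            break
    return max(active, key=lambda a: means[a])

# Example: three hypothetical design actions with different equilibrium payoffs.
payoffs = [0.2, 0.6, 0.4]
pull = lambda a, rng: rng.binomial(1, payoffs[a])
print("identified best design action:", successive_elimination(pull, 3))
```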
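To make the Example 3 setup concrete, here is a small sketch of strategic linear regression with the quoted effort set C: agents may shift their features only along the direction of the deployed regression vector β, with the shift magnitude capped at 10. The agent utility (linear gain in predicted score minus a quadratic effort cost) and the synthetic evaluation data are illustrative assumptions rather than the paper's exact model.

```python
# Sketch of the strategic linear regression setup with effort set
# C = {e in R^d : e = k * beta/||beta|| for k in [-10, 10]} (assumptions noted above).
import numpy as np

def best_response_effort(beta, cost=1.0, k_max=10.0):
    """Agent's utility-maximising effort along beta/||beta||, assuming a gain of
    k*||beta|| in predicted score and a quadratic effort cost of cost*k^2/2."""
    gain_per_unit = np.linalg.norm(beta)
    k = np.clip(gain_per_unit / cost, -k_max, k_max)
    return k * beta / np.linalg.norm(beta)

def equilibrium_risk(beta, X, y, cost=1.0):
    """Mean squared error once agents apply their best-response feature shift."""
    e = best_response_effort(beta, cost)
    X_shifted = X + e            # every agent applies the same shift direction
    return np.mean((X_shifted @ beta - y) ** 2)

rng = np.random.default_rng(1)
d, n = 3, 500
beta_true = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(n, d))
y = X @ beta_true + rng.normal(scale=0.1, size=n)

# Risk of the non-strategic least-squares fit once agents respond to it.
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
print("strategic risk of OLS fit:", equilibrium_risk(beta_ols, X, y))
```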