Regret Minimization in Stackelberg Games with Side Information

Authors: Keegan Harris, Zhiwei Steven Wu, Maria-Florina Balcan

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically evaluate the performance of Algorithm 1 and Algorithm 2 on synthetically-generated data.
Researcher Affiliation | Academia | Keegan Harris, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, keeganh@cs.cmu.edu; Zhiwei Steven Wu, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, zhiweiw@cs.cmu.edu; Maria-Florina Balcan, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, ninamf@cs.cmu.edu
Pseudocode | Yes | Algorithm 1: Learning with stochastic follower types, full feedback; Algorithm 2: Learning with stochastic contexts, full feedback; Algorithm 3: Learning with stochastic follower types, bandit feedback; Algorithm 4: Learning with stochastic contexts, bandit feedback
Open Source Code | No | The paper does not provide an explicit statement about open-sourcing code or a link to a code repository.
Open Datasets | No | We empirically evaluate the performance of Algorithm 1 and Algorithm 2 on synthetically-generated data. We consider a setup in which K = 5, A = A_f = 3, and the context dimension d = 3. Utility functions are linear in both the context and player actions, and are sampled u.a.r. from [-1, 1]^{3×3×3}.
Dataset Splits | No | The paper describes generating synthetic data for simulations, but does not specify explicit train/validation/test dataset splits.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the simulations.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., 'Python 3.8, PyTorch 1.9') used for the experiments.
Experiment Setup | Yes | We consider a setup in which K = 5, A = A_f = 3, and the context dimension d = 3. Utility functions are linear in both the context and the player actions, and are sampled u.a.r. from [-1, 1]^{3×3×3} (see the utility-sampling sketch after the table). We simulate non-stochastic context arrivals in Figure 2a by displaying the same context for T/4 time-steps in a row; follower types are chosen u.a.r. from the five follower types. In Figure 2b, contexts are generated stochastically by sampling each component u.a.r. from [-1, 1], and followers are chosen non-stochastically by deterministically cycling over the five types. In Figure 2c, both contexts and follower types are chosen stochastically: contexts are generated as in Figure 2b and follower types as in Figure 2a. (A sketch of these three arrival regimes also follows the table.)
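
Since the paper releases no code, the following is a minimal Python sketch of how the synthetic utilities described above might be instantiated. It assumes "linear in both the context and player actions" means each (leader action, follower action) pair indexes a weight vector over the d = 3 context components, so each player's utility is parameterized by a tensor sampled u.a.r. from [-1, 1]^{3×3×3}; all names (leader_U, follower_U, utility) are hypothetical, not the paper's.

import numpy as np

rng = np.random.default_rng(0)

K = 5    # number of follower types
A = 3    # number of leader actions
A_F = 3  # number of follower actions
D = 3    # context dimension

# Assumed parameterization: one (D, A, A_F) tensor per player, entries u.a.r.
# in [-1, 1]; one such tensor per follower type for the followers.
leader_U = rng.uniform(-1.0, 1.0, size=(D, A, A_F))
follower_U = [rng.uniform(-1.0, 1.0, size=(D, A, A_F)) for _ in range(K)]

def utility(U: np.ndarray, z: np.ndarray, a: int, a_f: int) -> float:
    """Utility linear in the context z for the action pair (a, a_f)."""
    return float(U[:, a, a_f] @ z)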
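
Likewise, a hedged sketch of the three context/follower arrival regimes from the Experiment Setup row (Figures 2a-2c). The function names are our own, and the horizon T = 1000 is an arbitrary choice for illustration, as the paper's value is not quoted in this section.

T = 1000  # hypothetical horizon, assumed divisible by 4

def contexts_nonstochastic(T, D):
    """Figure 2a: display the same context for T/4 time-steps in a row."""
    blocks = [np.repeat(rng.uniform(-1, 1, size=(1, D)), T // 4, axis=0)
              for _ in range(4)]
    return np.vstack(blocks)

def contexts_stochastic(T, D):
    """Figures 2b/2c: each context component sampled u.a.r. from [-1, 1]."""
    return rng.uniform(-1, 1, size=(T, D))

def followers_stochastic(T, K):
    """Figures 2a/2c: follower type chosen u.a.r. from the K types each round."""
    return rng.integers(0, K, size=T)

def followers_cycling(T, K):
    """Figure 2b: deterministically cycle over the K follower types."""
    return np.arange(T) % K

# Example, Figure 2c regime: stochastic contexts and stochastic follower types.
Z, F = contexts_stochastic(T, D), followers_stochastic(T, K)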