Regret Minimization in Stackelberg Games with Side Information
Authors: Keegan Harris, Zhiwei Steven Wu, Maria-Florina Balcan
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically evaluate the performance of Algorithm 1 and Algorithm 2 on synthetically-generated data. |
| Researcher Affiliation | Academia | Keegan Harris, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, keeganh@cs.cmu.edu; Zhiwei Steven Wu, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, zhiweiw@cs.cmu.edu; Maria-Florina Balcan, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, ninamf@cs.cmu.edu |
| Pseudocode | Yes | Algorithm 1: Learning with stochastic follower types: full feedback; Algorithm 2: Learning with stochastic contexts: full feedback; Algorithm 3: Learning with stochastic follower types: bandit feedback; Algorithm 4: Learning with stochastic contexts: bandit feedback |
| Open Source Code | No | The paper does not provide an explicit statement about open-sourcing code or a link to a code repository. |
| Open Datasets | No | We empirically evaluate the performance of Algorithm 1 and Algorithm 2 on synthetically-generated data. We consider a setup in which K = 5, A = Af = 3, and the context dimension d = 3. Utility functions are linear in both the context and player actions, and are sampled u.a.r. from [-1, 1]^{3×3×3}. |
| Dataset Splits | No | The paper describes generating synthetic data for simulations, but does not specify explicit train/validation/test dataset splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the simulations. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., 'Python 3.8, PyTorch 1.9') used for the experiments. |
| Experiment Setup | Yes | We consider a setup in which K = 5, A = Af = 3, and the context dimension d = 3. Utility functions are linear in both the context and player actions, and are sampled u.a.r. from [-1, 1]^{3×3×3}. We simulate non-stochastic context arrivals in Figure 2a by displaying the same context for T/4 time-steps in a row. Follower types are chosen u.a.r. from the five follower types. In Figure 2b, contexts are generated stochastically by sampling each component u.a.r. from [-1, 1]. Followers are chosen non-stochastically by deterministically cycling over the five types. In Figure 2c, both contexts and follower types are chosen stochastically. Specifically, contexts are generated as in Figure 2b and follower types are generated as in Figure 2a. |
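Since the paper does not release code, the experiment setup above can be sketched directly. The following is a minimal, hypothetical reconstruction of the synthetic-data generation (not the authors' implementation): K = 5 follower types, A = Af = 3 actions, context dimension d = 3, linear utilities sampled u.a.r. from [-1, 1]^{3×3×3}, with both the stochastic and non-stochastic context/type arrival patterns described for Figures 2a-2c. The helper name `utility` and the horizon `T = 1000` are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Setup reported in the paper: K follower types, A leader / A_f follower
# actions, context dimension d. Horizon T is an assumed value.
K, A, A_f, d, T = 5, 3, 3, 3, 1000

# One linear utility tensor per follower type, each entry sampled
# u.a.r. from [-1, 1] over a d x A x A_f grid.
utilities = rng.uniform(-1, 1, size=(K, d, A, A_f))

def utility(k, context, leader_action, follower_action):
    """Utility linear in the context for follower type k (hypothetical helper)."""
    return context @ utilities[k, :, leader_action, follower_action]

# Stochastic contexts (Figure 2b): each component drawn u.a.r. from [-1, 1].
stochastic_contexts = rng.uniform(-1, 1, size=(T, d))

# Non-stochastic contexts (Figure 2a): the same context shown for
# T/4 time-steps in a row.
blocks = rng.uniform(-1, 1, size=(4, d))
nonstochastic_contexts = np.repeat(blocks, T // 4, axis=0)

# Follower types: u.a.r. over the K types (Figure 2a) versus
# deterministic cycling over the K types (Figure 2b).
stochastic_types = rng.integers(0, K, size=T)
cycling_types = np.arange(T) % K
```

With utilities and contexts both bounded in [-1, 1] and d = 3, every realized utility lies in [-3, 3], which keeps per-round regret bounded in the simulations.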