Payoff-based Learning with Matrix Multiplicative Weights in Quantum Games

Authors: Kyriakos Lotidis, Panayotis Mertikopoulos, Nicholas Bambos, Jose Blanchet

NeurIPS 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | In this last section, we provide numerical simulations to validate and explore the performance of (MMW) with payoff-based feedback. Additional experiments can be found in Appendix E. |
| Researcher Affiliation | Academia | Kyriakos Lotidis, Stanford University (klotidis@stanford.edu); Panayotis Mertikopoulos, Univ. Grenoble Alpes, CNRS, Inria, Grenoble INP, LIG, 38000 Grenoble, France & Archimedes RU, NKUA (panayotis.mertikopoulos@imag.fr); Nicholas Bambos, Stanford University (bambos@stanford.edu); Jose Blanchet, Stanford University (jose.blanchet@stanford.edu) |
| Pseudocode | Yes | Algorithm 1: MMW with bandit feedback. A hedged sketch of such an update is given after this table. |
| Open Source Code | No | The paper does not contain any explicit statement about providing open-source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | No | The paper describes a simulated game setup: 'Our testbed is a two-player zero-sum quantum game, which is the quantum analogue of a 2×2 min-max game with actions {a1, a2} and {b1, b2}, and payoff matrix P='. This is a custom-defined game environment, not a public dataset with explicit access information. See the game-construction sketch after this table. |
| Dataset Splits | No | The paper describes an online learning process in a game-theoretic setting. It does not mention or use train/validation/test splits in the conventional machine-learning sense, as it does not evaluate models on static datasets. |
| Hardware Specification | No | The paper does not provide any details about the hardware used to run the numerical experiments (e.g., CPU, GPU, or memory specifications). |
| Software Dependencies | No | The paper does not list software dependencies with version numbers (e.g., a Python version, or library versions such as PyTorch, TensorFlow, or specific solvers). |
| Experiment Setup | Yes | All runs for the three methods were initialized at Y = 0, and we used γ = 10⁻² for all methods. In particular, for (3MW) with gradient estimates given by the (2PE) estimator, we used a sampling radius δ = 10⁻², and for (3MW) with the (1PE) estimator, we used δ = 10⁻¹ (in line with the theoretical results, which suggest a tighter sampling radius when mixed payoff information is available to the players). A run-configuration sketch also follows this table. |
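
Since the paper reports pseudocode (Algorithm 1) but no public implementation, the following is a minimal sketch of a matrix multiplicative weights update driven by a single observed payoff, in the spirit of the bandit-feedback scheme described above. The score-to-state map via the matrix exponential is standard; the names `mmw_bandit_step` and `payoff_oracle`, the perturbation scheme, and the estimator's scaling constant are our assumptions, not the authors' exact construction.

```python
import numpy as np
from scipy.linalg import expm

def mmw_density(Y):
    """Quantum logit map: score matrix Y -> density matrix exp(Y)/tr(exp(Y))."""
    X = expm(Y)
    return X / np.trace(X)

def mmw_bandit_step(Y, payoff_oracle, gamma=1e-2, delta=1e-1, rng=None):
    """One MMW step with a one-point (1PE-style) payoff-based gradient estimate.

    `payoff_oracle(X)` returns the scalar payoff of playing the matrix X.
    The perturbation below preserves the trace but glosses over the
    positivity adjustment a full implementation would need.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = Y.shape[0]
    # Sample a random Hermitian query direction, normalized in Frobenius norm.
    Z = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    Z = (Z + Z.conj().T) / 2
    Z /= np.linalg.norm(Z)
    # Play a delta-perturbation of the current state and observe one payoff.
    X = mmw_density(Y)
    u = payoff_oracle(X + delta * (Z - np.trace(Z).real * X))
    # One-point gradient estimate; the scaling constant depends on the
    # estimator's normalization.
    V_hat = (d / delta) * u * Z
    return Y + gamma * V_hat
```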
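
The testbed game itself is only specified up to the (elided) payoff matrix P. As an illustration of how a 2×2 min-max game lifts to a quantum game over density matrices, the sketch below uses a hypothetical matching-pennies-style P; the observable W and the bilinear payoff tr[W (X1 ⊗ X2)] are one standard embedding of the classical game, not necessarily the paper's exact construction.

```python
import numpy as np

# Hypothetical stand-in for the paper's (elided) 2x2 payoff matrix P.
P = np.array([[ 1.0, -1.0],
              [-1.0,  1.0]])

# Embed the classical game in an observable on the two-qubit space:
# W = sum_ij P[i, j] |i><i| (x) |j><j|, so that diagonal (classical)
# strategies recover the original min-max game exactly.
E = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]  # projectors |0><0|, |1><1|
W = sum(P[i, j] * np.kron(E[i], E[j]) for i in range(2) for j in range(2))

def payoff(X1, X2):
    """Player 1's payoff tr[W (X1 (x) X2)]; player 2 receives the negative."""
    return np.real(np.trace(W @ np.kron(X1, X2)))
```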
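
Finally, the reported hyperparameters translate directly into a run configuration. The sketch below collects them in one place; the method labels and the assumption that the third method is full-information MMW (with no sampling radius) are ours.

```python
import numpy as np

d = 2  # per-player dimension in the 2x2 quantum testbed

# Reported settings: Y initialized at 0, step size gamma = 10^-2 for all
# methods, and estimator-specific sampling radii delta.
configs = {
    "MMW_full_info": {"gamma": 1e-2, "delta": None},  # exact gradients (assumed)
    "3MW_2PE":       {"gamma": 1e-2, "delta": 1e-2},  # two-point estimator
    "3MW_1PE":       {"gamma": 1e-2, "delta": 1e-1},  # one-point estimator
}
Y0 = np.zeros((d, d), dtype=complex)  # common initialization Y = 0
```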