State-Constrained Zero-Sum Differential Games with One-Sided Information
Authors: Mukesh Ghimire, Lei Zhang, Zhe Xu, Yi Ren
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We use a simplified football game to demonstrate the utility of this work, where we reveal player positions and belief states in which the attacker should (or should not) play specific random deceptive moves to take advantage of information asymmetry, and compute how the defender should respond. In Sec. 6, we solve an 8D man-to-man matchup game and reveal player positions in which the attacker can take advantage of information asymmetry by playing specific deceptive moves, and derive the defender's best response in the absence of information. See Fig. 2. |
| Researcher Affiliation | Academia | 1Department of Mechanical and Aerospace Engineering, Arizona State University, Tempe, AZ, USA. Correspondence to: Yi Ren <yiren@asu.edu>. |
| Pseudocode | Yes | Alg. 1 summarizes the value approximation algorithm. |
| Open Source Code | Yes | The code for the implementation is available at https://github.com/ghimiremukesh/OSIIG. |
| Open Datasets | No | The paper describes using a "simplified football game" and sampling states from Q(t) and P to generate data for training. It does not refer to a publicly available, established dataset. |
| Dataset Splits | No | The paper mentions training data for value approximation but does not explicitly specify train/validation/test dataset splits with percentages, counts, or references to predefined splits. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, or memory specifications used for running the experiments. It mentions using "56 CPU cores" for parallel computation but no specific CPU model. |
| Software Dependencies | No | The paper mentions software components like "Physics-Informed Neural Network (PINN)", "Adam optimizer", and general references to "Python" or "PyTorch" through code availability, but it does not specify any version numbers for these software dependencies. |
| Experiment Setup | Yes | The value network uses a PICNN with 5 hidden layers of 256 neurons each and has 9-dimensional inputs (state and belief). We train 10 separate networks for each time step starting from t = 0.9 with τ = 0.1, each being trained for 10 epochs. For each epoch, S(t) includes 5000 states sampled from Q(t). The PINN utilizes a fully-connected network with 3 hidden layers, each comprising 512 neurons with sin activation function. The network adopts the Adam optimizer with a fixed learning rate of 2 × 10⁻⁵. |
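The PINN architecture quoted above (3 hidden layers of 512 neurons, sin activations, 9-dimensional input, Adam with a fixed learning rate of 2 × 10⁻⁵) can be sketched in PyTorch as follows. This is a minimal illustration, not the authors' released implementation (see their GitHub repository for that); the class names, output dimension, and weight initialization are assumptions.

```python
import torch
import torch.nn as nn


class Sine(nn.Module):
    """Sine activation, as described for the paper's PINN."""
    def forward(self, x):
        return torch.sin(x)


class ValuePINN(nn.Module):
    """Fully-connected network: 3 hidden layers x 512 neurons, sin activations,
    9-D input (state + belief). Output dimension of 1 (a scalar value) is an
    assumption; the paper specifies only depth, width, activation, and input size."""
    def __init__(self, in_dim: int = 9, hidden: int = 512, out_dim: int = 1):
        super().__init__()
        layers = []
        dims = [in_dim, hidden, hidden, hidden]
        for a, b in zip(dims[:-1], dims[1:]):
            layers += [nn.Linear(a, b), Sine()]
        layers.append(nn.Linear(hidden, out_dim))  # linear output head
        self.net = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


model = ValuePINN()
# Fixed learning rate of 2e-5, as quoted from the paper.
optimizer = torch.optim.Adam(model.parameters(), lr=2e-5)
```

A forward pass on a batch of 9-dimensional inputs, e.g. `model(torch.zeros(5000, 9))` for the 5000 sampled states per epoch, returns one value estimate per state.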