Learning Others' Intentional Models in Multi-Agent Settings Using Interactive POMDPs
Authors: Yanlin Han, Piotr Gmytrasiewicz
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our algorithm on the multi-agent tiger problem [7] and UAV reconnaissance problem [2]. The multi-agent tiger game is a generalization of the classic single-agent tiger game [11]. It contains additional observations caused by others' actions, and the transition and reward functions involve others' actions as well. The UAV reconnaissance problem contains a 3x3 grid in which the agent (UAV) tries to capture the moving target [2]. In Figure 2, we see that the intentional I-POMDP approaches have significantly higher reward as agent i perceives more observations, and level-2 I-POMDP performs slightly better than level-1, while level-3 has high variance but at least competes with level-2. The subintentional approach has certain learning ability but is not sophisticated enough to model a rational (level-2 intentional I-POMDP) agent; therefore its performance is worse than all I-POMDP models. |
| Researcher Affiliation | Academia | Yanlin Han Piotr Gmytrasiewicz Department of Computer Science University of Illinois at Chicago Chicago, IL 60607 {yhan37,piotr}@uic.edu |
| Pseudocode | Yes | Algorithm 1: Interactive Belief Update: $b^{t}_{k,l} = \mathrm{InteractiveBeliefUpdate}(b^{t-1}_{k,l}, a^{t-1}_{k}, o^{t}_{k}, l > 0)$. Algorithm 2: Level-0 Belief Update: $b^{t}_{k,0} = \mathrm{Level0BeliefUpdate}(\theta^{t-1}_{k,0}, a^{t-1}_{k}, o^{t}_{k})$. |
| Open Source Code | No | The paper does not include an unambiguous statement about releasing code or a direct link to a code repository for the described methodology. |
| Open Datasets | No | The paper mentions 'multi-agent tiger problem [7]' and 'UAV reconnaissance problem [2]' as evaluation environments, but does not provide concrete access information (link, DOI, repository, formal citation with authors/year for a publicly available or open dataset) for them as datasets. |
| Dataset Splits | No | The paper does not specify exact dataset split percentages or absolute sample counts for training, validation, or testing. |
| Hardware Specification | No | The paper states 'The computing machine has an Intel Core i5 2GHz, 8GB RAM, and runs mac OS 10.13 and MATLAB R2017.' While this names an Intel Core i5, it does not specify the exact model (e.g., i5-xxxx) or generation, which would be needed for full reproducibility. It also does not mention other components such as GPUs. |
| Software Dependencies | Yes | The computing machine has an Intel Core i5 2GHz, 8GB RAM, and runs mac OS 10.13 and MATLAB R2017. |
| Experiment Setup | No | The paper does not provide specific experimental setup details such as concrete hyperparameter values, batch sizes, learning rates, or other training configurations. It states 'We firstly fix the modeled agent j to be a level-2 I-POMDP agent and experiment with different modeling approaches for agent i', but lacks numerical details for reproduction. |
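
To illustrate what a level-0 belief update computes in the simplest case, here is a minimal Python sketch of an exact Bayes filter for the classic single-agent tiger game. It is not the paper's Algorithm 2, and it does not model the other agent. The state and observation names, the function names, and the 0.85 listening accuracy are assumptions chosen for illustration, since the table above does not report the values used in the experiments.

```python
from typing import Dict

# Hypothetical setup for the classic single-agent tiger game.
# These names and values are illustrative assumptions, not taken from the paper.
STATES = ("tiger-left", "tiger-right")
LISTEN_ACCURACY = 0.85  # assumed chance that a growl is heard on the correct side


def observation_prob(obs: str, state: str) -> float:
    """Return P(obs | state) for a 'listen' action."""
    correct = (obs == "growl-left" and state == "tiger-left") or (
        obs == "growl-right" and state == "tiger-right"
    )
    return LISTEN_ACCURACY if correct else 1.0 - LISTEN_ACCURACY


def level0_belief_update(
    belief: Dict[str, float], action: str, obs: str
) -> Dict[str, float]:
    """Exact Bayes filter over physical states only (no model of other agents)."""
    if action != "listen":
        # Opening a door restarts the game: the tiger is re-placed uniformly.
        return {s: 1.0 / len(STATES) for s in STATES}
    # Listening leaves the tiger in place, so only the observation likelihood
    # reweights the prior belief.
    unnormalized = {s: observation_prob(obs, s) * belief[s] for s in STATES}
    total = sum(unnormalized.values())
    return {s: p / total for s, p in unnormalized.items()}


if __name__ == "__main__":
    b = {"tiger-left": 0.5, "tiger-right": 0.5}
    b = level0_belief_update(b, "listen", "growl-left")
    print(b)
```

Starting from a uniform belief, each consistent growl shifts probability toward the matching door, and any door-opening action resets the belief to uniform under these assumptions.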