Multi-agent active perception with prediction rewards
Authors: Mikko Lauri, Frans A. Oliehoek
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the empirical usefulness of our results by applying a standard Dec-POMDP algorithm to multi-agent active perception problems, showing increased scalability in the planning horizon. |
| Researcher Affiliation | Academia | Mikko Lauri, Department of Computer Science, Universität Hamburg, Hamburg, Germany (lauri@informatik.uni-hamburg.de); Frans A. Oliehoek, Department of Computer Science, TU Delft, Delft, the Netherlands (f.a.oliehoek@tudelft.nl) |
| Pseudocode | Yes | The pseudocode for APAS is shown in Algorithm 1. |
| Open Source Code | Yes | A reference implementation is available at https://github.com/laurimi/multiagent-prediction-reward. |
| Open Datasets | No | The paper mentions evaluating on "the Dec-ρPOMDP domains from [11]: the micro air vehicle (MAV) domain and information gathering rovers domain." However, it does not provide concrete access information (e.g., URL, DOI, or a formal citation for the dataset itself) for these domains, which are described as problem setups rather than downloadable datasets. |
| Dataset Splits | No | The paper mentions running "100 repetitions" and reporting "average policy value and its standard error" but does not specify any training, validation, or test dataset splits. The problem domains are typically reinforcement learning environments, not datasets with explicit splits. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python version, specific library versions, or solver versions). |
| Experiment Setup | Yes | For the MAV problem, we use K = 2, and for the rovers problem K = 5 individual prediction actions. For APAS, we initialize Γ by randomly sampling linearization points b_k ∈ Δ(S), k = 1, ..., K using [23]. The corresponding α-vectors for the negative entropy are α_k(s) = ln b_k(s). We run 100 repetitions using APAS and report the average policy value and its standard error. |
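
The Experiment Setup row describes initializing the α-vector set Γ from randomly sampled linearization points, with α_k(s) = ln b_k(s) acting as tangent hyperplanes of the negative-entropy prediction reward. The sketch below only illustrates that construction; it is not the authors' reference implementation (linked above), and the function names, the Dirichlet sampling of beliefs, and the numerical clipping are assumptions made for the example.

```python
import numpy as np

def sample_linearization_points(num_states, K, rng):
    # Hypothetical stand-in for the sampling scheme cited as [23]:
    # draw K random beliefs b_k over the state space.
    return rng.dirichlet(np.ones(num_states), size=K)

def alpha_vectors(linearization_points, eps=1e-12):
    # alpha_k(s) = ln b_k(s): tangent hyperplane of the negative entropy
    # at linearization point b_k (clipped to avoid log(0)).
    return np.log(np.clip(linearization_points, eps, None))

def negative_entropy_lower_bound(belief, alphas):
    # Piecewise-linear lower bound max_k <b, alpha_k> on -H(b);
    # by Gibbs' inequality it never exceeds the true negative entropy
    # and is tight whenever b coincides with one of the b_k.
    return np.max(alphas @ belief)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    num_states, K = 4, 5  # K = 5 matches the rovers setup quoted above
    b_k = sample_linearization_points(num_states, K, rng)
    alphas = alpha_vectors(b_k)

    b = rng.dirichlet(np.ones(num_states))
    exact = float(np.sum(b * np.log(b)))                     # -H(b)
    approx = float(negative_entropy_lower_bound(b, alphas))  # <= -H(b)
    print(f"-H(b) = {exact:.4f}, piecewise-linear bound = {approx:.4f}")
```

A larger K gives a tighter piecewise-linear approximation of the negative entropy, which is consistent with the paper quoting specific K values for the MAV and rovers problems.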