Privacy-Preserving Policy Iteration for Decentralized POMDPs

Authors: Feng Wu, Shlomo Zilberstein, Xiaoping Chen

AAAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experimental results on several common Dec-POMDP benchmark problems confirm the effectiveness of our approach. We implemented our algorithm and tested it on 6 common benchmark problems [2] for Dec-POMDPs (i.e., Dec-Tiger, Broadcast Channel, Meeting in a 3×3 Grid, Box Pushing, Recycling Robots, and Mars Rovers). For each problem instance, we first ran our algorithm to generate policies and then evaluated the policies by simulation.
Researcher Affiliation | Academia | Feng Wu, Shlomo Zilberstein, Xiaoping Chen. School of Computer Science and Technology, University of Science and Technology of China, CHN; College of Information and Computer Sciences, University of Massachusetts Amherst, USA. wufeng02@ustc.edu.cn, shlomo@cs.umass.edu, xpchen@ustc.edu.cn
Pseudocode | Yes | Algorithm 1: Secure Value Estimation. Algorithm 2: Secure Policy Improvement.
Open Source Code | No | The paper does not provide any explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | We implemented our algorithm and tested it on 6 common benchmark problems [2] for Dec-POMDPs (i.e., Dec-Tiger, Broadcast Channel, Meeting in a 3×3 Grid, Box Pushing, Recycling Robots, and Mars Rovers). ... [2] http://masplan.org/problem domains
Dataset Splits | No | The paper mentions running 'trials' and selecting 'Nb samples' but does not specify explicit training/validation/test dataset splits, percentages, or sample counts.
Hardware Specification | No | The paper mentions 'a machine with 40 cores' but does not provide specific hardware details such as CPU models, GPU types, or memory specifications.
Software Dependencies | No | The paper mentions 'the common Java implementation of the Paillier cryptosystem' but does not provide specific version numbers for Java or the cryptosystem library.
Experiment Setup | Yes | In the algorithm, we set the number of trials N = 1000 and the size of best trials Nb = 10.
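
The Software Dependencies entry names the Paillier cryptosystem. As background only, the following is a minimal toy sketch of the property that usually motivates its use in privacy-preserving computation: multiplying two ciphertexts yields an encryption of the sum of the plaintexts. It is not the paper's Java implementation and does not reproduce Algorithm 1 or Algorithm 2. The function names (`keygen`, `encrypt`, `decrypt`, `add_encrypted`) and the tiny primes are assumptions chosen for illustration, and the key size is far too small to be secure. The modular inverse via `pow(x, -1, n)` assumes Python 3.8 or later.

```python
import math
import random


def keygen(p, q):
    """Toy Paillier key generation from two distinct primes (illustrative only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                                          # standard simple choice of g
    n2 = n * n
    # mu is the inverse of L(g^lam mod n^2) modulo n, where L(x) = (x - 1) // n
    mu = pow((pow(g, lam, n2) - 1) // n, -1, n)
    return (n, g), (lam, mu)


def encrypt(pub, m, rng=random):
    """Encrypt integer m (0 <= m < n) with fresh randomness r coprime to n."""
    n, g = pub
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c):
    """Recover the plaintext from ciphertext c."""
    n, _ = pub
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n) * mu % n


def add_encrypted(pub, c1, c2):
    """Homomorphic addition: the product of ciphertexts encrypts m1 + m2 (mod n)."""
    n, _ = pub
    return (c1 * c2) % (n * n)


if __name__ == "__main__":
    # Hypothetical small primes; real deployments use primes of 1024+ bits.
    pub, priv = keygen(1009, 1013)
    c_a = encrypt(pub, 37)
    c_b = encrypt(pub, 5)
    print(decrypt(pub, priv, add_encrypted(pub, c_a, c_b)))
```

In this sketch a party holding only the public key can combine encrypted values without learning them, and only the holder of the private key can read the sum.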