Integrated Cooperation and Competition in Multi-Agent Decision-Making

Authors: Kyle Wray, Akshat Kumar, Shlomo Zilberstein

AAAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments consider two GD-CCPs which both uniquely combine the spirit of the standard cooperative domain Meeting in a Grid (Amato, Bernstein, and Zilberstein 2010) with the spirit of two competitive domains: Battle of the Sexes and Prisoner's Dilemma (Fudenberg and Tirole 1991). We evaluate our approximate CCP algorithm in simulation with three different amounts of controller nodes with |Qi| ∈ {2, 4, 6} for each agent i in Figure 2. In each scenario, we evaluate the average discounted reward (ADR) vs. the allotted slack (δ). The ADR averages over 1000 trials for each scenario (i.e., each point). The standard error is provided as error bars. We implemented the Prisoner Meeting and Battle Meeting domains on two real robot platforms (Figure 3).
Researcher Affiliation | Academia | Kyle Hollins Wray (1), Akshat Kumar (2), Shlomo Zilberstein (1); (1) College of Information and Computer Sciences, University of Massachusetts, Amherst, MA, USA; (2) School of Information Systems, Singapore Management University, Singapore
Pseudocode | Yes | Algorithm 1 presents a scalable FSC solution to CCPs that assumes a given tuple of fixed-size FSC nodes Q. ... Algorithm 1 Approximate FSC Solution to GD-CCP
Open Source Code | No | Towards this goal, we will provide our source code to support further development of models that generalize cooperation and competition under a unified approach.
Open Datasets | No | Our experiments consider two GD-CCPs which both uniquely combine the spirit of the standard cooperative domain Meeting in a Grid (Amato, Bernstein, and Zilberstein 2010) with the spirit of two competitive domains: Battle of the Sexes and Prisoner's Dilemma (Fudenberg and Tirole 1991). We consider two novel domains called Battle Meeting and Prisoner Meeting. In both, there are two agents I = {1, 2} and the state space is S = S1 × S2 with Si = {top left, top right, bottom left, bottom right}. It has action space A = A1 × A2 with Ai = {none, north, south, east, west} and observation space Ω = {no bump, bump}. The state transitions T are defined as found in Figure 1.
Dataset Splits | No | We evaluate our approximate CCP algorithm in simulation with three different amounts of controller nodes with |Qi| ∈ {2, 4, 6} for each agent i in Figure 2. In each scenario, we evaluate the average discounted reward (ADR) vs. the allotted slack (δ). The ADR averages over 1000 trials for each scenario (i.e., each point).
Hardware Specification | No | We solve the NLPs and CCLPs in Tables 1 and 2 using the NEOS Server (Czyzyk, Mesnier, and Moré 1998) running SNOPT (Gill, Murray, and Saunders 2005). We implemented the Prisoner Meeting and Battle Meeting domains on two real robot platforms (Figure 3).
Software Dependencies | No | We solve the NLPs and CCLPs in Tables 1 and 2 using the NEOS Server (Czyzyk, Mesnier, and Moré 1998) running SNOPT (Gill, Murray, and Saunders 2005).
Experiment Setup | Yes | We evaluate our approximate CCP algorithm in simulation with three different amounts of controller nodes with |Qi| ∈ {2, 4, 6} for each agent i in Figure 2. All other normal cases terminate when max_j d_j < ϵ = 0.01 as in Algorithm 1. We allowed for a maximum of 50 iterations, occasionally causing early termination of the best response dynamics. Lastly, we have a discount factor of γ = 0.95.
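
The Open Datasets row describes the shared structure of the Battle Meeting and Prisoner Meeting domains. Below is a minimal sketch of that structure, assuming an itertools-based construction of the joint spaces; the transition function T and the rewards are defined in Figure 1 of the paper and are not reproduced here, so the transition stub is only a placeholder.

```python
# Sketch of the Battle Meeting / Prisoner Meeting domain spaces (assumed naming).
from itertools import product

AGENTS = (1, 2)
CELLS = ("top left", "top right", "bottom left", "bottom right")  # S_i per agent
ACTIONS = ("none", "north", "south", "east", "west")              # A_i per agent
OBSERVATIONS = ("no bump", "bump")                                # Omega

# Joint state and action spaces are Cartesian products over the two agents.
JOINT_STATES = list(product(CELLS, repeat=len(AGENTS)))    # S = S1 x S2
JOINT_ACTIONS = list(product(ACTIONS, repeat=len(AGENTS))) # A = A1 x A2

def transition(joint_state, joint_action):
    """Placeholder for T as defined in Figure 1 of the paper (not reproduced here)."""
    raise NotImplementedError
```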
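The Pseudocode and Experiment Setup rows mention best response dynamics that stop when the largest per-agent improvement falls below ϵ = 0.01 or after 50 iterations. The sketch below shows only that outer loop under those stated settings; the `best_response` callable stands in for the paper's NLP/CCLP subproblem (solved with SNOPT via NEOS) and is an assumption, not the authors' Algorithm 1.

```python
# Hedged sketch of a best-response outer loop with the paper's stated termination settings.
def solve_ccp(controllers, best_response, eps=0.01, max_iters=50):
    """controllers: list of per-agent FSCs; best_response: assumed subproblem solver."""
    for _ in range(max_iters):
        deltas = []
        for i, fsc in enumerate(controllers):
            new_fsc, d_i = best_response(i, fsc, controllers)  # improvement d_i for agent i
            controllers[i] = new_fsc
            deltas.append(d_i)
        if max(deltas) < eps:  # terminate when max_j d_j < eps
            break
    return controllers
```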
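The Dataset Splits and Experiment Setup rows state that each reported point is the average discounted reward over 1000 trials with γ = 0.95, with standard error shown as error bars. This is a small sketch of that estimate; `simulate_trial`, which returns one trial's per-step reward sequence, is an assumed helper.

```python
# Sketch of the ADR estimate: mean discounted return over repeated trials plus its standard error.
import math
import statistics

def adr(simulate_trial, trials=1000, gamma=0.95):
    returns = []
    for _ in range(trials):
        rewards = simulate_trial()  # assumed: list of per-step rewards for one trial
        returns.append(sum(gamma ** t * r for t, r in enumerate(rewards)))
    mean = statistics.mean(returns)
    stderr = statistics.stdev(returns) / math.sqrt(trials)
    return mean, stderr
```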