Regret-Based Multi-Agent Coordination with Uncertain Task Rewards

Authors: Feng Wu, Nicholas Jennings

AAAI 2014 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In experiments, we show that our method can scale up to task allocation domains with hundreds of agents and tasks (intractable for centralized methods) and can outperform state-of-the-art decentralized approaches by having much higher values and lower regrets."
Researcher Affiliation | Academia | Feng Wu, School of Electronics and Computer Science, University of Southampton, United Kingdom (fw6e11@ecs.soton.ac.uk); Nicholas R. Jennings, School of Electronics and Computer Science, University of Southampton, United Kingdom (nrj@ecs.soton.ac.uk)
Pseudocode | Yes | Algorithm 1: Iterative Constraint Generation Max-Sum
Open Source Code | No | The paper does not provide any links to open-source code or make an explicit statement about code availability.
Open Datasets | No | The paper describes developing a simulator and generating tasks and states randomly for the experiments, rather than using a publicly available dataset with access information. "We developed a simulator for the above scenario, in which tasks with any 4 types of targets (i.e., food, animal, victim, and fuel) were randomly generated on a 2D grid map."
Dataset Splits | No | The paper describes setting up a simulation environment and randomizing aspects of the problem instances, but it does not specify explicit training, validation, or test dataset splits (e.g., percentages or counts).
Hardware Specification | Yes | "We ran our experiments on a machine with a 2.66GHz Intel Core 2 Duo and 4GB memory."
Software Dependencies | Yes | "All the algorithms were implemented in Java 1.6, and the linear programs are solved by CPLEX 12.4."
Experiment Setup | Yes | "We developed a simulator for the above scenario, in which tasks with any 4 types of targets (i.e., food, animal, victim, and fuel) were randomly generated on a 2D grid map. ... we randomized the requirements of each target type and kept them fixed for each instance. ... For each state s_j ∈ S_j, we specified a utility U_j(s_j, x_j) for the responders doing the task in a given state... For each instance, we defined a Markov chain for the states of each task with the transition matrix randomly initialized." A sketch of this setup is given below.
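
To make the Experiment Setup row concrete, the following is a minimal Java sketch of how such random problem instances could be generated. The authors' simulator was not released, so every name and constant here (InstanceGenerator, GRID_SIZE, NUM_TASKS, NUM_STATES, the requirement and utility ranges) is an assumption chosen for illustration, not the paper's code.

```java
import java.util.Random;

// Hypothetical sketch of the random instance generation described in the
// Experiment Setup row. Class layout and all constants are assumptions;
// the paper's Java 1.6 simulator was not released.
public class InstanceGenerator {

    enum TargetType { FOOD, ANIMAL, VICTIM, FUEL }  // the 4 target types named in the paper

    static final int GRID_SIZE = 20;   // assumed side length of the 2D grid map
    static final int NUM_TASKS = 100;  // assumed number of tasks per instance
    static final int NUM_STATES = 4;   // assumed number of states per task's Markov chain

    private final Random rng = new Random();

    /** One task: a random grid location, target type, requirement, per-state utilities,
     *  and a randomly initialized transition matrix for its Markov chain. */
    class Task {
        int x, y;
        TargetType type;
        int requirement;        // randomized requirement, kept fixed for the instance
        double[] utility;       // stands in for U_j(s_j, x_j); indexed by state only here
        double[][] transition;  // row-stochastic transition matrix over the task's states

        Task() {
            x = rng.nextInt(GRID_SIZE);
            y = rng.nextInt(GRID_SIZE);
            type = TargetType.values()[rng.nextInt(TargetType.values().length)];
            requirement = 1 + rng.nextInt(4);          // arbitrary requirement range
            utility = new double[NUM_STATES];
            for (int s = 0; s < NUM_STATES; s++) {
                utility[s] = rng.nextDouble() * 100.0; // arbitrary utility scale
            }
            transition = randomStochasticMatrix(NUM_STATES);
        }
    }

    /** Builds a random row-stochastic matrix: non-negative entries, each row sums to 1. */
    private double[][] randomStochasticMatrix(int n) {
        double[][] m = new double[n][n];
        for (int i = 0; i < n; i++) {
            double rowSum = 0.0;
            for (int j = 0; j < n; j++) {
                m[i][j] = rng.nextDouble();
                rowSum += m[i][j];
            }
            for (int j = 0; j < n; j++) {
                m[i][j] /= rowSum;                     // normalize the row
            }
        }
        return m;
    }

    /** Generates one random problem instance as an array of tasks. */
    Task[] generateInstance() {
        Task[] tasks = new Task[NUM_TASKS];
        for (int i = 0; i < NUM_TASKS; i++) {
            tasks[i] = new Task();
        }
        return tasks;
    }

    public static void main(String[] args) {
        Task[] instance = new InstanceGenerator().generateInstance();
        System.out.println("Generated " + instance.length + " random tasks.");
    }
}
```

Note that in the paper the utility U_j(s_j, x_j) also depends on the allocation x_j of responders to the task; the sketch stores one value per state only to keep the example short.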