Multi-Agent Intention Progression with Reward Machines

Authors: Michael Dann, Yuan Yao, Natasha Alechina, Brian Logan, John Thangarajah

IJCAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate our approach in a range of multi-agent environments, and show that RM-based scheduling outperforms previous intention-aware scheduling approaches in settings where agents are not co-designed."
Researcher Affiliation | Academia | "Michael Dann (1), Yuan Yao (2), Natasha Alechina (3), Brian Logan (3,4) and John Thangarajah (1); (1) RMIT University, (2) University of Nottingham Ningbo China, (3) Utrecht University, (4) University of Aberdeen"
Pseudocode | Yes | "Algorithm 1: Computation of the tactic set."
Open Source Code | Yes | "Full source code and demo videos of the agents are available at: https://github.com/mchldann/IRM_IJCAI"
Open Datasets | Yes | "To evaluate our approach, we extend two popular domains from the reward machine literature, Office World [Toro Icarte et al., 2018] and Craft World [Andreas et al., 2017], to a multi-agent setting." (See the reward machine sketch below.)
Dataset Splits | No | The paper does not provide dataset split information (percentages, sample counts, references to predefined splits, or a splitting methodology) for training, validation, or test sets; the environments are simulation-based rather than fixed datasets with predefined splits.
Hardware Specification | No | The paper does not specify the hardware (GPU/CPU models, processor types, or memory) used to run its experiments.
Software Dependencies | No | The paper does not list ancillary software dependencies, such as library or solver names with version numbers.
Experiment Setup | Yes | "MCTS is configured to maximise the discounted return, with a discount factor of 0.99. ... In configuring the stochasticity of the external agent's rollouts, we considered τ ∈ {0.005, 0.01, 0.015, 0.02} and selected τ = 0.015." (See the MCTS configuration sketch below.)
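The two evaluation domains come from the reward machine literature, where a task is specified as a finite-state machine whose transitions fire on high-level events and emit rewards (Toro Icarte et al., 2018). The following is a minimal sketch of that abstraction; it is illustrative only, not the authors' implementation, and the coffee-delivery encoding is a hypothetical Office World-style example.

```python
# Minimal reward machine sketch (illustrative; not the authors' code).
# A reward machine is a finite-state machine whose transitions are triggered
# by high-level events (propositions) and emit rewards.

class RewardMachine:
    def __init__(self, initial_state, transitions, terminal_states):
        # transitions: {(state, event): (next_state, reward)}
        self.state = initial_state
        self.transitions = transitions
        self.terminal_states = terminal_states

    def step(self, event):
        """Advance on an observed event; unlisted events self-loop with 0 reward."""
        next_state, reward = self.transitions.get((self.state, event), (self.state, 0.0))
        self.state = next_state
        return reward

    def is_done(self):
        return self.state in self.terminal_states


# Hypothetical Office World-style task: get coffee ('c'), then deliver it to the office ('o').
rm = RewardMachine(
    initial_state="u0",
    transitions={
        ("u0", "c"): ("u1", 0.0),      # picked up coffee
        ("u1", "o"): ("u_done", 1.0),  # delivered it to the office
    },
    terminal_states={"u_done"},
)
```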
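The experiment setup row quotes two scalar settings: a discount factor of 0.99 for the return that MCTS maximises, and a temperature τ = 0.015 governing the stochasticity of the external agent's rollouts. The sketch below illustrates one common reading of these settings, assuming τ acts as a Boltzmann (softmax) temperature over action values; the paper's actual rollout policy may differ.

```python
import numpy as np

# Sketch of the two quoted experiment settings (assumptions, not the authors' code):
# a discounted return with gamma = 0.99, and a stochastic rollout policy whose
# randomness is controlled by a temperature tau = 0.015 (modelled here as a
# Boltzmann/softmax policy over action values).

GAMMA = 0.99
TAU = 0.015  # selected from {0.005, 0.01, 0.015, 0.02} in the paper

def discounted_return(rewards, gamma=GAMMA):
    """Sum of gamma^t * r_t, the quantity MCTS is configured to maximise."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))

def boltzmann_rollout_policy(action_values, tau=TAU):
    """Sample an action index with probability proportional to exp(Q / tau)."""
    q = np.asarray(action_values, dtype=float)
    q = q - q.max()              # numerical stability before exponentiating
    probs = np.exp(q / tau)
    probs /= probs.sum()
    return np.random.choice(len(q), p=probs)

# Example: a reward of 1 received two steps in the future, and a rollout action
# drawn with a small tau (close to greedy but still stochastic).
print(discounted_return([0.0, 0.0, 1.0]))           # ≈ 0.9801
print(boltzmann_rollout_policy([0.10, 0.12, 0.05]))
```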