Multi-Agent Learning from Learners

Authors: Mine Melodi Caliskan, Francesco Chini, Setareh Maghsudi

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We empirically test MA-LfL and we observe high positive correlation between the recovered reward functions and the ground truth. We test MA-LfL experimentally in a 3×3 deterministic grid world environment."
Researcher Affiliation | Academia | "Department of Computer Science, University of Tübingen, Tübingen, Germany."
Pseudocode | Yes | "Algorithm 1 Multi-agent Soft Policy Iteration (MA-SPI) ... Algorithm 2 Multi-agent Learning from a Learner (MA-LfL)"
Open Source Code | Yes | "the source code is available at GitHub: https://github.com/melodiCyb/multiagent-learning-from-learners"
Open Datasets | No | "We test MA-LfL experimentally in a 3×3 deterministic grid world environment." The paper does not provide access information or citations for this grid world environment/dataset.
Dataset Splits | No | The paper mentions running experiments in a 3×3 grid world environment but does not specify any training, validation, or test dataset splits.
Hardware Specification | Yes | "We execute all experiments under a Conda environment using Python with a computation unit GPU-2080i"
Software Dependencies | No | "We execute all experiments under a Conda environment using Python with a computation unit GPU-2080i." The paper mentions "Python" but does not specify a version or any other software dependencies with version numbers.
Experiment Setup | Yes | "Table 3. Parameters to reproduce results for MA-LfL in Grid World scenario in Section 7 Table 1." This table includes specific parameter values such as Alpha 3, Beta 0.1, Gamma 0.9, Episode Length 1000, Iteration # 10, Episode # 3000, Entropy Coefficient 0.3, Adam Learning Rate 0.1, Adam Epoch # 10, Reward Adam Epoch # 1000, Reward Adam Learning Rate 0.01.
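For convenience, the Table 3 settings quoted above can be collected into a single configuration mapping. This is only a sketch for reproduction scripts: the dictionary key names below are our own labels, not identifiers from the paper's code; the values are taken verbatim from the reported table.

```python
# Hyperparameters reported in Table 3 of the paper for the MA-LfL
# 3x3 grid-world experiment. Key names are hypothetical labels;
# only the numeric values come from the paper.
MA_LFL_GRID_WORLD_PARAMS = {
    "alpha": 3,
    "beta": 0.1,
    "gamma": 0.9,                       # discount factor
    "episode_length": 1000,
    "iterations": 10,
    "episodes": 3000,
    "entropy_coefficient": 0.3,
    "adam_learning_rate": 0.1,
    "adam_epochs": 10,
    "reward_adam_epochs": 1000,
    "reward_adam_learning_rate": 0.01,
}

if __name__ == "__main__":
    # Print the configuration so a reproduction run can log it.
    for name, value in MA_LFL_GRID_WORLD_PARAMS.items():
        print(f"{name}: {value}")
```

Keeping the settings in one mapping makes it easy to log the full configuration alongside results, which is exactly the kind of detail this reproducibility check looks for.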