Open Ad Hoc Teamwork with Cooperative Game Theory
Authors: Jianhong Wang, Yang Li, Yuan Zhang, Wei Pan, Samuel Kaski
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The demos of experimental results are available on https://sites.google.com/view/ciao2024, and the code of experiments is published on https://github.com/hsvgbkhgbv/CIAO. and We conduct experiments, primarily comparing two instances of CIAO (CIAO-S and CIAO-C) based on the GPL framework in two environments: Level-based Foraging (LBF) and Wolfpack under open team settings (Rahman et al., 2021). |
| Researcher Affiliation | Academia | 1 Center for AI Fundamentals, University of Manchester, UK; 2 Neurorobotics Lab, University of Freiburg, Germany; 3 Aalto University, Finland. Correspondence to: Jianhong Wang <jianhong.wang@manchester.ac.uk>. |
| Pseudocode | Yes | Algorithm 1 Overall training procedure of CIAO |
| Open Source Code | Yes | The code of experiments is published on https://github.com/hsvgbkhgbv/CIAO. |
| Open Datasets | Yes | We assess the effectiveness of the proposed algorithms CIAO-S and CIAO-C in two established environments, LBF and Wolfpack, featuring open team settings (Rahman et al., 2021). and The teammate policies adhere to the experimental settings used for testing GPL (Rahman et al., 2021), which encompass a range of heuristic policies and pre-trained policies. For more detailed information about teammate policies, we recommend referring to Appendix B.4 of GPL's paper. |
| Dataset Splits | No | The paper mentions training and testing phases ('the learner is trained in an environment with a maximum of 3 agents at each timestep. Subsequently, testing is conducted in environments with a maximum of 5 and 9 agents at each timestep...'), but does not explicitly detail a separate 'validation' dataset split with specific percentages or counts. |
| Hardware Specification | Yes | All experiments have been run on Xeon Gold 6230 with 30 CPU cores and 30 GB primary memory. |
| Software Dependencies | No | All algorithms in experiments are implemented in PyTorch (Paszke et al., 2019). and The optimizer we use during training is Adam (Kingma & Ba, 2014). No specific version numbers for PyTorch or Adam are provided. |
| Experiment Setup | Yes | We summarize the values of the common hyperparameters of algorithms that are used in our experiments, as shown in Tabs. 2 and 3. and Specifically, in the Wolfpack environment, we uniformly determine the active duration by selecting a value between 25 and 35 timesteps, while the dead duration is uniformly sampled between 15 and 25 timesteps. |