A Regularized Opponent Model with Maximum Entropy Objective

Authors: Zheng Tian, Ying Wen, Zhichen Gong, Faiz Punakkath, Shihao Zou, Jun Wang

IJCAI 2019

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We evaluate these two algorithms on the challenging iterated matrix game and differential game respectively and show that they can outperform strong MARL baselines." |
| Researcher Affiliation | Academia | Zheng Tian¹, Ying Wen¹, Zhichen Gong¹, Faiz Punakkath¹, Shihao Zou², Jun Wang¹ (¹University College London, ²University of Alberta) |
| Pseudocode | Yes | "We list the pseudo-code of ROMMEO-Q and ROMMEO-AC in Appendix A." |
| Open Source Code | Yes | "The experiment code and appendix are available at https://github.com/rommeoijcai2019/rommeo." |
| Open Datasets | No | The paper uses analytically defined environments, such as iterated matrix games and the differential Max of Two Quadratic Game, rather than pre-existing datasets with explicit public-access information (links or formal citations). |
| Dataset Splits | No | Experiments take place in reinforcement learning environments (iterated matrix games, differential games) where data is generated through interaction, so explicit train/validation/test splits do not apply. |
| Hardware Specification | No | The paper does not specify the hardware used to run the experiments. |
| Software Dependencies | No | The paper does not list software dependencies or version numbers needed for replication. |
| Experiment Setup | No | The paper describes general aspects of the setup, such as the number of episodes and steps and how exploration is controlled, but does not report the specific numerical hyperparameters (e.g., learning rate, batch size, optimizer) needed for replication. |
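Because the evaluation environments are defined analytically rather than distributed as data, they can be restated directly in code. Below is a minimal sketch of the differential Max of Two Quadratic Game; the constants follow the form commonly used for this benchmark in the MARL literature (e.g., Wei et al., 2018) and are an assumption here, not the paper's verbatim definition.

```python
import numpy as np

def max_of_two_quadratics(a1: float, a2: float) -> float:
    """Shared reward for the two-player Max of Two Quadratic Game.

    Constants follow the commonly cited form of this benchmark
    (assumed, not taken verbatim from the ROMMEO paper).
    """
    f1 = 0.8 * (-(((a1 + 5.0) / 3.0) ** 2) - (((a2 + 5.0) / 3.0) ** 2))
    f2 = 1.0 * (-((a1 - 5.0) ** 2) - ((a2 - 5.0) ** 2)) + 10.0
    return max(f1, f2)  # both agents receive this joint reward

# Actions are continuous scalars, typically bounded to [-10, 10].
grid = np.linspace(-10.0, 10.0, 201)
rewards = np.array([[max_of_two_quadratics(x, y) for y in grid] for x in grid])
print(rewards.max())  # 10.0, attained at the joint action (5, 5)
```

The game pairs a flat local optimum near (-5, -5) with a sharper global optimum at (5, 5), which is why it is used to test whether opponent modelling helps agents escape the inferior equilibrium; no dataset download is involved.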