FightLadder: A Benchmark for Competitive Multi-Agent Reinforcement Learning

Authors: Wenzhe Li, Zihan Ding, Seth Karten, Chi Jin

ICML 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We report experimental results using the above toolkits to serve as the baselines for two-player competitive game settings." |
| Researcher Affiliation | Academia | "Princeton University. Correspondence to: Wenzhe Li <wenzhe.li@princeton.edu>, Chi Jin <chij@princeton.edu>." |
| Pseudocode | Yes | "Algorithm 1 Population-Based Methods for MGs" (a hedged sketch of such a loop follows the table). |
| Open Source Code | Yes | "Videos and code at https://sites.google.com/view/fightladder/home." |
| Open Datasets | No | The paper introduces FightLadder as a real-time fighting game platform whose observations are generated by game emulators, rather than drawn from a pre-existing, fixed public dataset. |
| Dataset Splits | No | The paper discusses training steps and epochs, but does not provide explicit training/validation/test dataset splits. |
| Hardware Specification | Yes | "We trained all our agents on one server with 192 CPUs and 8 A6000 GPUs." |
| Software Dependencies | No | The paper mentions software such as Gym, Stable-Baselines3, and ECOS, but does not provide specific version numbers for these dependencies (e.g., "Stable-Baselines3 version X.Y.Z"); a version-recording snippet follows the table. |
| Experiment Setup | Yes | "Table 6. Training hyperparameters for PPO, which is the backbone for both single-player and two-player algorithms in the experiment. Table 7. Training hyperparameters for FSP, PSRO, and League." (A sketch mapping such hyperparameters onto a PPO constructor follows the table.) |
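
The paper's Algorithm 1 describes population-based training for Markov games only in pseudocode. Below is a minimal, hedged sketch of such a loop, assuming an FSP-style uniform opponent sampler; `PopulationTrainer`, `make_agent`, and `train_best_response` are illustrative names, not the paper's API.

```python
import random


class PopulationTrainer:
    """Minimal sketch of a population-based loop for Markov games.

    Opponent sampling below is FSP-style (uniform over past checkpoints);
    PSRO- and League-style training differ mainly in how opponents are
    sampled and how new agents are added to the population.
    """

    def __init__(self, make_agent):
        # Seed the population with a single freshly initialized agent.
        self.population = [make_agent()]

    def sample_opponent(self):
        # FSP: draw an opponent uniformly from all historical checkpoints.
        return random.choice(self.population)

    def train_iteration(self, make_agent, train_best_response, num_steps):
        # Train a new learner as an approximate best response to the
        # sampled opponent, then add it to the population.
        learner = make_agent()
        opponent = self.sample_opponent()
        train_best_response(learner, opponent, num_steps)
        self.population.append(learner)
        return learner


if __name__ == "__main__":
    # Toy usage with stub agents and a no-op best-response trainer.
    make_agent = lambda: object()
    trainer = PopulationTrainer(make_agent)
    trainer.train_iteration(make_agent, lambda lrn, opp, n: None, num_steps=0)
    print("population size:", len(trainer.population))
```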
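
Because exact Gym and Stable-Baselines3 versions are not reported, anyone reproducing the experiments would need to record the versions they actually used. A small snippet for doing so, assuming both packages are installed (the paper does not state its versions):

```python
# Print the installed versions so a reproduction can pin them exactly;
# the paper itself does not state which versions were used.
import gym
import stable_baselines3 as sb3

print("gym:", gym.__version__)
print("stable-baselines3:", sb3.__version__)
```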
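
Tables 6 and 7 carry the training hyperparameters. As an illustration of how such a setup typically maps onto a Stable-Baselines3 PPO constructor, here is a hedged sketch; the environment id and every hyperparameter value below are placeholders, not the values from the paper's tables (FightLadder itself exposes emulator-backed environments rather than CartPole).

```python
from stable_baselines3 import PPO

# All values below are placeholders -- substitute the entries of Tables 6-7.
# "CartPole-v1" stands in for a FightLadder emulator environment, which
# would instead use an image-based policy such as "CnnPolicy".
model = PPO(
    policy="MlpPolicy",
    env="CartPole-v1",
    learning_rate=2.5e-4,   # PPO learning rate
    n_steps=128,            # rollout length per environment
    batch_size=64,          # minibatch size for each gradient step
    n_epochs=4,             # optimization epochs per rollout
    gamma=0.99,             # discount factor
    clip_range=0.1,         # PPO clipping parameter
    ent_coef=0.01,          # entropy bonus coefficient
    verbose=1,
)
model.learn(total_timesteps=100_000)
```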