Configurable Mirror Descent: Towards a Unification of Decision Making

Authors: Pengdeng Li, Shuxin Li, Chang Yang, Xinrun Wang, Shuyue Hu, Xiao Huang, Hau Chan, Bo An

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that CMD achieves empirically competitive or better outcomes compared to baselines while providing the capability of exploring diverse dimensions of decision making.
Researcher Affiliation | Collaboration | 1 Nanyang Technological University; 2 The Hong Kong Polytechnic University; 3 Singapore Management University (work done while at NTU); 4 Shanghai Artificial Intelligence Laboratory; 5 University of Nebraska-Lincoln; 6 Skywork AI.
Pseudocode | Yes | Algorithm 1: Generalized Mirror Descent (GMD)
Open Source Code | Yes | Code for experiments is available at https://github.com/IpadLi/CMD.
Open Datasets | Yes | We curate the GAMEBENCH on top of the OpenSpiel (Lanctot et al., 2019). There are 15 games which are divided into 5 categories: single-agent, cooperative multi-agent, competitive multi-agent zero-sum, competitive multi-agent general-sum, and mixed cooperative and competitive (MCC) categories.
Dataset Splits | No | The paper describes using various games from OpenSpiel and modified versions of them, but it does not specify explicit training/validation/test splits (e.g., percentages or sample counts) for these games.
Hardware Specification | Yes | Experiments are performed on a machine with a 24-core i9 and NVIDIA A4000.
Software Dependencies | No | The paper mentions software such as OpenSpiel for game environments and various algorithms, but it does not provide version numbers for general software dependencies such as programming languages, libraries, or frameworks (e.g., Python version, PyTorch version).
Experiment Setup | Yes | Hyper-parameters. Table 7 provides the default values of hyper-parameters used in different methods.