Learning to Optimize Differentiable Games

Authors: Xuxi Chen, Nelson Vadori, Tianlong Chen, Zhangyang Wang

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | On test problems including quadratic games and GANs, L2PG can substantially accelerate convergence and demonstrates a remarkably more stable trajectory. Code is available at https://github.com/VITA-Group/L2PG.
Researcher Affiliation | Collaboration | 1 University of Texas at Austin, 2 J.P. Morgan AI Research.
Pseudocode | Yes | A.1. Algorithms: We provide a summary of L2PG's pipeline in Algorithm 1 (Algorithm 1: L2PG).
Open Source Code | Yes | Code is available at https://github.com/VITA-Group/L2PG.
Open Datasets | Yes | A.2. Sampled Game Coefficients: As mentioned above, we sampled a fixed evaluation set and two test sets of quadratic games. The coefficients of the 60 games are provided in three files: evaluation.txt, test_stable.txt, and test_unstable.txt. Each line in a file represents one game and contains 6 numbers, representing M11, M22, M12, M21, b1, b2, respectively. (See the parsing sketch below the table.)
Dataset Splits | Yes | We evaluate the L2O optimizer on a fixed set of game instances of the same type (i.e., quadratic or GANs) every 5 epochs, and the optimizer with the highest evaluation performance is used at the meta-testing stage. (See the selection sketch below the table.)
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory, cloud instances) used to run the experiments.
Software Dependencies | No | The paper mentions optimizers such as Adam and RMSprop and model components such as an LSTM network, but it does not specify version numbers for any software libraries, frameworks, or programming languages.
Experiment Setup | Yes | We use an LSTM optimizer with a hidden dimension of 32 in all experiments. A detailed explanation of the structure of the L2O optimizer can be found in Section B. The unroll length (i.e., the value of T) is set to 10. We batch the training process by simultaneously training on 128 different games, and we train the optimizer for 300 epochs. The number of training iterations per epoch increases through {50, 100, 200, 500, 1000} if the Training-CL technique is applied; otherwise we set it to 100. We train the parameters of L2PG (i.e., ϕ) with the Adam optimizer (Kingma & Ba, 2014), using an initial learning rate of 1 × 10^-3 and decaying the learning rate by a factor of 10 every 1/3 of the total number of training epochs. (See the schedule sketch below the table.)
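The Open Datasets row quotes a simple text format for the quadratic-game coefficients: one game per line, six whitespace-separated numbers. Below is a minimal parsing sketch, assuming that format; the function name and the use of NumPy are illustrative assumptions, not part of the released code.

```python
# Minimal sketch for reading the coefficient files described in the paper's appendix
# (evaluation.txt, test_stable.txt, test_unstable.txt). Each non-empty line is assumed
# to hold the six numbers M11, M22, M12, M21, b1, b2 for one quadratic game.
import numpy as np

def load_quadratic_games(path):
    """Return a list of (M, b) pairs, one per game, from a coefficient file."""
    games = []
    with open(path) as f:
        for line in f:
            if not line.strip():
                continue
            m11, m22, m12, m21, b1, b2 = map(float, line.split())
            M = np.array([[m11, m12],
                          [m21, m22]])
            b = np.array([b1, b2])
            games.append((M, b))
    return games

# Example usage (file names taken from the paper's appendix):
# eval_games = load_quadratic_games("evaluation.txt")
```
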
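The Dataset Splits row describes the model-selection rule: evaluate the L2O optimizer on a fixed set of games every 5 epochs and carry the checkpoint with the best evaluation performance into meta-testing. The sketch below illustrates only that rule; `train_one_epoch` and `evaluate_on_fixed_games` are hypothetical stubs standing in for the paper's actual training and evaluation code.

```python
# Sketch of the "evaluate every 5 epochs, keep the best checkpoint" selection rule.
# The two stubs below are placeholders, not functions from the L2PG repository.
import copy
import random

def train_one_epoch(state):          # stub: one meta-training epoch
    state["epoch"] += 1

def evaluate_on_fixed_games(state):  # stub: score on the fixed evaluation games
    return random.random()

def meta_train(num_epochs=300, eval_every=5):
    state = {"epoch": 0}
    best_score, best_checkpoint = float("-inf"), None
    for epoch in range(1, num_epochs + 1):
        train_one_epoch(state)
        if epoch % eval_every == 0:
            score = evaluate_on_fixed_games(state)
            if score > best_score:
                best_score, best_checkpoint = score, copy.deepcopy(state)
    return best_checkpoint  # the checkpoint later used at meta-testing
```
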
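The Experiment Setup row specifies Adam with an initial learning rate of 1e-3, decayed by a factor of 10 every third of the 300 training epochs (i.e., at epochs 100 and 200). Here is a minimal PyTorch sketch of that schedule, assuming a generic LSTM stands in for the L2O optimizer; its input size and exact architecture are assumptions, as the paper only states a hidden dimension of 32.

```python
# Sketch of the meta-training optimizer and learning-rate schedule quoted above.
import torch

NUM_EPOCHS = 300
l2o_net = torch.nn.LSTM(input_size=4, hidden_size=32)   # hidden dim 32 per the paper;
                                                         # input size is an assumption
meta_opt = torch.optim.Adam(l2o_net.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    meta_opt,
    milestones=[NUM_EPOCHS // 3, 2 * NUM_EPOCHS // 3],   # epochs 100 and 200
    gamma=0.1,                                           # divide the LR by 10
)

for epoch in range(NUM_EPOCHS):
    # ... unrolled meta-training over a batch of 128 games would go here ...
    scheduler.step()
```
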