Learning A Minimax Optimizer: A Pilot Study

Authors: Jiayi Shen, Xiaohan Chen, Howard Heaton, Tianlong Chen, Jialin Liu, Wotao Yin, Zhangyang Wang

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical experiments on a variety of minimax problems corroborate the effectiveness of Twin-L2O. We benchmark our algorithms on several testbed problems and compare against state-of-the-art minimax solvers.
Researcher Affiliation | Collaboration | Jiayi Shen1*, Xiaohan Chen2*, Howard Heaton3*, Tianlong Chen2, Jialin Liu4, Wotao Yin3,4, Zhangyang Wang2. 1Texas A&M University, 2University of Texas at Austin, 3University of California, Los Angeles, 4Alibaba US, DAMO Academy.
Pseudocode | Yes | The full method is outlined in Method 1, where the L2O update is denoted by LSTM(u_k; φ_k) and the fallback method is a Halpern iteration (Halpern, 1967).
Open Source Code | Yes | The code is available at: https://github.com/VITA-Group/L2O-Minimax.
Open Datasets | No | The paper describes generating instances for specific problem formulations (Seesaw, Rotated Saddle, Matrix Game) by sampling parameters and initializing variables, rather than using a pre-existing publicly available dataset. For example: "we use 128 optimizee instances for training; each of them has its parameters i.i.d. sampled, and variables x, y randomly initialized by i.i.d. sampling from U[-0.5, 0.5]."
Dataset Splits | Yes | A validation set of 20 optimizees is used, with parameters and variables sampled in the same way; a hold-out testing set of another 100 instances is generated similarly.
Hardware Specification | Yes | All experiments in this and the following sections are conducted using GeForce GTX 1080 Ti GPUs.
Software Dependencies | No | The paper mentions software components such as the LSTM and the Adam optimizer but does not specify version numbers for any libraries, frameworks (e.g., PyTorch, TensorFlow), or programming languages used for the implementation.
Experiment Setup | Yes | The L2O training routine follows (Andrychowicz et al., 2016): we use 128 optimizee instances for training; each of them has its parameters i.i.d. sampled, and variables x, y randomly initialized by i.i.d. sampling from U[-0.5, 0.5]. For each epoch, an L2O optimizer will update the optimizee parameters for 1000 iterations, with its unrolling length T = 10. When the next epoch starts, all x, y as well as LSTM hidden states are reset. We train the L2O solvers for 200 epochs, using Adam with a constant learning rate of 10^-4.
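The pseudocode row mentions a Halpern (1967) fallback, which anchors each step back to the starting point so the iterates provably converge for nonexpansive operators. A minimal NumPy sketch of the iteration u_{k+1} = β_k u_0 + (1 − β_k) T(u_k); the averaged-rotation toy operator below is an illustrative assumption, not the paper's fallback operator:

```python
import numpy as np

def halpern(T, u0, num_iters=2000):
    """Halpern iteration with the classic anchoring weights beta_k = 1/(k + 2):
    u_{k+1} = beta_k * u0 + (1 - beta_k) * T(u_k)."""
    u = u0.copy()
    for k in range(num_iters):
        beta = 1.0 / (k + 2)
        u = beta * u0 + (1 - beta) * T(u)
    return u

# Toy nonexpansive map: average of the identity and a 45-degree rotation.
# Its unique fixed point is the origin.
theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
T = lambda u: 0.5 * (u + R @ u)

u_star = halpern(T, np.array([1.0, -1.0]))
print(np.linalg.norm(u_star))  # small: iterates approach the fixed point
```

Plain fixed-point iteration can cycle on rotation-like (minimax-style) operators; the shrinking pull toward u0 is what restores convergence, which is why it serves as a safeguard.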
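The experiment-setup row describes the standard unrolled L2O training scheme from Andrychowicz et al. (2016). A minimal PyTorch sketch under stated assumptions: a toy scalar optimizee f(x) = (x − c)^2 stands in for the paper's minimax problems, the coordinate-wise LSTM architecture and hidden size are guesses, and the epoch/iteration counts are shrunk from the reported 200/1000 to keep the sketch fast; the truncated unrolling length T = 10 and Adam with learning rate 1e-4 match the quote:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class L2O(nn.Module):
    """Tiny learned optimizer: maps the current gradient to an update step."""
    def __init__(self, hidden=20):
        super().__init__()
        self.lstm = nn.LSTMCell(1, hidden)
        self.head = nn.Linear(hidden, 1)

    def forward(self, grad, state):
        h, c = self.lstm(grad, state)
        return self.head(h), (h, c)

hidden = 20
opt_net = L2O(hidden)
meta_opt = torch.optim.Adam(opt_net.parameters(), lr=1e-4)
n_epochs, n_inner, unroll = 5, 50, 10   # paper: 200 epochs, 1000 iters, T = 10

for epoch in range(n_epochs):
    # Fresh optimizee each epoch: sample c, re-init x ~ U[-0.5, 0.5],
    # and reset the LSTM hidden state, as described in the quote.
    c = torch.randn(1, 1)
    x = torch.empty(1, 1).uniform_(-0.5, 0.5).requires_grad_()
    state = (torch.zeros(1, hidden), torch.zeros(1, hidden))
    meta_loss = 0.0
    for k in range(1, n_inner + 1):
        loss = ((x - c) ** 2).sum()
        # create_graph=True lets the meta-loss backprop through the gradient.
        grad, = torch.autograd.grad(loss, x, create_graph=True)
        step, state = opt_net(grad, state)
        x = x + step
        meta_loss = meta_loss + loss
        if k % unroll == 0:             # truncated BPTT every T = 10 steps
            meta_opt.zero_grad()
            meta_loss.backward()
            meta_opt.step()
            x = x.detach().requires_grad_()
            state = tuple(s.detach() for s in state)
            meta_loss = 0.0
```

The detach calls after each meta-update are what make the unrolling "truncated": gradients flow through at most T inner steps, keeping memory bounded.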