On the Convergence of Stochastic Compositional Gradient Descent Ascent Method

Authors: Hongchang Gao, Xiaoqian Wang, Lei Luo, Xinghua Shi

IJCAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 'Finally, we conduct extensive experiments to demonstrate the effectiveness of our proposed method.'
Researcher Affiliation | Collaboration | Hongchang Gao (1), Xiaoqian Wang (2), Lei Luo (3), Xinghua Shi (1); (1) Department of Computer and Information Sciences, Temple University, PA, USA; (2) School of Electrical and Computer Engineering, Purdue University, IN, USA; (3) JD Finance America Corporation, Mountain View, CA, USA
Pseudocode | Yes | 'Algorithm 1 The Stochastic Compositional Gradient Descent Ascent Method (SCGDA)'
Open Source Code | No | The paper does not provide concrete access to source code for the described methodology.
Open Datasets | No | The paper describes generating an MDP dataset: 'we generate an MDP which has 400 states and each state is associated with 10 actions. Regarding the transition probability, P^π_{s,s'} is drawn from [0, 1] uniformly. Additionally, to guarantee the ergodicity, we add 10^{-5} to P^π_{s,s'}.' However, it does not provide access information for a publicly available or open dataset. (A hedged generation sketch appears after the table.)
Dataset Splits | No | The paper describes the generation of a dataset but does not specify training, validation, or test splits. It only mentions using the generated MDP to 'optimize Eq. (31) on this dataset'.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers.
Experiment Setup | Yes | 'In our experiments, we set the batch size to 20, α = 3, β = 10^5. Then, we verify the convergence performance of SCGDA with different learning rates η. Specifically, in Figure 1, we fix γ = λ = 0.1 and change η to show the value function gap... Furthermore, in Figure 2, we fix the learning rate η and change λ, as well as γ. Here, we set λ = γ to make the minimization subproblem and maximization subproblem update in the single-timescale manner.' (A hedged sketch of this sweep follows the table.)
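
The MDP construction quoted in the Open Datasets row can be illustrated with a short sketch. This is a minimal reconstruction under stated assumptions: the quoted excerpt does not say how the uniformly drawn entries become a valid stochastic matrix, so the row normalization below is an added assumption, and the reward array and random seed are placeholders not described in the paper.

```python
import numpy as np

def generate_mdp(n_states=400, n_actions=10, eps=1e-5, seed=0):
    """Sketch of the synthetic MDP described in the quoted experiments.

    Assumptions beyond the quoted text: each row of the transition matrix
    is renormalized to sum to 1, and rewards are uniform placeholders.
    """
    rng = np.random.default_rng(seed)

    # Transition probabilities under the policy: drawn uniformly from [0, 1],
    # then shifted by a small constant so every entry is strictly positive
    # (the quoted text adds 10^{-5} "to guarantee the ergodicity").
    P = rng.uniform(0.0, 1.0, size=(n_states, n_states)) + eps

    # Assumed step: renormalize each row into a probability distribution.
    P /= P.sum(axis=1, keepdims=True)

    # Placeholder rewards; the quoted excerpt does not describe them.
    rewards = rng.uniform(0.0, 1.0, size=(n_states, n_actions))
    return P, rewards

P, rewards = generate_mdp()
print(P.shape, P.sum(axis=1)[:3])  # (400, 400), rows summing to 1
```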
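The Experiment Setup row lists a batch size of 20, α = 3, β = 10^5, learning rates η, and γ = λ = 0.1. The sketch below only mirrors that sweep structure around a generic single-timescale stochastic gradient descent ascent loop on a toy saddle problem; it is not the paper's SCGDA estimator (Algorithm 1), whose compositional gradient tracking and the roles of α and β are not reproduced here, and reading γ and λ as scales on the descent and ascent steps is an assumption.

```python
import numpy as np

def toy_gda(eta, gamma=0.1, lam=0.1, batch_size=20, steps=2000, seed=0):
    """Generic single-timescale stochastic gradient descent ascent on the toy
    saddle problem min_x max_y x^T A y + 0.5||x||^2 - 0.5||y||^2.

    Illustrates only the loop structure and the assumed hyperparameter roles
    from the quoted setup; it is not the paper's SCGDA update rule.
    """
    rng = np.random.default_rng(seed)
    d = 10
    A = rng.normal(size=(d, d)) / np.sqrt(d)
    x, y = rng.normal(size=d), rng.normal(size=d)

    for _ in range(steps):
        # Mini-batch noise stands in for stochastic gradient estimates.
        noise_x = rng.normal(size=(batch_size, d)).mean(axis=0)
        noise_y = rng.normal(size=(batch_size, d)).mean(axis=0)
        grad_x = A @ y + x + noise_x     # gradient for the minimization player
        grad_y = A.T @ x - y + noise_y   # gradient for the maximization player
        x = x - eta * gamma * grad_x     # assumed role of gamma: min-step scale
        y = y + eta * lam * grad_y       # assumed role of lambda: max-step scale
    return x, y

# Sweep mirroring the quoted protocol: fix gamma = lambda = 0.1 and vary eta.
for eta in (0.01, 0.05, 0.1):
    x, y = toy_gda(eta)
    print(f"eta={eta}: ||x||={np.linalg.norm(x):.3f}, ||y||={np.linalg.norm(y):.3f}")
```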