reproducibilityindex.ai

Conflict-Averse Gradient Descent for Multi-task Learning

Authors: Bo Liu, Xingchao Liu, Xiaojie Jin, Peter Stone, Qiang Liu

NeurIPS 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	On a series of challenging multi-task supervised learning and reinforcement learning tasks, CAGrad achieves improved performance over prior state-of-the-art multi-objective gradient manipulation methods. Code is available at https://github.com/Cranial-XIX/CAGrad.
Researcher Affiliation	Collaboration	Bo Liu, Xingchao Liu, Xiaojie Jin, Peter Stone, Qiang Liu. The University of Texas at Austin, Sony AI, Bytedance Research. {bliu,xcliu,pstone,lqiang}@cs.utexas.edu, xjjin0731@gmail.com
Pseudocode	Yes	Algorithm 1 Conflict-averse Gradient Descent (CAGrad) for Multi-task Learning
Open Source Code	Yes	Code is available at https://github.com/Cranial-XIX/CAGrad.
Open Datasets	Yes	To answer questions (1) and (2), we create a toy optimization example to evaluate the convergence of CAGrad compared to MGDA and PCGrad. On the same toy example, we ablate over the constant c and show that CAGrad recovers GD and MGDA with proper c values. Next, to test CAGrad on more complicated neural models, we perform the same set of experiments on the Multi-Fashion+MNIST benchmark [19] with a shrinked LeNet architecture [18] (in which each layer has a reduced number of neurons compared to the original LeNet). Please refer to Appendix B for more details.
Dataset Splits	Yes	10% of the training images is held out as the validation set.
Hardware Specification	Yes	All experiments are run on a single NVIDIA V100 GPU.
Software Dependencies	No	The paper mentions software like Adam optimizer and Soft Actor-Critic (SAC) but does not provide specific version numbers for these or other software dependencies.
Experiment Setup	Yes	We consider a shrinked LeNet as our model, and train it with Adam [16] optimizer with a 0.001 learning rate for 50 epochs using a batch size of 256.
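
The Dataset Splits and Experiment Setup rows above can be collected into a small configuration sketch. This is not the authors' code: the names `TrainConfig`, `split_indices`, and `batches_per_epoch` are hypothetical, and the shuffling seed, the placeholder dataset size of 1000, and the choice to keep the final partial batch are assumptions the table does not state. Only the learning rate, epoch count, batch size, and 10% validation fraction come from the rows above.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters quoted in the table above (hypothetical container)."""
    learning_rate: float = 1e-3   # Adam learning rate
    epochs: int = 50
    batch_size: int = 256
    val_fraction: float = 0.10    # share of training images held out


def split_indices(n_examples, val_fraction, seed=0):
    """Shuffle example indices and hold out a validation share.

    The seed and the shuffling itself are assumptions; the table only
    says that 10% of the training images are held out.
    """
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)
    n_val = int(round(n_examples * val_fraction))
    return indices[n_val:], indices[:n_val]


def batches_per_epoch(n_train, batch_size):
    """Number of batches per epoch, assuming the last partial batch is kept."""
    return -(-n_train // batch_size)  # ceiling division


cfg = TrainConfig()
# 1000 is a placeholder dataset size, not a figure from the paper.
train_idx, val_idx = split_indices(1000, cfg.val_fraction)
steps = batches_per_epoch(len(train_idx), cfg.batch_size)
```

With the placeholder size, the split yields 900 training and 100 validation indices. The actual benchmark size and data loading are described in the paper's Appendix B and the linked repository, which this sketch does not attempt to reproduce.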