Gradient Surgery for Multi-Task Learning
Authors: Tianhe Yu, Saurabh Kumar, Abhishek Gupta, Sergey Levine, Karol Hausman, Chelsea Finn
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "On a series of challenging multi-task supervised and multi-task RL problems, this approach leads to substantial gains in efficiency and performance." From Section 5 (Experiments): "The goal of our experiments is to study the following questions: (1) Does PCGrad make the optimization problems easier for various multi-task learning problems including supervised, reinforcement, and goal-conditioned reinforcement learning settings across different task families?" |
| Researcher Affiliation | Collaboration | Tianhe Yu (Stanford University), Saurabh Kumar (Stanford University), Abhishek Gupta (UC Berkeley), Sergey Levine (UC Berkeley), Karol Hausman (Robotics at Google), Chelsea Finn (Stanford University); tianheyu@cs.stanford.edu |
| Pseudocode | Yes | Algorithm 1 PCGrad Update Rule |
| Open Source Code | Yes | Code is released at https://github.com/tianheyu927/PCGrad |
| Open Datasets | Yes | "To broadly evaluate PCGrad, we consider multi-task supervised learning, multi-task RL, and goal-conditioned RL problems. We include the results on goal-conditioned RL in Appendix F." |
| Dataset Splits | No | The paper mentions "2500 training instances and 500 test instances per task" for CIFAR-100 and "top validation scores" for NYUv2, but it does not give explicit, complete train/validation/test split counts or percentages for all datasets, so the data partitioning cannot be fully reproduced. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or cloud instance types used for running its experiments. |
| Software Dependencies | No | The paper mentions software components like Adam, SGD, and SAC, but does not provide specific version numbers for any key software dependencies or libraries needed for replication. |
| Experiment Setup | Yes | During our evaluation, we tune the parameters of the baselines independently, ensuring that all methods were fairly provided with equal model and training capacity. PCGrad inherits the hyperparameters of the respective baseline method in all experiments, and has no additional hyperparameters. For more details on the experimental set-up and model architectures, see Appendix J. |
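The "Algorithm 1 PCGrad Update Rule" noted in the Pseudocode row can be sketched as follows. This is a minimal NumPy illustration of the projection step described in the paper (project each task gradient onto the normal plane of any conflicting gradient, then sum), not the authors' released implementation; the function name and flat-gradient representation are assumptions for the sketch.

```python
import numpy as np

def pcgrad_update(grads, rng=None):
    """Sketch of the PCGrad update rule.

    grads: list of 1-D NumPy arrays, one flattened gradient per task.
    Returns the combined update direction after projecting away
    conflicting components.
    """
    rng = rng or np.random.default_rng(0)
    projected = []
    for i, g in enumerate(grads):
        g = g.astype(float).copy()
        # Visit the other tasks in random order, as in Algorithm 1.
        order = [j for j in range(len(grads)) if j != i]
        rng.shuffle(order)
        for j in order:
            dot = g @ grads[j]
            if dot < 0:  # gradients conflict (negative cosine similarity)
                # Project g onto the normal plane of the conflicting gradient.
                g -= dot / (grads[j] @ grads[j]) * grads[j]
        projected.append(g)
    return np.sum(projected, axis=0)
```

For two conflicting gradients such as `[1, 0]` and `[-1, 1]`, each is projected onto the other's normal plane before summing, which removes the component that would undo the other task's progress.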