GO4Align: Group Optimization for Multi-Task Alignment
Authors: Jiayi Shen, Qi Wang, Zehao Xiao, Nanne van Noord, Marcel Worring
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experimental results on diverse benchmarks demonstrate our method's performance superiority with even lower computational costs. |
| Researcher Affiliation | Academia | 1. University of Amsterdam, Amsterdam, the Netherlands; 2. Department of Automation, Tsinghua University, Beijing, China |
| Pseudocode | Yes | The pseudo-code of GO4Align is provided in Appendix A. |
| Open Source Code | Yes | We provide the code for our method to encourage follow-up work: https://github.com/autumn9999/GO4Align |
| Open Datasets | Yes | We conduct experiments on four benchmarks commonly used in multi-task optimization literature [6, 13, 14, 16]: NYUv2 [34], CityScapes [35], QM9 [36], and CelebA [37]. |
| Dataset Splits | Yes | This work follows the same experimental setting used in Nash-MTL [13] and FAMO [14], including the dataset partition for training, validation, and testing. Table 6 (benchmark partition, total / train / val / test): NYUv2 1,449 / 795 / N/A / 654; CityScapes 3,475 / 2,975 / N/A / 500; QM9 130k / 110k / 10k / 10k; CelebA 202,599 / 162,770 / 19,867 / 19,962 (also summarized as a configuration sketch after this table). |
| Hardware Specification | Yes | We conduct all experiments on a single NVIDIA A100 GPU. |
| Software Dependencies | No | The paper mentions using 'PyTorch Geometric' and 'Adam as the optimizer', but it does not specify version numbers for these or other software dependencies. |
| Experiment Setup | Yes | NYUv2/CityScapes: We train each method for 200 epochs with an initial learning rate of 1e-4 and reduce the learning rate to 5e-5 after 100 epochs. The architecture is the Multi-Task Attention Network (MTAN) [6] built upon SegNet [45]. Batch sizes for NYUv2 and CityScapes are set to 2 and 8, respectively. QM9: We train each method for 300 epochs with a batch size of 120 and search for the best learning rate in {1e-3, 5e-4, 1e-4}. We take ReduceOnPlateau [13] as the learning-rate scheduler to decrease the lr once the overall validation performance stops improving. CelebA: We train the model for 15 epochs with a batch size of 256. We adopt Adam as the optimizer with a fixed learning rate of 1e-3 (see the optimizer/scheduler sketch after this table). |
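
The benchmark partitions quoted in the Dataset Splits row can be collected into a small configuration dictionary. This is a minimal sketch: the split sizes come from the paper's Table 6, while the dictionary name and keys are illustrative and not taken from the released GO4Align code.

```python
# Benchmark partitions from Table 6 of the paper (total / train / val / test).
# The constant name and key names are illustrative, not from the official repo.
BENCHMARK_SPLITS = {
    "NYUv2":      {"total": 1_449,   "train": 795,     "val": None,   "test": 654},
    "CityScapes": {"total": 3_475,   "train": 2_975,   "val": None,   "test": 500},
    "QM9":        {"total": 130_000, "train": 110_000, "val": 10_000, "test": 10_000},
    "CelebA":     {"total": 202_599, "train": 162_770, "val": 19_867, "test": 19_962},
}

# Sanity check: the train/val/test splits add up to the reported totals.
for name, split in BENCHMARK_SPLITS.items():
    parts = sum(v for k, v in split.items() if k != "total" and v is not None)
    assert parts == split["total"], name
```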
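
The optimizer and learning-rate schedules quoted in the Experiment Setup row map onto standard PyTorch components. The sketch below uses a placeholder linear model in place of the MTAN/SegNet network; the exact scheduler arguments (e.g. the patience and decay factor of ReduceLROnPlateau) are not stated in the quoted text, so defaults are used here.

```python
import torch

# Placeholder standing in for the MTAN/SegNet multi-task network.
model = torch.nn.Linear(10, 1)

# NYUv2/CityScapes: Adam with an initial lr of 1e-4, reduced to 5e-5 after
# 100 of the 200 epochs (gamma=0.5 reproduces that single drop).
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[100], gamma=0.5)

# QM9: best lr searched in {1e-3, 5e-4, 1e-4}; lr decayed with PyTorch's
# ReduceLROnPlateau once the overall validation performance stops improving.
qm9_optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
qm9_scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(qm9_optimizer, mode="min")

# CelebA: Adam with a fixed lr of 1e-3 for 15 epochs (no scheduler mentioned).
celeba_optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```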