Auxiliary Learning as an Asymmetric Bargaining Game

Authors: Aviv Shamsian, Aviv Navon, Neta Glazer, Kenji Kawaguchi, Gal Chechik, Ethan Fetaya

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, we evaluate AuxiNash on multiple multi-task benchmarks and find that it consistently outperforms competing methods.
Researcher Affiliation | Collaboration | (1) Bar-Ilan University, Ramat Gan, Israel; (2) Aiola, Herzliya, Israel; (3) National University of Singapore; (4) Nvidia, Tel-Aviv, Israel.
Pseudocode | Yes | Algorithm 1: AuxiNash
Open Source Code | Yes | To encourage future research and reproducibility, we make our source code publicly available: https://github.com/AvivSham/auxinash
Open Datasets | Yes | We follow the setup from Liu et al. (2019b; 2022); Navon et al. (2022) and evaluate AuxiNash on the NYUv2 and Cityscapes datasets (Silberman et al., 2012; Cordts et al., 2016). We use the CIFAR-10 dataset to form 3 tasks. We evaluate AuxiNash on the Speech Commands (SC) dataset (Warden, 2018).
Dataset Splits | Yes | We use the CIFAR-10 dataset and divide it into train/val/test splits containing 45K/5K/10K samples respectively. We allocate 5K/10K samples as our labeled dataset for the supervised main task. For validation and test we use the original dataset splits, containing 3,643 and 4,107 samples respectively.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU models, CPU specifications, or cloud computing instances.
Software Dependencies | No | The paper mentions using optimizers like Adam and models like SegNet and WRN, but it does not specify the version numbers of any software libraries, frameworks (e.g., PyTorch, TensorFlow), or programming languages used.
Experiment Setup | Yes | Unless stated otherwise, for AuxiNash we update the preference vector p every 25 optimization steps using the SGD optimizer with momentum 0.9 and a learning rate of 5e-3. We use 1000 training samples and train the model using the Adam optimizer with a learning rate of 1e-2 and a batch size of 256 for 1000 epochs. We train the WRN for 50K iterations using the Adam optimizer with a learning rate of 5e-3 and a batch size of 256. We train the model for 200 epochs with the Adam optimizer and a learning rate of 1e-3. We use a learning rate of 1e-4 for the first 100 epochs, then reduce it to 5e-5 for the remaining epochs. We use a batch size of 2 and 8 for NYUv2 and CityScapes respectively.
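The hyperparameters quoted in the setup row above can be collected into a small configuration sketch. This is only an illustrative summary under our own assumptions: the benchmark keys (`cifar10`, `wrn`, `nyuv2`, `cityscapes`) and the helper `lr_at` are our labels, not names from the paper, and the mapping of each sentence to a benchmark is our reading of the excerpt.

```python
# Illustrative sketch: hyperparameters reported in the paper's experiment
# setup, gathered into plain dicts. Key names are our own assumptions.
PREFERENCE_UPDATE = {
    # AuxiNash preference vector p: updated every 25 optimization steps
    "every_n_steps": 25,
    "optimizer": "SGD",
    "momentum": 0.9,
    "lr": 5e-3,
}

TRAINING = {
    "cifar10": {"optimizer": "Adam", "lr": 1e-2, "batch_size": 256, "epochs": 1000},
    "wrn": {"optimizer": "Adam", "lr": 5e-3, "batch_size": 256, "iterations": 50_000},
    # NYUv2 / CityScapes: lr 1e-4 for the first 100 epochs, then 5e-5
    "nyuv2": {"lr": 1e-4, "lr_after_100_epochs": 5e-5, "batch_size": 2},
    "cityscapes": {"lr": 1e-4, "lr_after_100_epochs": 5e-5, "batch_size": 8},
}

def lr_at(epoch: int, cfg: dict) -> float:
    """Two-stage step schedule described for the dense-prediction benchmarks."""
    return cfg["lr"] if epoch < 100 else cfg["lr_after_100_epochs"]

print(lr_at(0, TRAINING["nyuv2"]), lr_at(150, TRAINING["nyuv2"]))  # 0.0001 5e-05
```

Such a table makes it easy to spot what the paper does and does not pin down: optimizers and learning rates are given, while software versions and hardware (the two "No" rows above) remain unspecified.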