First-Order Minimax Bilevel Optimization

Authors: Yifan Yang, Zhaofeng Si, Siwei Lyu, Kaiyi Ji

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our methods in two applications: the recently proposed multi-task deep AUC maximization and a novel rank-based robust meta-learning. Our methods consistently improve over existing methods with better performance over various datasets.
Researcher Affiliation | Academia | Yifan Yang, Zhaofeng Si, Siwei Lyu and Kaiyi Ji, Department of Computer Science and Engineering, University at Buffalo, Buffalo, NY 14260, {yyang99, zhaofeng, siweilyu, kaiyiji}@buffalo.edu
Pseudocode | Yes | Algorithm 1: Fully First-Order Single-Loop Method (FOSL) and Algorithm 2: Memory-Efficient Cold-Start (MemCS). A hedged sketch of a single-loop, fully first-order update pattern is given after the table.
Open Source Code | Yes | We provide the code as the supplementary material.
Open Datasets | Yes | CIFAR100 [29], CelebA [40], CheXpert [26], OGBG-MolPCBA [25], Mini-ImageNet [53] and Tiered-ImageNet [47].
Dataset Splits | Yes | The 100 classes are distributed among training, validation, and testing sets with a ratio of 64:16:20, respectively (a minimal split sketch is given after the table).
Hardware Specification | Yes | All experimental runs are performed using a single NVIDIA RTX 6000 GPU.
Software Dependencies | No | The paper mentions software components such as the Adam optimizer, the ResNet18 architecture, the DenseNet121 model, the Graph Isomorphism Network (GIN), and CNN4, but does not provide specific version numbers for any of them.
Experiment Setup | Yes | Regarding hyperparameters, we set the total number of training epochs to 2000 for CIFAR100, 100 for OGBG-MolPCBA, 40 for CelebA, and 6 for CheXpert. The learning rate for the optimal approximator v is uniformly set to η_v = 0.1 across all experiments, with η_w = η_u = η_v/λ to maintain gradient magnitude consistency between u and v. (A hedged configuration sketch is given after the table.)
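
The pseudocode row above refers to the paper's Algorithm 1 (FOSL). The snippet below is not that algorithm; it is a minimal sketch of a generic fully first-order, single-loop bilevel update pattern on a toy quadratic problem, reusing the w/u/v/λ variable naming and the step-size coupling η_w = η_u = η_v/λ quoted in the Experiment Setup row. The objectives, the penalty value `lam`, and the step sizes are illustrative assumptions, and the minimax (adversarial) component of the paper is omitted.

```python
import numpy as np

# Toy quadratic bilevel problem, used only to illustrate a fully first-order,
# single-loop update pattern (a sketch; NOT the paper's Algorithm 1 / FOSL).
# Upper level:  f(w, u) = 0.5 * ||u - w||^2
# Lower level:  g(w, u) = 0.5 * ||u - A @ w||^2   (minimizer u*(w) = A @ w)
A = np.array([[2.0, 0.0], [0.0, 0.5]])

def grad_f_w(w, u): return w - u               # df/dw
def grad_f_u(w, u): return u - w               # df/du
def grad_g_w(w, u): return A.T @ (A @ w - u)   # dg/dw
def grad_g_u(w, u): return u - A @ w           # dg/du

w, u, v = np.array([1.0, -1.0]), np.zeros(2), np.zeros(2)
lam, eta_v = 10.0, 0.1                # illustrative penalty and step size
eta_w = eta_u = eta_v / lam           # coupling quoted in the Experiment Setup row

for _ in range(2000):
    # v tracks the plain lower-level minimizer of g(w, .)
    v = v - eta_v * grad_g_u(w, v)
    # u tracks the minimizer of the penalized objective f(w, .) + lam * g(w, .)
    u = u - eta_u * (grad_f_u(w, u) + lam * grad_g_u(w, u))
    # w descends a first-order surrogate of the hypergradient (no second derivatives)
    w = w - eta_w * (grad_f_w(w, u) + lam * (grad_g_w(w, u) - grad_g_w(w, v)))

print("w ->", w)  # approaches the bilevel solution w = 0 for this toy problem
```

All three variables are updated once per iteration using first-order gradients only, which is what makes such a scheme single-loop and Hessian-free.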
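
The 64:16:20 class split quoted in the Dataset Splits row corresponds to partitioning 100 classes. The snippet below only illustrates the ratio; the seed and the random shuffling are assumptions, since the meta-learning benchmarks use fixed, canonical class splits rather than random ones.

```python
import random

# Illustrative 64/16/20 class split over 100 classes (ratio quoted above).
# The seed and shuffling are assumptions; the actual benchmarks use fixed splits.
classes = list(range(100))
random.Random(0).shuffle(classes)

train_classes = classes[:64]
val_classes   = classes[64:80]
test_classes  = classes[80:]

assert (len(train_classes), len(val_classes), len(test_classes)) == (64, 16, 20)
```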
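
The Experiment Setup row can be summarized as a small configuration. The epoch counts and the learning-rate coupling are quoted from the paper; the value of λ is not given in this excerpt, so the value below is a placeholder assumption.

```python
# Quoted per-dataset epoch budgets.
epochs = {
    "CIFAR100": 2000,
    "OGBG-MolPCBA": 100,
    "CelebA": 40,
    "CheXpert": 6,
}

lam = 10.0                     # placeholder; the paper's lambda is not quoted here
eta_v = 0.1                    # learning rate of the optimal approximator v (quoted)
eta_w = eta_u = eta_v / lam    # quoted coupling to keep gradient magnitudes consistent
```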