Memory-Efficient Gradient Unrolling for Large-Scale Bi-level Optimization

Authors: Qianli Shen, Yezhen Wang, Zhouhao Yang, Xiang Li, Haonan Wang, Yang Zhang, Jonathan Scarlett, Zhanxing Zhu, Kenji Kawaguchi

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide a thorough convergence analysis and a comprehensive practical discussion for (FG)²U, complemented by extensive empirical evaluations, showcasing its superior performance in diverse large-scale bi-level optimization tasks.
Researcher Affiliation | Academia | Qianli Shen¹, Yezhen Wang¹, Zhouhao Yang¹, Xiang Li¹, Haonan Wang¹, Yang Zhang¹, Jonathan Scarlett¹, Zhanxing Zhu², Kenji Kawaguchi¹; ¹National University of Singapore, ²University of Southampton, UK
Pseudocode | Yes | A Algorithm. Algorithm 1: (FG)²U: Forward Gradient Unrolling with Forward Gradient
Open Source Code | Yes | Code is available at https://github.com/ShenQianli/FG2U.
Open Datasets | Yes | We conduct our experiments to condense the following image datasets: MNIST [35]: a handwritten digits dataset containing 60,000 training images and 10,000 testing images with the size of 28×28 from 10 categories. CIFAR-10/100 [31]: colored natural images datasets containing 50,000 training images and 10,000 testing images from 10/100 categories, respectively.
Dataset Splits | Yes | MNIST [35]: a handwritten digits dataset containing 60,000 training images and 10,000 testing images with the size of 28×28 from 10 categories. CIFAR-10/100 [31]: colored natural images datasets containing 50,000 training images and 10,000 testing images from 10/100 categories, respectively. ... We conducted our experiments following the standard data condensation setting established by [68, 77, 67].
Hardware Specification | Yes | All experiments are conducted on NVIDIA-L40S (40G). ... All experiments are conducted on one NVIDIA A100 GPU (80G).
Software Dependencies | No | The paper mentions 'JAX [5] and PyTorch [3]' but does not provide specific version numbers for these or any other software libraries or frameworks used in the experiments.
Experiment Setup | Yes | The hyperparameters we used for (FG)²U are summarized in Appendix F.1. Table F.1: (FG)²U hyperparameters for data condensation experiments. ... The hyperparameters we used for (FG)²U are summarized in Appendix F.2, while all remaining hyperparameters were kept the same as in [25]. Table F.2: (FG)²U hyperparameters for CaMeLS experiments. ... The hyperparameters we used for this experiment are summarized in Appendix F.3. Table F.3: (FG)²U hyperparameters for discovery of PDEs experiments.
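
The Pseudocode entry above names forward gradient unrolling but does not reproduce the algorithm. As a rough illustration of the general idea only, and not the authors' Algorithm 1, the JAX sketch below estimates the gradient of an unrolled inner optimization using forward-mode directional derivatives along random directions. The objectives `inner_loss` and `outer_loss`, the step count, the learning rate, and the number of directions are all hypothetical choices made for this example.

```python
import jax
import jax.numpy as jnp


def inner_loss(theta, phi):
    # Hypothetical inner objective: pulls the inner parameters theta toward phi.
    return 0.5 * jnp.sum((theta - phi) ** 2)


def outer_loss(theta):
    # Hypothetical outer objective, evaluated at the unrolled inner parameters.
    return 0.5 * jnp.sum((theta - 1.0) ** 2)


def unrolled_objective(phi, steps=5, lr=0.5):
    # Unroll a few steps of inner gradient descent, then score the result.
    theta = jnp.zeros_like(phi)
    for _ in range(steps):
        theta = theta - lr * jax.grad(inner_loss)(theta, phi)
    return outer_loss(theta)


def forward_gradient(phi, key, num_directions=8):
    # Sample Gaussian directions v. Because E[v v^T] = I, the average of
    # (grad . v) * v is an unbiased estimate of the gradient with respect to phi.
    vs = jax.random.normal(key, (num_directions,) + phi.shape)

    def single_estimate(v):
        # jax.jvp returns the function value and the directional derivative
        # along v without storing the unrolled trajectory for a backward pass.
        _, directional = jax.jvp(unrolled_objective, (phi,), (v,))
        return directional * v

    return jnp.mean(jax.vmap(single_estimate)(vs), axis=0)


phi = jnp.array([0.0, 2.0, -1.0])
hypergrad_estimate = forward_gradient(phi, jax.random.PRNGKey(0), num_directions=64)
```

The estimate is noisy for a small number of directions and tightens as more are averaged. In this sketch, memory use does not grow with the number of unrolled steps, because no trajectory is stored for a backward pass through the unroll.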