Memory-Efficient Gradient Unrolling for Large-Scale Bi-level Optimization
Authors: Qianli Shen, Yezhen Wang, Zhouhao Yang, Xiang Li, Haonan Wang, Yang Zhang, Jonathan Scarlett, Zhanxing Zhu, Kenji Kawaguchi
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide a thorough convergence analysis and a comprehensive practical discussion for (FG)²U, complemented by extensive empirical evaluations, showcasing its superior performance in diverse large-scale bi-level optimization tasks. |
| Researcher Affiliation | Academia | Qianli Shen¹, Yezhen Wang¹, Zhouhao Yang¹, Xiang Li¹, Haonan Wang¹, Yang Zhang¹, Jonathan Scarlett¹, Zhanxing Zhu², Kenji Kawaguchi¹; ¹National University of Singapore, ²University of Southampton, UK |
| Pseudocode | Yes | Appendix A (Algorithm). Algorithm 1: (FG)²U: Forward Gradient Unrolling with Forward Gradient |
| Open Source Code | Yes | Code is available at https://github.com/ShenQianli/FG2U. |
| Open Datasets | Yes | We conduct our experiments to condense the following image datasets: MNIST [35]: a handwritten digits dataset containing 60,000 training images and 10,000 testing images with the size of 28×28 from 10 categories. CIFAR-10/100 [31]: colored natural image datasets containing 50,000 training images and 10,000 testing images from 10/100 categories, respectively. |
| Dataset Splits | Yes | MNIST [35]: a handwritten digits dataset containing 60,000 training images and 10,000 testing images with the size of 28×28 from 10 categories. CIFAR-10/100 [31]: colored natural image datasets containing 50,000 training images and 10,000 testing images from 10/100 categories, respectively. ... We conducted our experiments following the standard data condensation setting established by [68, 77, 67]. |
| Hardware Specification | Yes | All experiments are conducted on NVIDIA-L40S (40G). ... All experiments are conducted on one NVIDIA A100 GPU (80G). |
| Software Dependencies | No | The paper mentions 'JAX [5] and PyTorch [3]' but does not provide specific version numbers for these or any other software libraries or frameworks used in the experiments. |
| Experiment Setup | Yes | The hyperparameters we used for (FG)²U are summarized in Appendix F.1. Table F.1: (FG)²U hyperparameters for data condensation experiments. ... The hyperparameters we used for (FG)²U are summarized in Appendix F.2, while all remaining hyperparameters were kept the same as in [25]. Table F.2: (FG)²U hyperparameters for CaMeLS experiments. ... The hyperparameters we used for this experiment are summarized in Appendix F.3. Table F.3: (FG)²U hyperparameters for discovery of PDEs experiments. |