Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Bilevel Optimization for Adversarial Learning Problems: Sharpness, Generation, and Beyond
Authors: Risheng Liu, Zhu Liu, Weihao Mao, Wei Yao, Jin Zhang
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that our method improves generation quality of GANs, and consistently achieves higher accuracy for SAM under label noise and across various backbones, while promoting flatter loss landscapes. Overall, this work provides a practical and theoretically grounded framework for solving adversarial learning tasks through bilevel optimization. |
| Researcher Affiliation | Collaboration | Risheng Liu, Zhu Liu, Weihao Mao, Wei Yao, Jin Zhang. School of Software Technology, Dalian University of Technology; Mathematical Department, Southern University of Science and Technology; National Center for Applied Mathematics Shenzhen; Detection Institute for Advanced Technology Longhua-Shenzhen (DIATLHSZ). EMAIL, EMAIL, EMAIL, EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm for SAM. In particular, we consider the SAM problem given in (3). For a fixed pair $(\omega, \delta)$, the Moreau envelope reformulates the lower-level problem as the following smooth optimization problem: $\min_{\theta \in C} L_\ell(\omega, \theta) + \frac{1}{2\gamma} \lVert \theta - \delta \rVert^2$ (19), which is typically convex in $\theta$. In this case, any Karush-Kuhn-Tucker (KKT) point corresponds to a global minimizer. The global minimizer $\theta^*_\gamma(\omega, \delta)$ of (19) satisfies the optimality condition: $0 \in \nabla_\delta L_\ell(\omega, \theta^*) + \frac{1}{\gamma}(\theta^* - \delta) + N(\theta^*, C)$, |
| Open Source Code | Yes | The source codes will be released at https://github.com/LiuZhu-CV/BLOAL. |
| Open Datasets | Yes | We conduct comparison with Stacked MNIST, a challenging dataset with 1000 modes and two-dimensional simulation experiments based on Gaussian distribution, generating eight distribution of 2D wheels. ... We conduct image classification experiments using the standard open-source CIFAR-10 benchmark |
| Dataset Splits | Yes | We conduct image classification experiments using the standard open-source CIFAR-10 benchmark, which consists of 50,000 training and 10,000 testing image-label pairs. |
| Hardware Specification | Yes | We conducted the experiments on a PC with Intel i5-13600KF CPU (3.5 GHz), 32GB RAM and NVIDIA RTX 4090 GPU. |
| Software Dependencies | No | We leveraged the PyTorch framework on the 64-bit Linux system. |
| Experiment Setup | Yes | As for the first case, we set η, β, α, γ, µ, and p as 0.001, 0.01, 0.0001, 20, and 0.1 and leverage $\lVert \omega^{k+1} - \omega^{k} \rVert \le$ 1e-4 as the stop criterion. SGD optimizer is used for the update of ω. We set the maximum steps of optimization as 1000 uniformly. ... The hyperparameters η, β, α, γ, µ, and p are set to 0.005, 0.005, 0.01, 100, 5 and 0.1, respectively. ... The hyperparameters α, γ, µ, Q, and p are set to 0.05, 1 × 10⁻⁴, 0.75, 1 and 0.01, respectively. Following the setup in [2], we apply basic augmentation during training, including horizontal flipping, four-pixel padding, and cropping. Models are trained from scratch for 200 epochs using a batch size of 128 and a cosine learning rate schedule. |
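
The Pseudocode row quotes a proximal (Moreau-envelope) subproblem: minimize a lower-level loss plus a quadratic penalty `||theta - delta||^2 / (2 * gamma)` over a constraint set C. The following is a minimal sketch of how such a subproblem could be solved with projected gradient descent, not the authors' algorithm. Everything specific in it is an assumption made for illustration: the toy separable quadratic loss, the box constraint standing in for C, the hypothetical function name `prox_subproblem`, and the values of `gamma`, `step`, and `iters`.

```python
def prox_subproblem(grad_loss, delta, gamma, lower, upper, step, iters=500):
    """Projected gradient descent on L(theta) + ||theta - delta||^2 / (2 * gamma)
    over the box [lower, upper]^n (an assumed stand-in for the set C)."""
    clip = lambda x: min(max(x, lower), upper)
    theta = [clip(d) for d in delta]  # start from the projection of delta
    for _ in range(iters):
        g = grad_loss(theta)
        # Gradient of the penalized objective, followed by projection onto the box.
        theta = [
            clip(t - step * (gi + (t - d) / gamma))
            for t, gi, d in zip(theta, g, delta)
        ]
    return theta


# Toy convex loss (assumption): L(theta) = sum(0.5 * a_i * theta_i^2 - b_i * theta_i)
a = [2.0, 1.0]
b = [1.0, -3.0]


def grad_loss(theta):
    return [ai * ti - bi for ai, ti, bi in zip(a, theta, b)]


delta = [0.0, 0.5]
gamma = 20.0  # arbitrary illustrative value
theta_star = prox_subproblem(grad_loss, delta, gamma, lower=-1.0, upper=1.0, step=0.2)

# Residual of the smooth part of the optimality condition:
# grad L(theta*) + (theta* - delta) / gamma.
residual = [
    gi + (t - d) / gamma
    for gi, t, d in zip(grad_loss(theta_star), theta_star, delta)
]
print(theta_star, residual)
```

In this toy case the first coordinate ends up in the interior of the box, so its residual vanishes. The second coordinate is pinned at the lower bound, where the residual is positive and is absorbed by the normal-cone term `N(theta*, C)` of the optimality condition.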