Robust and Faster Zeroth-Order Minimax Optimization: Complexity and Applications

Authors: Weixin An, Yuanyuan Liu, Fanhua Shang, Hongying Liu

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimentally, ZO-GDEGA can generate more effective poisoning attack data with an average accuracy reduction of 5%. The improved AUC performance also verifies the robustness of gradient estimations.
Researcher Affiliation | Academia | (1) Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, School of Artificial Intelligence, Xidian University, China; (2) College of Intelligence and Computing, Tianjin University, China; (3) Medical School, Tianjin University, China; (4) Peng Cheng Lab, Shenzhen, China. Emails: weixinanut@163.com, yyliu@xidian.edu.cn, fhshang@tju.edu.cn, hyliu2009@tju.edu.cn
Pseudocode | Yes | Algorithm 1: Deterministic Zeroth-Order Gradient Descent Extragradient Ascent Algorithm (a hedged sketch of one such update appears after this table).
Open Source Code | Yes | Our codes are available: https://github.com/Weixin-An/ZO-GDEGA.
Open Datasets | Yes | The epsilon_test dataset: It contains 100,000 samples of 2,000 dimensions, and we also split it into 70% training samples and 30% testing samples.
Dataset Splits | No | The paper specifies a 70% training / 30% test split for both the synthetic and epsilon_test datasets, but does not mention a separate validation split or its size.
Hardware Specification | No | The paper reports 'CPU time (seconds)' in its experimental figures but does not specify the hardware used (e.g., CPU or GPU models, or memory).
Software Dependencies | No | The paper does not give version numbers for the software dependencies or libraries used in the experiments (e.g., Python, PyTorch, or TensorFlow).
Experiment Setup | Yes | We set r_x = 2, µ1 = µ2 = 2 × 10^-5, and poisoning ratio |D_tr,p|/|D_tr| = 0.1, and choose mini-batch sizes b1 = b2 = 100 and b1 = b2 = 10 for the synthetic and epsilon_test datasets, respectively. ... We choose a two-layer MLP as the classification model, and set mini-batch sizes b1 = b2 = 256, q1 = q2 = 10, r_x = r_y = 2, and step sizes η_x = η_y = 0.1 to train all the methods for 200 epochs. (A configuration sketch also follows the table.)
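
For reference, below is a minimal sketch of the kind of update a zeroth-order gradient descent extragradient ascent method performs: a two-point random-direction gradient estimate, a descent step in x, and an extragradient ascent step in y. This is an illustrative reading of the approach, not the paper's Algorithm 1; the estimator form, step sizes, smoothing radius, and toy objective are all assumptions.

```python
# Minimal sketch (assumed, not the paper's Algorithm 1) of a zeroth-order
# gradient descent / extragradient ascent step for min_x max_y f(x, y).
import numpy as np

def zo_grad(f, z, mu=1e-5, num_dirs=10, rng=None):
    """Two-point zeroth-order gradient estimate of f at z, averaged over random unit directions."""
    rng = np.random.default_rng() if rng is None else rng
    d = z.shape[0]
    g = np.zeros(d)
    for _ in range(num_dirs):
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)
        g += d * (f(z + mu * u) - f(z - mu * u)) / (2 * mu) * u
    return g / num_dirs

def zo_gdega_step(f, x, y, eta_x=0.1, eta_y=0.1, mu=1e-5):
    """One descent step in x and one extragradient ascent step in y, using only function evaluations."""
    # Gradient descent on x with a zeroth-order estimate of grad_x f(x, y).
    gx = zo_grad(lambda xv: f(xv, y), x, mu)
    x_new = x - eta_x * gx
    # Extragradient ascent on y: probe at an intermediate point, then update from y.
    gy = zo_grad(lambda yv: f(x_new, yv), y, mu)
    y_half = y + eta_y * gy
    gy_half = zo_grad(lambda yv: f(x_new, yv), y_half, mu)
    y_new = y + eta_y * gy_half
    return x_new, y_new

# Toy usage on a strongly-convex-strongly-concave saddle problem (saddle at the origin).
f = lambda x, y: 0.5 * x @ x + x @ y - 0.5 * y @ y
x, y = np.ones(5), np.ones(5)
for _ in range(200):
    x, y = zo_gdega_step(f, x, y)
print(np.linalg.norm(x), np.linalg.norm(y))  # norms typically shrink toward the saddle
```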
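
The quoted experiment setup can also be collected into a small configuration sketch. Only the numeric values come from the quoted text; the dictionary layout, the 70/30 splitting helper, and the reading of q1, q2 as numbers of random directions are illustrative assumptions.

```python
# Hedged summary of the reported setup; numeric values are from the quoted text,
# names and helper functions are assumptions for illustration.
import numpy as np

POISONING_ATTACK_CONFIG = {
    "r_x": 2,                    # reported constraint radius
    "mu1": 2e-5, "mu2": 2e-5,    # reported smoothing parameters
    "poisoning_ratio": 0.1,      # |D_tr,p| / |D_tr|
    "batch_sizes": {"synthetic": (100, 100), "epsilon_test": (10, 10)},  # (b1, b2)
}

MLP_TRAINING_CONFIG = {
    "model": "two-layer MLP",
    "b1": 256, "b2": 256,        # reported mini-batch sizes
    "q1": 10, "q2": 10,          # presumably numbers of random directions (assumption)
    "r_x": 2, "r_y": 2,
    "eta_x": 0.1, "eta_y": 0.1,  # reported step sizes
    "epochs": 200,
}

def split_70_30(X, y, seed=0):
    """70% train / 30% test split as reported; the paper describes no validation split."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    cut = int(0.7 * len(X))
    return X[idx[:cut]], y[idx[:cut]], X[idx[cut:]], y[idx[cut:]]
```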