Probabilistic Margins for Instance Reweighting in Adversarial Training

Authors: Qizhou Wang, Feng Liu, Bo Han, Tongliang Liu, Chen Gong, Gang Niu, Mingyuan Zhou, Masashi Sugiyama

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments demonstrated that PMs are reliable and PM-based reweighting methods outperformed state-of-the-art counterparts.
Researcher Affiliation | Academia | 1 Department of Computer Science, Hong Kong Baptist University; 2 DeSI Lab, Australian Artificial Intelligence Institute, University of Technology Sydney; 3 TML Lab, School of Computer Science, Faculty of Engineering, The University of Sydney; 4 PCA Lab, Key Lab of Intelligent Perception and Systems for High-Dimensional Information of MoE; 5 Jiangsu Key Lab of Image and Video Understanding for Social Security, School of Computer Science and Engineering, Nanjing University of Science and Technology; 6 RIKEN Center for Advanced Intelligence Project (AIP); 7 McCombs School of Business, The University of Texas at Austin; 8 Graduate School of Frontier Sciences, The University of Tokyo
Pseudocode | Yes | Algorithm 1 MAIL: The Overall Algorithm. (A hedged sketch of the PM-based reweighting step appears after the table.)
Open Source Code | Yes | The source code of our paper can be found in github.com/QizhouWang/MAIL.
Open Datasets | Yes | We conducted extensive experiments on various datasets, including SVHN [32], CIFAR-10 [24], and CIFAR-100 [24].
Dataset Splits | No | The paper uses well-known datasets (SVHN, CIFAR-10, CIFAR-100) that ship with standard splits, but it does not explicitly state split percentages, give sample counts for training, validation, or test sets, or cite the specific splits needed to reproduce the data partitioning. (The default splits are illustrated in the loading sketch after the table.)
Hardware Specification | No | The paper mentions the backbone models used (ResNet-18, WRN-32-10) but does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for the experiments.
Software Dependencies | No | The paper describes algorithms and functions (e.g., 'mini-batch gradient descent', 'sigmoid function', 'Kullback-Leibler (KL) divergence') but does not specify any software libraries, frameworks, or their version numbers (e.g., Python, PyTorch, TensorFlow, CUDA versions).
Experiment Setup | Yes | For the considered methods, networks were trained using mini-batch gradient descent with momentum 0.9, weight decay 3.5 × 10⁻³ (for ResNet-18) / 7 × 10⁻⁴ (for WRN-32-10), batch size 128, and an initial learning rate of 0.01 (for ResNet-18) / 0.1 (for WRN-32-10), which is divided by 10 at the 75th and 90th epochs. ... The perturbation bound ϵ is 8/255 and the (maximal) number of PGD steps k is 10 with step size α = 2/255. Hyperparameters: the slope and bias parameters were set to 10 and 0.5 in MAIL-AT and to 2 and 0 in both MAIL-TRADES and MAIL-MART; the trade-off parameter β was set to 5 in MAIL-TRADES and to 6 in MAIL-MART. (A hedged training-setup sketch appears after the table.)
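
To accompany the pseudocode row, here is a minimal sketch of computing probabilistic margins (PMs) and sigmoid-based instance reweighting, assuming the PM is the softmax probability of the true label minus the largest other-class probability on the adversarial example and that weights shrink as the margin grows. The function names, sign convention, and normalization below are our assumptions, not the authors' exact implementation; the released code at github.com/QizhouWang/MAIL is the definitive reference.

```python
import torch
import torch.nn.functional as F


def probabilistic_margin(adv_logits, targets):
    # Softmax probability of the true class minus the largest probability
    # among the remaining classes, evaluated on adversarial examples.
    probs = F.softmax(adv_logits, dim=1)
    true_prob = probs.gather(1, targets.unsqueeze(1)).squeeze(1)
    others = probs.clone()
    others.scatter_(1, targets.unsqueeze(1), float("-inf"))  # mask the true class
    runner_up = others.max(dim=1).values
    return true_prob - runner_up


def pm_weights(pm, slope=10.0, bias=0.5):
    # Hypothetical sigmoid reweighting: smaller (harder) margins get larger
    # weights; slope/bias default to the reported MAIL-AT values (10, 0.5).
    w = torch.sigmoid(-slope * (pm - bias))
    return w * (w.numel() / w.sum())  # normalize to mean weight 1 (assumption)
```

In a training step, such weights would multiply the per-instance adversarial losses before they are averaged over the mini-batch.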
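Since the paper relies on the standard splits of these benchmarks without restating them, a short torchvision loading sketch (the library choice is our assumption; the paper names no data-loading tooling) makes the default train/test partitions explicit.

```python
from torchvision import datasets, transforms

transform = transforms.ToTensor()

# Standard splits shipped with torchvision:
# CIFAR-10/100: 50,000 train / 10,000 test; SVHN: 73,257 train / 26,032 test.
cifar10_train = datasets.CIFAR10("data", train=True, download=True, transform=transform)
cifar10_test = datasets.CIFAR10("data", train=False, download=True, transform=transform)
cifar100_train = datasets.CIFAR100("data", train=True, download=True, transform=transform)
cifar100_test = datasets.CIFAR100("data", train=False, download=True, transform=transform)
svhn_train = datasets.SVHN("data", split="train", download=True, transform=transform)
svhn_test = datasets.SVHN("data", split="test", download=True, transform=transform)
```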
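The reported training and attack settings can also be summarized in a PyTorch-style sketch. The helper names (make_optimizer, pgd_attack) and details such as the random start and cross-entropy attack loss are illustrative assumptions, not the paper's exact code; only the numeric hyperparameters come from the quoted setup.

```python
import torch
import torch.nn.functional as F


def make_optimizer(model, wide=False):
    # SGD with momentum 0.9 (batch size 128 is set in the data loader).
    # ResNet-18: lr 0.01, weight decay 3.5e-3; WRN-32-10: lr 0.1, weight decay 7e-4.
    optimizer = torch.optim.SGD(
        model.parameters(),
        lr=0.1 if wide else 0.01,
        momentum=0.9,
        weight_decay=7e-4 if wide else 3.5e-3,
    )
    # Learning rate divided by 10 at the 75th and 90th epochs.
    scheduler = torch.optim.lr_scheduler.MultiStepLR(
        optimizer, milestones=[75, 90], gamma=0.1
    )
    return optimizer, scheduler


def pgd_attack(model, x, y, epsilon=8 / 255, step_size=2 / 255, num_steps=10):
    # L-infinity PGD with the reported budget: eps 8/255, 10 steps, step size 2/255.
    x_adv = (x + torch.empty_like(x).uniform_(-epsilon, epsilon)).clamp(0, 1)
    for _ in range(num_steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + step_size * grad.sign()
        # Project back into the epsilon-ball around x and the valid pixel range.
        x_adv = torch.min(torch.max(x_adv, x - epsilon), x + epsilon).clamp(0, 1)
    return x_adv.detach()
```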