Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Inequality phenomenon in $l_{\infty}$-adversarial training, and its unrealized threats

Authors: Ranjie Duan, YueFeng Chen, Yao Zhu, Xiaojun Jia, Rong Zhang, Hui Xue

ICLR 2023 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We systematically evaluate such inequality phenomena by extensive experiments and find such phenomena become more obvious when performing adversarial training with increasing adversarial strength (evaluated by ϵ). To validate our hypothesis, we proposed two simple attacks that either perturb important features with noise or occlusion. Experiments show that $l_{\infty}$-adversarially trained model can be easily attacked when a few important features are influenced.
Researcher Affiliation	Collaboration	Ranjie Duan1, Yuefeng Chen1, Yao Zhu1, Xiaojun Jia1,3, Rong Zhang1 & Hui Xue1. 1Alibaba Group, 2Zhejiang University, 3University of Chinese Academy of Sciences
Pseudocode	Yes	Algorithm 1 Inductive Occlusion Attack
Open Source Code	No	We plan to open the source code to reproduce the main experimental results later.
Open Datasets	Yes	We perform a series of experiments on ImageNet (Deng et al., 2009) and CIFAR10 (Krizhevsky et al., 2009).
Dataset Splits	No	The paper uses standard datasets like ImageNet and CIFAR10, for which standard splits are commonly implied, but it does not explicitly state the train/validation/test split percentages or sample counts within the text.
Hardware Specification	No	The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for the experiments.
Software Dependencies	No	The paper states that the feature attribution methods use the implementation by Captum, with footnote 2 pointing to https://github.com/pytorch/captum. It mentions "Captum" and implies "PyTorch" but does not specify their version numbers.
Experiment Setup	Yes	On evaluating regional inequality, we set the region's size as 16 × 16 for experiments on ImageNet and 4 × 4 for CIFAR10. We set noise with different scales, including subpixels of 500, 1000, 5000, 10000, and 20000. We set max count as 10 and radius as 20 for occlusions... In our evaluation, we set nsamples = 20 for experiments with ImageNet and nsamples = 10 for experiments with CIFAR10. We set baseline = 0 for all the experiments. For all the experiments, we set nstep = 20, and baseline ∼ N(0, 1). In our experiment, we set nsamples = 20.
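
Illustrative sketch (not from the paper): the Experiment Setup row quotes region sizes of 16 × 16 for ImageNet and 4 × 4 for CIFAR10 when evaluating regional inequality. The Python snippet below shows one way an attribution map could be grouped into non-overlapping regions of that size. The function name `pool_regions`, the dictionary `REGION_SIZE`, and the choice to sum absolute attribution values within each region are assumptions made for illustration; this page does not state how the paper aggregates within a region.

```python
from typing import List

# Region sizes quoted in the Experiment Setup row.
# The dictionary name and its keys are hypothetical labels for this sketch.
REGION_SIZE = {"imagenet": 16, "cifar10": 4}


def pool_regions(attribution: List[List[float]], region: int) -> List[List[float]]:
    """Sum absolute attribution over non-overlapping region x region blocks.

    Assumption: summing absolute values is one plausible aggregation; the
    source page does not say which aggregation the paper uses.
    """
    height, width = len(attribution), len(attribution[0])
    if height % region or width % region:
        raise ValueError("map dimensions must be divisible by the region size")
    pooled = []
    for top in range(0, height, region):
        row = []
        for left in range(0, width, region):
            total = sum(
                abs(attribution[y][x])
                for y in range(top, top + region)
                for x in range(left, left + region)
            )
            row.append(total)
        pooled.append(row)
    return pooled


# Toy input: a 32 x 32 map of ones (CIFAR10-sized), pooled with 4 x 4 regions.
toy_map = [[1.0] * 32 for _ in range(32)]
pooled = pool_regions(toy_map, REGION_SIZE["cifar10"])
print(len(pooled), len(pooled[0]), pooled[0][0])  # 8 8 16.0
```

With 4 × 4 regions, a 32 × 32 map reduces to an 8 × 8 grid, and each cell of the toy example holds the sum of 16 ones. A real attribution map would come from an attribution library such as Captum, which the Software Dependencies row mentions.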