Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
ARMOURED: Adversarially Robust MOdels using Unlabeled data by REgularizing Diversity
Authors: Kangkang Lu, Cuong Manh Nguyen, Xun Xu, Kiran Krishnamachari, Yu Jing Goh, Chuan-Sheng Foo
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 EXPERIMENTS |
| Researcher Affiliation | Academia | 1 Institute for Infocomm Research, A*STAR, Singapore 2 National University of Singapore, Singapore |
| Pseudocode | Yes | Pseudocode detailing the training procedure is provided in Algorithm 1 in Appendix A.1. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | We evaluate ARMOURED on the CIFAR-10 and SVHN datasets. We use the official train/test splits (50k/10k labeled samples) for CIFAR-10 (Krizhevsky et al., 2009) and reserve 5k samples from the training samples for a validation set. In our semi-supervised setup, the label budget is either 1k or 4k; remaining samples from training set are treated as unlabeled samples. For the SVHN dataset (Netzer et al., 2011), our train/validation/test split is 65,932 / 7,325 / 26,032 samples. |
| Dataset Splits | Yes | We use the official train/test splits (50k/10k labeled samples) for CIFAR-10 (Krizhevsky et al., 2009) and reserve 5k samples from the training samples for a validation set. In our semi-supervised setup, the label budget is either 1k or 4k; remaining samples from training set are treated as unlabeled samples. For the SVHN dataset (Netzer et al., 2011), our train/validation/test split is 65,932 / 7,325 / 26,032 samples. |
| Hardware Specification | No | The paper does not specify the exact hardware (e.g., GPU/CPU models, memory) used for running its experiments, only mentioning the use of a Wide ResNet backbone. |
| Software Dependencies | No | The paper mentions software components like 'batch normalization', 'leaky ReLU activation', and 'Adam optimizer', but it does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | We train each method for 600 epochs on CIFAR-10-semi-4k and SVHN-semi-1k. Learning rate is decayed by a factor of 0.2 after the first 400k iterations. For the AT wrapper, we apply a 7-step PGD ℓ∞ attack with total ϵ = 8/255 (for CIFAR-10), ϵ = 4/255 (for SVHN) and step size of ϵ/4. After tuning, we decide to apply (λDPP, λNEM) = (1, 1) for SVHN and (λDPP, λNEM) = (1, 0.5) for CIFAR-10. |
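
The attack settings quoted in the Experiment Setup row (7 steps, total ϵ, step size ϵ/4) can be illustrated with a minimal sketch of projected gradient descent. This is not the authors' code: it assumes the norm in that row is ℓ∞, assumes pixel values lie in [0, 1], omits any random start because the quoted text does not mention one, and uses hypothetical names (`pgd_linf`, `grad_fn`) with plain Python lists in place of image tensors.

```python
from typing import Callable, List, Optional

Vector = List[float]


def _sign(v: float) -> float:
    """Return -1.0, 0.0, or 1.0 according to the sign of v."""
    if v > 0:
        return 1.0
    if v < 0:
        return -1.0
    return 0.0


def pgd_linf(
    x: Vector,
    grad_fn: Callable[[Vector], Vector],  # hypothetical: loss gradient w.r.t. the input
    eps: float,
    steps: int = 7,
    step_size: Optional[float] = None,
) -> Vector:
    """Sketch of an l-infinity PGD attack on a flat list of pixel values."""
    # Step size of eps / 4, as quoted in the Experiment Setup row.
    alpha = eps / 4 if step_size is None else step_size
    x_adv = list(x)
    for _ in range(steps):
        grad = grad_fn(x_adv)
        # Ascend the loss along the sign of the gradient.
        stepped = [xa + alpha * _sign(g) for xa, g in zip(x_adv, grad)]
        # Project back into the eps-ball around x, then into the assumed [0, 1] range.
        x_adv = [
            min(max(s, xi - eps, 0.0), xi + eps, 1.0)
            for s, xi in zip(stepped, x)
        ]
    return x_adv


if __name__ == "__main__":
    # Toy gradient with a fixed sign per pixel, standing in for a real model's loss gradient.
    clean = [0.2, 0.5, 0.9]
    adversarial = pgd_linf(clean, lambda z: [1.0, -1.0, 1.0], eps=8 / 255)
    print(adversarial)
```

With 7 steps of size ϵ/4, the unprojected perturbation could reach 7ϵ/4, so the projection step is what keeps the result within the total budget of ϵ. In a real setup `grad_fn` would come from backpropagation through the classifier, and ϵ would be 8/255 for CIFAR-10 or 4/255 for SVHN as stated in the table.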