Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
On the Certified Robustness for Ensemble Models and Beyond
Authors: Zhuolin Yang, Linyi Li, Xiaojun Xu, Bhavya Kailkhura, Tao Xie, Bo Li
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on a wide range of datasets including MNIST, CIFAR-10, and ImageNet. The experimental results show that DRT can achieve significantly higher certified robustness compared to baselines with similar training cost as training a single model. |
| Researcher Affiliation | Academia | Zhuolin Yang¹, Linyi Li¹, Xiaojun Xu¹, Bhavya Kailkhura², Tao Xie³, Bo Li¹; ¹University of Illinois Urbana-Champaign, ²Lawrence Livermore National Laboratory, ³Peking University |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. It describes the methods in narrative text and uses diagrams. |
| Open Source Code | Yes | Finally, we upload the source code as the supplementary material for reproducibility purpose. |
| Open Datasets | Yes | We conduct extensive experiments on a wide range of datasets including MNIST (LeCun et al., 2010), CIFAR-10 (Krizhevsky, 2012), and ImageNet (Deng et al., 2009). |
| Dataset Splits | Yes | On ImageNet we evaluated every 100-th image in the validation set, for 500 images total. |
| Hardware Specification | Yes | The evaluation is on single NVIDIA GeForce GTX 1080 Ti GPU. |
| Software Dependencies | No | The paper mentions software components like "SGD-momentum", "cross-entropy loss", and "PGD attack" but does not provide specific version numbers for any software libraries or frameworks (e.g., PyTorch, TensorFlow, Python version). |
| Experiment Setup | Yes | For the training optimizer, we use the SGD-momentum with the initial learning rate α = 0.01. The learning rate is decayed for every 30 epochs with decay ratio γ = 0.1 and the batch size equals to 256. |
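
The quoted setup and evaluation split can be made concrete with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, epochs are assumed to be 0-indexed with the first decay taking effect at epoch 30, the subsample is assumed to start at index 0, and the ImageNet (ILSVRC 2012) validation set is taken to have 50,000 images. The momentum coefficient and total epoch count are not given in the quoted text, so they are left out.

```python
# Hypothetical helpers illustrating the quoted hyperparameters; not the paper's code.

def step_decay_lr(epoch, base_lr=0.01, step_size=30, gamma=0.1):
    """Learning rate at a 0-indexed epoch under step decay.

    base_lr, step_size, and gamma follow the quoted setup
    (alpha = 0.01, decay every 30 epochs, ratio 0.1).
    Assumption: the decay first applies at epoch 30.
    """
    return base_lr * gamma ** (epoch // step_size)


def every_kth_index(num_items, k=100, start=0):
    """Indices of every k-th item. Assumption: the subsample starts at `start` = 0."""
    return list(range(start, num_items, k))


BATCH_SIZE = 256            # from the quoted setup
IMAGENET_VAL_SIZE = 50_000  # ILSVRC 2012 validation set size

# Every 100th validation image gives the 500-image evaluation subset.
eval_indices = every_kth_index(IMAGENET_VAL_SIZE, k=100)

if __name__ == "__main__":
    for epoch in (0, 29, 30, 60):
        print(epoch, step_decay_lr(epoch))
    print(len(eval_indices))
```

The subsample size follows from 50,000 / 100 = 500, which matches the "500 images total" in the Dataset Splits row. If the training used PyTorch (the page notes that no framework is specified), the same schedule would correspond to a step scheduler with `step_size=30` and `gamma=0.1`.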