Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Certified Robustness for Deep Equilibrium Models via Interval Bound Propagation
Authors: Colin Wei, J. Zico Kolter
ICLR 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical comparison reveals that models with IBP-Mon DEQ layers can achieve comparable ℓ∞ certified robustness to similarly-sized fully explicit networks. Our experiments demonstrate that IBP-Mon DEQ layers are competitive with standard explicit layers for ℓ∞-certified robustness. |
| Researcher Affiliation | Collaboration | Colin Wei (Stanford University); J. Zico Kolter (CMU and Bosch Center for AI) |
| Pseudocode | No | The paper does not contain any clearly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Code is available here: https://github.com/cwein3/ibp-mondeq-code. |
| Open Datasets | Yes | Table 2 shows certified and standard classification errors of 3- and 7-layer models trained with IBP on the MNIST and CIFAR10 datasets for various values of ϵ. |
| Dataset Splits | Yes | For CIFAR10, we choose ϵtrain ϵtest, whereas for MNIST we use a larger value of ϵtrain, following Gowal et al. (2018) and Shi et al. (2021). The values are displayed in Table 3. |
| Hardware Specification | Yes | All models besides the DEQ-3 can be trained within a day on a single NVIDIA Titan Xp GPU. |
| Software Dependencies | No | The paper mentions 'Adam optimizer (Kingma & Ba, 2014)' but does not provide specific version numbers for software dependencies such as Python, PyTorch, or other libraries. |
| Experiment Setup | Yes | We train using IBP with the Adam optimizer (Kingma & Ba, 2014) with a learning rate of 5e-4, and report errors at the last epoch of training averaged over 3 runs. ... We use a batch size of 128 and anneal the learning rate, which is initially 5e-4, by a factor of 0.2 at certain steps. ... We use gradient clipping with a max ℓ2 norm of 10. |
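
The optimization settings quoted in the Experiment Setup row can be sketched in plain Python. This is only an illustration of a step-decay learning-rate schedule and global ℓ2-norm gradient clipping using the reported constants (5e-4, 0.2, 10); it is not the authors' code. The `MILESTONES` values and both function names are hypothetical, since the quoted text only says the rate is annealed "at certain steps".

```python
import math

BASE_LR = 5e-4   # initial learning rate quoted in the table
DECAY = 0.2      # annealing factor quoted in the table
MAX_NORM = 10.0  # max l2 gradient norm quoted in the table

# Hypothetical milestones: the source only says "at certain steps".
MILESTONES = (100, 150)


def learning_rate(step, milestones=MILESTONES):
    """Step decay: multiply BASE_LR by DECAY once per milestone reached."""
    passed = sum(1 for m in milestones if step >= m)
    return BASE_LR * DECAY ** passed


def clip_by_global_norm(grads, max_norm=MAX_NORM):
    """Rescale a flat list of gradient values so its l2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)  # already within the bound, return an unchanged copy
    scale = max_norm / norm
    return [g * scale for g in grads]


if __name__ == "__main__":
    for step in (0, 100, 150):
        print(step, learning_rate(step))
    print(clip_by_global_norm([30.0, 40.0]))
```

In a PyTorch training loop the same behavior would typically come from a multi-step scheduler and a gradient-norm clipping utility rather than hand-written helpers; the plain-Python version is used here only to keep the arithmetic visible.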