Improving Certified Robustness via Statistical Learning with Logical Reasoning

Authors: Zhuolin Yang, Zhikuan Zhao, Boxin Wang, Jiawei Zhang, Linyi Li, Hengzhi Pei, Bojan Karlaš, Ji Liu, Heng Guo, Ce Zhang, Bo Li

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, we conduct extensive experiments on five datasets including both high-dimensional images and natural language texts, and we show that the certified robustness with knowledge-based logical reasoning indeed significantly outperforms that of the state-of-the-arts." Also, from Section 5 (Experiments): "We conduct intensive experiments on five datasets to evaluate the certified robustness of the sensing-reasoning pipeline."
Researcher Affiliation | Collaboration | Zhuolin Yang (UIUC, zhuolin5@illinois.edu); Zhikuan Zhao (ETH Zürich, zhikuan.zhao@inf.ethz.ch); Boxin Wang (UIUC, boxinw2@illinois.edu); Jiawei Zhang (UIUC, jiaweiz7@illinois.edu); Linyi Li (UIUC, linyi2@illinois.edu); Hengzhi Pei (UIUC, hpei4@illinois.edu); Bojan Karlaš (ETH Zürich, karlasb@inf.ethz.ch); Ji Liu (Kwai Inc., ji.liu.uwisc@gmail.com); Heng Guo (University of Edinburgh, hguo@inf.ed.ac.uk); Ce Zhang (ETH Zürich, ce.zhang@inf.ethz.ch); Bo Li (UIUC, lbo@illinois.edu)
Pseudocode | Yes | "Algorithm 1: Algorithm for the MLN robustness upper bound (the algorithm for the lower bound is similar)" (an illustrative sketch of this kind of bound appears after the table)
Open Source Code | Yes | "The code is provided at https://github.com/Sensing-Reasoning/Sensing-Reasoning-Pipeline."
Open Datasets | Yes | "We conduct intensive experiments on five datasets... for road sign classification task, we follow [17] and use the same dataset GTSRB [44]... For the information extraction task, we use the High Tech dataset which consists of both daily closing asset price and financial news from 2006 to 2013 [12]." Also: "We only use public and commonly used data."
Dataset Splits | Yes | "It consists of 14880 training samples, 972 validation samples, and 3888 testing samples."
Hardware Specification | No | The provided paper text does not include specific hardware details such as GPU models, CPU types, or cloud instance specifications. It only states "The detailed information is mentioned in Appendix D.", and Appendix D is not part of the provided text.
Software Dependencies | No | The paper names software components such as BERT and the SGD-momentum and Adam optimizers, but gives no version numbers for these or any other dependencies, making it difficult to reproduce the exact software environment.
Experiment Setup | Yes | "We use the SGD-momentum with the initial learning rate as 0.01 and the weight decay parameter as 10^-4 to train all the sensors for 50000 iterations with 200 as the batch size, following [17]. To fine-tune the BERT classifiers for three information tasks, we use the Adam optimizer with the initial learning rate as 10^-5 and the weight decay parameter as 10^-4. We train all the sensors for 30 epochs with a batch size of 32." (See the optimizer configuration sketch after the table.)
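The paper's Algorithm 1 itself is not reproduced in the extracted text. For intuition only, here is a minimal Python sketch of the kind of quantity such a pipeline certifies: each sensor's confidence is known to lie in a certified interval, and we want the worst-/best-case marginal of a query variable in a small Markov logic network over those intervals. The names (`mln_marginal`, `marginal_bounds`) and the endpoint-enumeration strategy are assumptions for illustration; endpoint enumeration is sound only if the marginal is monotone in each sensor confidence, and this is not the paper's Algorithm 1.

```python
import itertools
import math

def mln_marginal(query_idx, sensor_probs, formulas, n_vars):
    """Exact marginal P(x_query = 1) of a tiny Markov logic network.

    sensor_probs: confidence p_i for each sensor variable i, folded in
                  as a unary log-potential log(p / (1 - p)).
    formulas:     list of (weight, clause) pairs; clause maps a world
                  (tuple of 0/1 values) to True/False.
    """
    z, z_query = 0.0, 0.0
    for world in itertools.product([0, 1], repeat=n_vars):
        logw = sum(world[i] * math.log(p / (1 - p))
                   for i, p in enumerate(sensor_probs))
        logw += sum(w for w, clause in formulas if clause(world))
        weight = math.exp(logw)
        z += weight
        if world[query_idx] == 1:
            z_query += weight
    return z_query / z

def marginal_bounds(query_idx, sensor_intervals, formulas, n_vars):
    """Extreme values of the query marginal when each sensor confidence
    ranges over a certified interval [lo, hi]. Brute force over interval
    endpoints; an illustrative simplification, not the paper's method."""
    lo, hi = 1.0, 0.0
    for corner in itertools.product(*sensor_intervals):
        m = mln_marginal(query_idx, corner, formulas, n_vars)
        lo, hi = min(lo, m), max(hi, m)
    return lo, hi

# Hypothetical example: two sensors, variable 0 is the query ("stop sign"),
# variable 1 is a supporting sensor ("red octagon"); one weighted rule
# encodes "red octagon implies stop sign".
formulas = [(2.0, lambda w: (not w[1]) or w[0])]
print(marginal_bounds(0, [(0.85, 0.95), (0.70, 0.90)], formulas, 2))
```

The point of the sketch is the shape of the certificate: the reasoning component maps per-sensor confidence intervals to an interval on the final output, which is what the MLN upper- and lower-bound algorithms compute efficiently in the paper.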
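The quoted hyperparameters map directly onto standard PyTorch optimizer configurations. The sketch below wires them up; the models are trivial stand-ins (the extracted text does not give the sensor architectures or the BERT classifier head), and the momentum coefficient of 0.9 is an assumption, since the excerpt does not state it.

```python
import torch

# Sensor training per the quoted setup: SGD with momentum, initial
# lr 0.01, weight decay 1e-4, batch size 200, 50,000 iterations.
# `sensor_model` is a hypothetical stand-in for a GTSRB sensor network.
sensor_model = torch.nn.Linear(3 * 32 * 32, 43)
sensor_opt = torch.optim.SGD(sensor_model.parameters(),
                             lr=0.01,
                             momentum=0.9,       # assumed; not stated in the excerpt
                             weight_decay=1e-4)

# BERT fine-tuning for the three information-extraction tasks:
# Adam, initial lr 1e-5, weight decay 1e-4, 30 epochs, batch size 32.
# `bert_head` is a hypothetical stand-in for the BERT classifier.
bert_head = torch.nn.Linear(768, 2)
bert_opt = torch.optim.Adam(bert_head.parameters(),
                            lr=1e-5,
                            weight_decay=1e-4)
```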