Achieving Domain-Independent Certified Robustness via Knowledge Continuity

Authors: Alan Sun, Chiyu Ma, Kenneth Ge, Soroush Vosoughi

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, to complement our theoretical results, we present several applications of knowledge continuity, such as regularization and a certification algorithm, and show that knowledge continuity can be used to localize vulnerable components of a neural network. Unless otherwise specified, we run all of our experiments on the IMDB dataset [48] (a sentiment classification task) using a host of language models from different model families (encoder, decoder, encoder-decoder). We also present additional experiments on vision tasks.
Researcher Affiliation | Academia | ¹Carnegie Mellon University, ²Dartmouth College
Pseudocode | Yes | Algorithm 1: a Monte-Carlo algorithm for estimating k-volatility of some metric-decomposable function f with n hidden layers (left), and augmenting any loss function to regularize k-volatility (right), given some Beta distribution parameterized by α, β and regularization strength λ ≥ 0. (A hedged Python sketch of both procedures is given after this table.)
Open Source Code | Yes | The codebase for our experiments, including implementations of the algorithms and figures described in the manuscript, can be found at https://github.com/alansun17904/kc.
Open Datasets | Yes | Unless otherwise specified, we run all of our experiments on the IMDB dataset [48] (a sentiment classification task). The IMDB dataset consists of 50,000 examples: 25,000 for training and 25,000 for testing.
Dataset Splits | Yes | We split the test set 40%-60% to create a validation and test set of 10,000 and 15,000 examples, respectively. (A split sketch in code follows this table.)
Hardware Specification | Yes | All of our experiments were conducted on four NVIDIA RTX A6000 GPUs as well as four NVIDIA Quadro RTX 6000 GPUs.
Software Dependencies | No | The paper does not provide specific version numbers for software libraries or dependencies, only general mentions of tools or methods.
Experiment Setup | Yes | We train all models using the hyperparameter and optimizer configurations shown in Table 4: Optimizer = Adam, Adam β1 = 0.9, Adam β2 = 0.999, Adam ε = 1×10⁻⁸, Max Gradient Norm = 1.0, Learning Rate Scheduler = Linear, Epochs = 20, Batch Size = 32, Learning Rate = 5×10⁻⁵, Weight Decay = 1×10⁻⁹. (An optimizer-configuration sketch follows this table.)
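
Below is a minimal Python/PyTorch sketch of both halves of Algorithm 1 referenced in the Pseudocode row: a Monte-Carlo estimate of k-volatility obtained by sampling input pairs and averaging the ratio of the loss change to the distance between layer-k representations, and a loss augmentation that penalizes that ratio at a layer drawn from a Beta(α, β) distribution. The model interface (a return_hidden argument, a num_hidden_layers attribute) and the Euclidean norm as the layer metric are illustrative assumptions, not the paper's exact implementation; see the linked repository for the authors' code.

    import random
    import torch

    def estimate_k_volatility(model, loss_fn, dataset, k, num_samples=1000, eps=1e-8):
        """Monte-Carlo sketch: average |ΔL| / ||Δh_k|| over randomly drawn input pairs."""
        ratios = []
        for _ in range(num_samples):
            x1, y1 = dataset[random.randrange(len(dataset))]
            x2, y2 = dataset[random.randrange(len(dataset))]
            with torch.no_grad():
                h1, out1 = model(x1.unsqueeze(0), return_hidden=k)  # hypothetical API
                h2, out2 = model(x2.unsqueeze(0), return_hidden=k)
                d_loss = (loss_fn(out1, y1.unsqueeze(0)) - loss_fn(out2, y2.unsqueeze(0))).abs()
                d_rep = torch.norm(h1 - h2)  # metric on the k-th hidden representation
            ratios.append((d_loss / (d_rep + eps)).item())
        return sum(ratios) / len(ratios)

    def volatility_regularized_loss(base_loss, model, loss_fn, x1, y1, x2, y2,
                                    alpha, beta, lam, eps=1e-8):
        """Sketch of the augmented loss: sample a layer index from Beta(alpha, beta),
        rescale it to the layer range, and penalize the volatility ratio at that layer."""
        n = model.num_hidden_layers  # hypothetical attribute
        k = int(torch.distributions.Beta(alpha, beta).sample().item() * (n - 1))
        h1, out1 = model(x1, return_hidden=k)
        h2, out2 = model(x2, return_hidden=k)
        ratio = (loss_fn(out1, y1) - loss_fn(out2, y2)).abs() / (torch.norm(h1 - h2) + eps)
        return base_loss + lam * ratio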
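The Dataset Splits row does not say how the 40%-60% split was produced; a minimal sketch using the HuggingFace datasets library (with an arbitrary seed chosen here) that yields the stated 25,000/10,000/15,000 train/validation/test sizes:

    from datasets import load_dataset

    # IMDB ships with 25,000 training and 25,000 test examples.
    imdb = load_dataset("imdb")
    train_set = imdb["train"]

    # Carve the official test split 40%/60% into validation and test sets
    # (roughly 10,000 and 15,000 examples). The seed is an arbitrary choice here.
    split = imdb["test"].train_test_split(test_size=0.6, seed=0)
    val_set = split["train"]   # the 40% share
    test_set = split["test"]   # the 60% share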
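The Table 4 values in the Experiment Setup row map onto a standard PyTorch/Transformers configuration roughly as follows; the placeholder model, the warmup step count, and the step-count arithmetic are illustrative assumptions not specified in the paper.

    import torch
    from transformers import get_linear_schedule_with_warmup

    EPOCHS, BATCH_SIZE, MAX_GRAD_NORM = 20, 32, 1.0

    # Placeholder network; the paper trains various language and vision models.
    model = torch.nn.Linear(768, 2)
    num_training_steps = EPOCHS * (25_000 // BATCH_SIZE)

    optimizer = torch.optim.Adam(
        model.parameters(),
        lr=5e-5,
        betas=(0.9, 0.999),
        eps=1e-8,
        weight_decay=1e-9,
    )
    scheduler = get_linear_schedule_with_warmup(
        optimizer,
        num_warmup_steps=0,                   # warmup is not specified in Table 4
        num_training_steps=num_training_steps,
    )

    # Inside the training loop, clip gradients before each optimizer step:
    # torch.nn.utils.clip_grad_norm_(model.parameters(), MAX_GRAD_NORM)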