Towards Verified Robustness under Text Deletion Interventions

Authors: Johannes Welbl, Po-Sen Huang, Robert Stanforth, Sven Gowal, Krishnamurthy (Dj) Dvijotham, Martin Szummer, Pushmeet Kohli

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In our experiments on the SNLI and MNLI datasets, we observe that IBP training leads to a significantly improved verified accuracy."
Researcher Affiliation | Collaboration | DeepMind, London, UK; University College London, UK
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code for the described methodology.
Open Datasets | Yes | Experiments are conducted on two large-scale NLI datasets: SNLI (Bowman et al., 2015) and MultiNLI (Williams et al., 2018), henceforth MNLI.
Dataset Splits | Yes | "For SNLI we use standard dataset splits, tuning hyperparameters on the development set and reporting results for the test set. For MNLI we split off 2000 samples from the development set for validation purposes and use the remaining samples as test set."
Hardware Specification | No | Table 2 mentions "1 GPU" but does not specify a particular GPU model, CPU, memory, or other hardware details.
Software Dependencies | No | The paper mentions the Adam optimiser but does not provide version numbers for any software dependencies.
Experiment Setup | Yes | "We tune the scale of the respective contribution in [0.01, 0.1, 1.0, 10.0, 100.0]. All experiments used a learning rate of 0.001, Adam optimiser, and batch size 128. We perform early stopping with respect to verified accuracy, for a maximum of 3M training steps."
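The reported configuration and MNLI split can be sketched as a few lines of plain Python. This is an illustrative reconstruction, not the authors' code: the function name `split_mnli_dev` and the shuffle seed are assumptions (the paper does not say how the 2000 validation samples were selected), and the hyperparameter constants simply restate the values quoted above.

```python
import random

# Hyperparameters reported in the paper's experiment setup.
LEARNING_RATE = 0.001
BATCH_SIZE = 128
MAX_TRAIN_STEPS = 3_000_000
# Candidate scales tuned for the loss contribution.
LOSS_SCALES = [0.01, 0.1, 1.0, 10.0, 100.0]

def split_mnli_dev(dev_samples, n_valid=2000, seed=0):
    """Split off n_valid samples from the MNLI development set for
    validation and use the remainder as the test set, as described
    in the paper. Shuffling with a fixed seed is an assumption made
    here for reproducibility of the sketch."""
    samples = list(dev_samples)
    rng = random.Random(seed)
    rng.shuffle(samples)
    return samples[:n_valid], samples[n_valid:]

# Example with a dummy dev set of 10,000 sample IDs.
valid, test = split_mnli_dev(range(10_000))
print(len(valid), len(test))  # 2000 8000
```

Early stopping on verified accuracy would then monitor the validation split produced above while training for at most `MAX_TRAIN_STEPS` steps.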