Logically Consistent Adversarial Attacks for Soft Theorem Provers

Authors: Alexander Gaskell, Yishu Miao, Francesca Toni, Lucia Specia

IJCAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our framework successfully generates adversarial attacks and identifies global weaknesses common across multiple target models. Our analyses reveal naive heuristics and vulnerabilities in these models' reasoning capabilities, exposing an incomplete grasp of logical deduction under logic programs. Finally, in addition to effective probing of these models, we show that training on the generated samples improves the target model's performance.
Researcher Affiliation | Collaboration | Alexander Gaskell (1,2), Yishu Miao (1,2), Francesca Toni (1), Lucia Specia (1); 1 Imperial College London, 2 ByteDance
Pseudocode | No | The paper describes algorithms and processes in text but does not include formal pseudocode blocks or sections labeled "Algorithm."
Open Source Code | Yes | Our implementation is available at https://github.com/alexgaskell10/LAVA.
Open Datasets | Yes | All results are for the RuleTakers test set (20,192 samples), using the validation set for early stopping.
Dataset Splits | Yes | All results are for the RuleTakers test set (20,192 samples), using the validation set for early stopping.
Hardware Specification | Yes | The attacker is trained for five epochs with a batch size of 8 on a single 11 GB NVIDIA GeForce RTX 2080 GPU.
Software Dependencies | No | The paper mentions using RoBERTa-large and ProbLog but does not specify their version numbers or any other software dependencies with specific versions (e.g., Python, PyTorch, CUDA).
Experiment Setup | Yes | The attacker is trained for five epochs with a batch size of 8 on a single 11 GB NVIDIA GeForce RTX 2080 GPU. The learning rate was set to 5e-6, with 1e-5 and 2.5e-6 also tested.
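To make the Experiment Setup row concrete, below is a minimal sketch of the reported training configuration. This is not the authors' released code (see the LAVA repository for that); it only illustrates the stated hyperparameters using Hugging Face Transformers, and the output directory, label count, and dataset wiring are assumptions.

```python
# Hypothetical sketch of the reported fine-tuning setup: RoBERTa-large trained for
# five epochs with batch size 8 and learning rate 5e-6, with the RuleTakers
# validation split used for early stopping. Not the authors' implementation.
from transformers import RobertaForSequenceClassification, TrainingArguments

# Binary true/false entailment label, as in the RuleTakers task (assumption).
model = RobertaForSequenceClassification.from_pretrained("roberta-large", num_labels=2)

training_args = TrainingArguments(
    output_dir="lava_attacker",      # hypothetical output directory
    num_train_epochs=5,              # reported: five epochs
    per_device_train_batch_size=8,   # reported: batch size 8 (single 11 GB RTX 2080)
    learning_rate=5e-6,              # reported: 5e-6 (1e-5 and 2.5e-6 also tested)
)

# A transformers.Trainer would then be constructed with these arguments, the model,
# and tokenized RuleTakers train/validation splits, monitoring the validation set
# for early stopping as described in the paper.
```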