Scaling Symbolic Methods using Gradients for Neural Model Explanation
Authors: Subham Sekhar Sahoo, Subhashini Venugopalan, Li Li, Rishabh Singh, Patrick Riley
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our technique on three datasets: MNIST, ImageNet, and Beer Reviews, and demonstrate both quantitatively and qualitatively that the regions generated by our approach are sparser and achieve higher saliency scores compared to the gradient-based methods alone. |
| Researcher Affiliation | Industry | Google Research {subhamsahoo,vsubhashini,leeley,rising,pfr}@google.com |
| Pseudocode | No | The paper describes the methodology using mathematical equations and descriptions but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and examples are at - https://github.com/google-research/google-research/tree/master/smug_saliency |
| Open Datasets | Yes | Datasets. We empirically evaluate SMUG on two image datasets, MNIST (LeCun et al., 2010), and ImageNet (Deng et al., 2009), as well as a text dataset of Beer Reviews from (McAuley et al., 2012). |
| Dataset Splits | Yes | Beer Reviews: To evaluate SMUG on a textual task we consider the review rating prediction task on the Beer Reviews dataset, consisting of 70k training examples, 3k validation and 7k test examples. ImageNet: We use 3304 images (224×224) with ground truth bounding boxes from the validation set of ImageNet. MNIST: For 100 images chosen randomly from the validation set, the SMT solver could solve the constraint shown in Eq. 6 (returns SAT) for only 41 of the images. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions 'z3 solver' as a tool used but does not provide specific version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | We use a feedforward model consisting of one hidden layer with 32 nodes (ReLU activation) and 10 output nodes with sigmoid, one each for 10 digits (0–9). ... we set k = 3000, γ = 0 for ImageNet, and k = 100, γ = 0 for text experiments... Further, each masking variable Mij is used to represent a 4×4 grid of pixels instead of a single pixel (to reduce running time). |