Understanding Black-box Predictions via Influence Functions
Authors: Pang Wei Koh, Percy Liang
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Here, we empirically show that influence functions are accurate approximations (Section 4.1) that provide useful information even when these assumptions are violated (Sections 4.2, 4.3). (See the influence-computation sketch after this table.) |
| Researcher Affiliation | Academia | Stanford University, Stanford, CA. Correspondence to: Pang Wei Koh <pangwei@cs.stanford.edu>, Percy Liang <pliang@cs.stanford.edu>. |
| Pseudocode | No | The paper describes computational procedures and techniques in prose (e.g., in Section 3 on Efficiently Calculating Influence), but it does not include any explicitly labeled pseudocode blocks or algorithm figures. |
| Open Source Code | Yes | The code and data for replicating our experiments is available on GitHub http://bit.ly/gt-influence and Codalab http://bit.ly/cl-influence. |
| Open Datasets | Yes | With a logistic regression model on 10-class MNIST (LeCun et al., 1998)... dog vs. fish image classification dataset we extracted from ImageNet (Russakovsky et al., 2015)... balanced training dataset of 20K diabetic patients from 100+ US hospitals... (Strack et al., 2014)... Enron1 spam dataset (Metsis et al., 2006) |
| Dataset Splits | Yes | We used 10% of the MNIST training set... with 900 training examples for each class... Enron1 spam dataset (Metsis et al., 2006), with 4,147 training and 1,035 test examples |
| Hardware Specification | No | The paper does not provide hardware details such as GPU or CPU models, processor speeds, or memory used for the experiments. It mentions training 'a convolutional neural network', which suggests GPU use, but no specific hardware is named. |
| Software Dependencies | No | The paper mentions using 'TensorFlow (Abadi et al., 2015)' and 'Theano (Theano Development Team, 2016)' but does not provide version numbers for these software dependencies, which are required for full reproducibility. |
| Experiment Setup | Yes | We trained with L-BFGS... with L2 regularization of 0.01... training a convolutional neural network for 500k iterations... Training was done with mini-batches of 500 examples and the Adam optimizer... We set α = 0.02 and ran the attack for 100 iterations (a minimal training sketch follows this table) |
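
To make the Experiment Setup row concrete, the following is a minimal sketch of the paper's convex baseline: L2-regularized logistic regression trained to convergence with L-BFGS. The function names, the {-1, +1} label convention, and the 0.5·λ‖θ‖² regularization form are illustrative assumptions, not the authors' exact code.

```python
import numpy as np
from scipy.optimize import minimize

def reg_logistic_loss(theta, X, y, l2=0.01):
    # Mean logistic loss + L2 regularization of 0.01, matching the quoted setup.
    # Labels y are assumed to be in {-1, +1}.
    margins = y * (X @ theta)
    return np.mean(np.logaddexp(0.0, -margins)) + 0.5 * l2 * theta @ theta

def reg_logistic_grad(theta, X, y, l2=0.01):
    margins = y * (X @ theta)
    s = -y / (1.0 + np.exp(margins))   # d/dz of log(1 + e^{-yz}) at z = x^T theta
    return X.T @ s / len(y) + l2 * theta

def fit(X, y):
    # L-BFGS drives the training gradient near zero, which the influence
    # derivation assumes (theta_hat is a stationary point of the empirical risk).
    d = X.shape[1]
    res = minimize(reg_logistic_loss, np.zeros(d), args=(X, y),
                   jac=reg_logistic_grad, method="L-BFGS-B")
    return res.x
```

Training to (near-)convergence matters here: influence functions are derived at an optimum of the empirical risk, and L-BFGS on this strongly convex objective delivers one.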
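
The Research Type row quotes the paper's central empirical claim about influence functions. For a fitted θ̂, the influence of upweighting a training point z on the loss at a test point z_test is I_up,loss(z, z_test) = -∇_θ L(z_test, θ̂)ᵀ H_θ̂⁻¹ ∇_θ L(z, θ̂). The sketch below, consistent with the training sketch above, forms this quantity with an explicit Hessian solve, which is only feasible for small convex models; the paper instead approximates the inverse-Hessian-vector product with conjugate gradients or stochastic estimation (LiSSA). Function names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_loss(theta, x, y, l2=0.01):
    # Gradient of one example's regularized logistic loss; y in {-1, +1}.
    return -y * sigmoid(-y * (x @ theta)) * x + l2 * theta

def hessian(theta, X, l2=0.01):
    # Exact Hessian of the mean regularized logistic loss over the training set.
    p = sigmoid(X @ theta)
    w = p * (1.0 - p)                  # per-example curvature sigma(z)(1 - sigma(z))
    n, d = X.shape
    return (X.T * w) @ X / n + l2 * np.eye(d)

def influence_up_loss(theta, X_train, x_z, y_z, x_test, y_test):
    # I_up,loss(z, z_test) = -grad L(z_test)^T H^{-1} grad L(z):
    # the predicted effect on the test loss of upweighting training point z.
    H = hessian(theta, X_train)
    s_test = np.linalg.solve(H, grad_loss(theta, x_test, y_test))
    return -s_test @ grad_loss(theta, x_z, y_z)
```

Removing z corresponds to upweighting it by ε = -1/n, so the predicted change in test loss from leave-one-out retraining is -(1/n)·I_up,loss(z, z_test); the paper's Section 4.1 validates this prediction against actual retraining.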