Multi-Stage Influence Function

Authors: Hongge Chen, Si Si, Yang Li, Ciprian Chelba, Sanjiv Kumar, Duane Boning, Cho-Jui Hsieh

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test our proposed method in various experiments to show its effectiveness and potential applications. In this section, we will conduct experiments on real datasets in both vision and NLP tasks to show the effectiveness of our proposed method. |
| Researcher Affiliation | Collaboration | MIT, Google Research, UCLA |
| Pseudocode | Yes | Algorithm 1: Multi-Stage Influence Score with Fixed Embedding (see the illustrative sketch below the table) |
| Open Source Code | Yes | Our code will be available in the Github Repository of Google Research. |
| Open Datasets | Yes | For this purpose, we build two CNN models based on CIFAR-10 and MNIST datasets. The pretrain task is training ELMo [20] model on the one-billion-word (OBW) dataset [3], which contains 30 million sentences and 8 million unique words. |
| Dataset Splits | No | The paper mentions training data and test data but does not explicitly describe a separate validation split, its percentages, or sample counts. |
| Hardware Specification | Yes | For example, on the CIFAR-10 dataset, the time for computing influence function with respect to all pretraining data is 230 seconds on a single Tesla V100 GPU. |
| Software Dependencies | No | The paper mentions deep learning models and common tasks, implying the use of standard machine learning frameworks (e.g., TensorFlow, PyTorch), but it does not specify any software components with version numbers. |
| Experiment Setup | Yes | The detailed hyperparameters used in these experiments are presented in Appendix B. |
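The table references the paper's Algorithm 1 (Multi-Stage Influence Score with Fixed Embedding) and the cost of computing influence functions over all pretraining data, but does not reproduce the algorithm itself. For orientation only, below is a minimal PyTorch-style sketch of the classic single-stage influence function (Koh & Liang) that the multi-stage score builds on, using a LiSSA-style stochastic estimate of the inverse-Hessian-vector product. The `loss_fn`, `params`, `batches`, and `train_point` names are illustrative assumptions, not the authors' released code.

```python
import torch

def flat_grad(loss, params, create_graph=False):
    """Gradient of `loss` w.r.t. `params`, flattened into a single vector."""
    grads = torch.autograd.grad(loss, params, create_graph=create_graph)
    return torch.cat([g.reshape(-1) for g in grads])

def inverse_hvp_lissa(loss_fn, params, v, batches, damping=0.01, scale=25.0):
    """LiSSA-style stochastic estimate of H^{-1} v, where H is the Hessian
    of the training loss at `params`; avoids forming H explicitly."""
    estimate = v.clone()
    for batch in batches:
        loss = loss_fn(batch)
        grad = flat_grad(loss, params, create_graph=True)
        hv = flat_grad(grad @ estimate, params)        # Hessian-vector product
        estimate = v + (1.0 - damping) * estimate - hv / scale
    return estimate / scale

def influence_score(test_loss, train_loss_fn, params, train_point, batches):
    """Classic single-stage influence of one training point on a test loss:
    I(z, z_test) = - grad L(z_test)^T H^{-1} grad L(z)."""
    v = flat_grad(test_loss, params)                   # gradient at the test example
    ihvp = inverse_hvp_lissa(train_loss_fn, params, v, batches)
    g_train = flat_grad(train_loss_fn(train_point), params)
    return -torch.dot(ihvp, g_train).item()
```

The stochastic inverse-Hessian-vector-product estimate is what keeps such computations tractable on a single GPU, consistent with the 230-second figure quoted in the Hardware Specification row; the paper's multi-stage variant additionally propagates influence from the pretraining stage through the fine-tuned model, which this single-stage sketch does not attempt.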