Multi-Stage Influence Function

Authors: Hongge Chen, Si Si, Yang Li, Ciprian Chelba, Sanjiv Kumar, Duane Boning, Cho-Jui Hsieh

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test our proposed method in various experiments to show its effectiveness and potential applications. In this section, we will conduct experiments on real datasets in both vision and NLP tasks to show the effectiveness of our proposed method. |
| Researcher Affiliation | Collaboration | MIT, Google Research, UCLA |
| Pseudocode | Yes | Algorithm 1: Multi-Stage Influence Score with Fixed Embedding (see the illustrative sketch below the table) |
| Open Source Code | Yes | Our code will be available in the Github Repository of Google Research. |
| Open Datasets | Yes | For this purpose, we build two CNN models based on CIFAR-10 and MNIST datasets. The pretrain task is training ELMo [20] model on the one-billion-word (OBW) dataset [3], which contains 30 million sentences and 8 million unique words. |
| Dataset Splits | No | The paper mentions training data and test data but does not explicitly describe a separate validation split, its percentages, or sample counts. |
| Hardware Specification | Yes | For example, on the CIFAR-10 dataset, the time for computing influence function with respect to all pretraining data is 230 seconds on a single Tesla V100 GPU. |
| Software Dependencies | No | The paper mentions deep learning models and common tasks, implying the use of standard machine learning frameworks (e.g., TensorFlow, PyTorch), but it does not specify any software components with version numbers. |
| Experiment Setup | Yes | The detailed hyperparameters used in these experiments are presented in Appendix B. |
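The table references the paper's Algorithm 1 (Multi-Stage Influence Score with Fixed Embedding) and the cost of computing influence functions over all pretraining data, but does not reproduce the algorithm itself. For orientation only, below is a minimal PyTorch-style sketch of the classic single-stage influence function (Koh & Liang) that the multi-stage score builds on, using a LiSSA-style stochastic estimate of the inverse-Hessian-vector product. The `loss_fn`, `params`, `batches`, and `train_point` names are illustrative assumptions, not the authors' released code.

```python
import torch

def flat_grad(loss, params, create_graph=False):
    """Gradient of `loss` w.r.t. `params`, flattened into a single vector."""
    grads = torch.autograd.grad(loss, params, create_graph=create_graph)
    return torch.cat([g.reshape(-1) for g in grads])

def inverse_hvp_lissa(loss_fn, params, v, batches, damping=0.01, scale=25.0):
    """LiSSA-style stochastic estimate of H^{-1} v, where H is the Hessian
    of the training loss at `params`; avoids forming H explicitly."""
    estimate = v.clone()
    for batch in batches:
        loss = loss_fn(batch)
        grad = flat_grad(loss, params, create_graph=True)
        hv = flat_grad(grad @ estimate, params)        # Hessian-vector product
        estimate = v + (1.0 - damping) * estimate - hv / scale
    return estimate / scale

def influence_score(test_loss, train_loss_fn, params, train_point, batches):
    """Classic single-stage influence of one training point on a test loss:
    I(z, z_test) = - grad L(z_test)^T H^{-1} grad L(z)."""
    v = flat_grad(test_loss, params)                   # gradient at the test example
    ihvp = inverse_hvp_lissa(train_loss_fn, params, v, batches)
    g_train = flat_grad(train_loss_fn(train_point), params)
    return -torch.dot(ihvp, g_train).item()
```

The stochastic inverse-Hessian-vector-product estimate is what keeps such computations tractable on a single GPU, consistent with the 230-second figure quoted in the Hardware Specification row; the paper's multi-stage variant additionally propagates influence from the pretraining stage through the fine-tuned model, which this single-stage sketch does not attempt.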