Rethinking Influence Functions of Neural Networks in the Over-Parameterized Regime

Authors: Rui Zhang, Shihua Zhang

AAAI 2022, pp. 9082-9090

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Numerical experiments on real-world data confirm our theoretical results and demonstrate our findings.
Researcher Affiliation | Academia | 1 NCMIS, CEMS, RCSDS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China; 2 School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China; {rayzhang, zsh}@amss.ac.cn
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks.
Open Source Code | No | The paper does not state that source code for their methodology is released or provide a link to it.
Open Datasets | Yes | In particular, we evaluate our method on MNIST (LeCun et al. 1998) and CIFAR-10 (Krizhevsky and Hinton 2009)
Dataset Splits | Yes | In particular, we evaluate our method on MNIST (LeCun et al. 1998) and CIFAR-10 (Krizhevsky and Hinton 2009) for two-layer ReLU neural networks with the width from 10^4 to 8 × 10^4, respectively.
Hardware Specification | No | The paper does not specify the hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions "NEURAL TANGENTS" as a tool used by others, but it does not provide specific version numbers for any software dependencies used in their experiments.
Experiment Setup | Yes | We train the neural networks through gradient descent on the regularized mean square error loss function as follows: ... $+ \frac{\lambda}{2}\|W - W(0)\|_F^2$ ... We initialize the parameters randomly as follows: $w_r(0) \sim \mathcal{N}(0, \kappa^2 I_d)$, $a_r(0) \sim \mathrm{unif}(\{-1, 1\})$, $r \in [m]$, where $0 < \kappa \le 1$ controls the magnitude of initialization, and all randomnesses are independent. For simplicity, we fix the second layer $a$ and only update the first layer $W$ during training.
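
To make the Experiment Setup row concrete, below is a minimal, hypothetical PyTorch sketch of the quoted training scheme: a two-layer ReLU network whose second-layer weights a are fixed at their random initialization, with the first layer W trained by full-batch gradient descent on a mean squared error loss regularized by (lambda/2) ||W - W(0)||_F^2. The width m, learning rate, step count, 1/sqrt(m) output scaling, loss normalization, and the synthetic placeholder data standing in for MNIST/CIFAR-10 are illustrative assumptions, not values taken from the paper.

```python
# Minimal sketch (assumed details, not the authors' released code) of the setup
# quoted above: two-layer ReLU network, second layer a fixed after random
# initialization, first layer W trained by full-batch gradient descent on an
# MSE loss regularized toward the initial weights W(0).
import torch

torch.manual_seed(0)

# Illustrative sizes and hyperparameters (assumptions, not the paper's values).
n, d, m = 1000, 784, 10_000          # samples, input dimension, network width
kappa, lam, lr, steps = 0.1, 1e-3, 1.0, 200

# Placeholder data standing in for flattened MNIST / CIFAR-10 inputs.
X = torch.randn(n, d)
y = torch.randn(n)

# Initialization: w_r(0) ~ N(0, kappa^2 I_d), a_r(0) ~ unif({-1, +1}).
W0 = kappa * torch.randn(m, d)
a = (2 * torch.randint(0, 2, (m,)) - 1).float()
W = W0.clone().requires_grad_(True)   # only W is updated; a stays fixed


def f(X, W):
    # f(x) = (1/sqrt(m)) * sum_r a_r * relu(w_r . x); the 1/sqrt(m) scaling is assumed.
    return torch.relu(X @ W.t()) @ a / m ** 0.5


for _ in range(steps):
    mse = 0.5 * torch.sum((f(X, W) - y) ** 2) / n    # assumed MSE normalization
    reg = 0.5 * lam * torch.sum((W - W0) ** 2)       # (lambda/2) * ||W - W(0)||_F^2
    loss = mse + reg
    loss.backward()
    with torch.no_grad():
        W -= lr * W.grad                             # gradient descent on the first layer only
        W.grad.zero_()
```

An actual reproduction would replace the placeholder data with MNIST or CIFAR-10 and sweep the width m over the range 10^4 to 8 × 10^4 reported in the Dataset Splits row.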