On the Variance of the Fisher Information for Deep Learning
Authors: Alexander Soen, Ke Sun
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We investigate two such estimators based on two equivalent representations of the FIM, both unbiased and consistent. Their estimation quality is naturally gauged by their variance, given in closed form. We analyze how the parametric structure of a deep neural network can affect the variance. The meaning of this variance measure and its upper bounds are then discussed in the context of deep learning. Our central results, Theorems 4 and 6, present the variance of Î₁(θ) and Î₂(θ) in closed form, which is further extended to upper bounds in simpler forms. (A sketch of the two estimators follows the table.) |
| Researcher Affiliation | Academia | Alexander Soen, The Australian National University, alexander.soen@anu.edu.au; Ke Sun, CSIRO's Data61, Australia, and The Australian National University, sunk@ieee.org |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. It primarily presents mathematical derivations and theoretical analyses. |
| Open Source Code | No | The paper does not provide any statement or link regarding the availability of open-source code for the methodology described. |
| Open Datasets | No | The paper focuses on theoretical analysis of statistical models and distributions (e.g., Bernoulli, Normal, Poisson) rather than conducting empirical experiments on specific, publicly available datasets. Therefore, it does not provide concrete access information for any dataset used in training or evaluation. |
| Dataset Splits | No | The paper is theoretical and does not conduct empirical experiments with datasets that would require explicit training, validation, or test splits. |
| Hardware Specification | No | The paper mentions 'modern GPUs' in a general context regarding auto-differentiation frameworks, but it does not specify any particular hardware (e.g., GPU models, CPU types, memory) used to conduct experiments within the scope of this paper. |
| Software Dependencies | No | The paper mentions 'AD frameworks such as PyTorch [26]' but does not specify version numbers for PyTorch or any other software dependencies, which would be necessary for reproducibility. |
| Experiment Setup | No | The paper is theoretical and focuses on mathematical analysis, not on empirical experimentation. Therefore, it does not provide details about an experimental setup, such as hyperparameters or system-level training settings. |
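
For context on the Research Type row: the FIM admits the standard equivalent representations I(θ) = E[s sᵀ] = −E[H], where s = ∇log p(y|x, θ) is the score and H is its Hessian, giving the two unbiased Monte Carlo estimators Î₁(θ) = (1/n) Σᵢ sᵢ sᵢᵀ and Î₂(θ) = −(1/n) Σᵢ Hᵢ. Below is a minimal sketch of both in PyTorch (the AD framework the paper cites), assuming these standard representations correspond to the paper's Î₁ and Î₂; the toy logistic-regression model, sizes, and variable names are illustrative assumptions, not the paper's setup (the paper provides no code).

```python
# Minimal sketch: two unbiased Monte Carlo estimators of the Fisher information
# matrix, I1_hat (score outer products) and I2_hat (negative Hessians).
# The toy logistic-regression model is an illustrative assumption, not the paper's.
import torch

torch.manual_seed(0)
d, n = 3, 512  # parameter dimension, number of Monte Carlo samples

theta = torch.zeros(d, requires_grad=True)
x = torch.randn(n, d)
# y must be sampled from the model p(y | x, theta) itself, not taken from data
# labels, for these to estimate the FIM (rather than the empirical FIM).
with torch.no_grad():
    y = torch.bernoulli(torch.sigmoid(x @ theta))

def log_lik(t, xi, yi):
    # Bernoulli log-likelihood of one sample under the logit xi @ t.
    return -torch.nn.functional.binary_cross_entropy_with_logits(xi @ t, yi)

# I1_hat: (1/n) * sum_i s_i s_i^T, with score s_i = grad_t log p(y_i | x_i, t).
I1 = torch.zeros(d, d)
for i in range(n):
    (s,) = torch.autograd.grad(log_lik(theta, x[i], y[i]), theta)
    I1 += torch.outer(s, s)
I1 /= n

# I2_hat: -(1/n) * sum_i H_i, with H_i the per-sample log-likelihood Hessian.
I2 = torch.zeros(d, d)
for i in range(n):
    H = torch.autograd.functional.hessian(lambda t: log_lik(t, x[i], y[i]), theta)
    I2 -= H
I2 /= n

print(I1)  # both converge to the same FIM as n grows;
print(I2)  # their variances differ, which is what the paper quantifies.
```

In this Bernoulli example the per-sample Hessian happens not to depend on the sampled y, so Î₂ has zero variance over y given x here, a concrete instance of the estimator-quality gap that the paper's Theorems 4 and 6 quantify in general.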