Limitations of the empirical Fisher approximation for natural gradient descent

Authors: Frederik Kunstner, Philipp Hennig, Lukas Balles

NeurIPS 2019

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We illustrate that using the empirical Fisher can lead to highly undesirable effects; Fig. 1 shows a first example. |
| Researcher Affiliation | Academia | École Polytechnique Fédérale de Lausanne (EPFL), Switzerland; University of Tübingen, Germany; Max Planck Institute for Intelligent Systems, Tübingen, Germany |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide a concrete statement about releasing source code or a link to a code repository for the methodology described. |
| Open Datasets | Yes | For the linear regression experiments (Fig. 1 and App. E) we use a 90%/10% train/test split. The logistic regression plots (Fig. 2) use an 80%/20% train/test split. For the experiments in Fig. 3, we use an 80%/20% train/test split. |
| Dataset Splits | Yes | For the linear regression experiments (Fig. 1 and App. E) we use a 90%/10% train/test split. The logistic regression plots (Fig. 2) use an 80%/20% train/test split. For the experiments in Fig. 3, we use an 80%/20% train/test split. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models or memory) used to run its experiments. |
| Software Dependencies | No | The paper mentions SciPy and scikit-learn but does not provide version numbers for these dependencies, which are necessary for full reproducibility. |
| Experiment Setup | Yes | All experiments use a step size of 0.1 for GD, NGD, and EFGD. For EFGD and NGD, a damping parameter of 1.0 is used. |
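
The "Research Type" row quotes the paper's central claim: preconditioning with the empirical Fisher can behave very differently from true natural gradient descent. A minimal sketch of that contrast on least-squares linear regression, where both matrices have closed forms (the data, seed, and iteration count here are illustrative, not the paper's; the step size and damping follow the "Experiment Setup" row):

```python
# Sketch only -- not the authors' code. Compares damped natural gradient
# descent using the true Fisher vs. the empirical Fisher on a toy
# linear regression problem.
import numpy as np

rng = np.random.default_rng(0)          # illustrative seed
N, D = 100, 2
X = rng.normal(size=(N, D))
y = X @ np.array([1.0, -2.0]) + 0.5 * rng.normal(size=N)

eta, lam = 0.1, 1.0                     # step size and damping from the table

def precond_step(w, use_empirical_fisher):
    r = X @ w - y                       # residuals
    grad = X.T @ r / N                  # gradient of 0.5 * mean squared error
    if use_empirical_fisher:
        G = X * r[:, None]              # per-example gradients g_n = r_n * x_n
        F = G.T @ G / N                 # empirical Fisher: (1/N) sum g_n g_n^T
    else:
        F = X.T @ X / N                 # true Fisher (unit noise variance)
    return w - eta * np.linalg.solve(F + lam * np.eye(D), grad)

w_ngd, w_efgd = np.zeros(D), np.zeros(D)
for _ in range(200):
    w_ngd = precond_step(w_ngd, use_empirical_fisher=False)
    w_efgd = precond_step(w_efgd, use_empirical_fisher=True)

print("NGD: ", w_ngd)
print("EFGD:", w_efgd)
```

On this well-specified toy problem both variants converge, but the empirical Fisher weights each x_n x_nᵀ term by the squared residual r_n², so far from the optimum, or under model misspecification, it tracks gradient magnitudes rather than curvature; that is the distortion the paper documents.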
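The percentages in the "Open Datasets" and "Dataset Splits" rows map directly onto scikit-learn, which the paper mentions. A minimal sketch, assuming `train_test_split` and placeholder data (the paper does not publish its splitting code or random seeds, so the seed here is hypothetical):

```python
# Sketch of the quoted train/test splits using scikit-learn.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)          # placeholder data, not the paper's
X, y = rng.normal(size=(100, 5)), rng.integers(0, 2, size=100)

# 90%/10% split for the linear regression experiments (Fig. 1 and App. E)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.10, random_state=0)

# 80%/20% split for the logistic regression experiments (Figs. 2 and 3)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=0)
```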
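For the "Experiment Setup" row, the quoted hyperparameters fit the standard damped preconditioned update (an assumed form; the paper's exact implementation is not reproduced here):

$$
\theta_{t+1} = \theta_t - \eta \,\bigl(\mathbf{F}(\theta_t) + \lambda I\bigr)^{-1} \nabla \mathcal{L}(\theta_t),
\qquad \eta = 0.1,\quad \lambda = 1.0,
$$

where $\mathbf{F}$ is the Fisher for NGD and the empirical Fisher $\widetilde{\mathbf{F}}(\theta) = \frac{1}{N}\sum_n \nabla \ell_n(\theta)\,\nabla \ell_n(\theta)^{\top}$ for EFGD; GD drops the preconditioner entirely, giving $\theta_{t+1} = \theta_t - \eta\,\nabla\mathcal{L}(\theta_t)$.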