Limitations of the empirical Fisher approximation for natural gradient descent
Authors: Frederik Kunstner, Philipp Hennig, Lukas Balles
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate that using the empirical Fisher can lead to highly undesirable effects; Fig. 1 shows a first example. |
| Researcher Affiliation | Academia | École Polytechnique Fédérale de Lausanne (EPFL), Switzerland; University of Tübingen, Germany; Max Planck Institute for Intelligent Systems, Tübingen, Germany |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide a concrete statement about releasing source code or a link to a code repository for the methodology described. |
| Open Datasets | Yes | For the linear regression experiments (Fig. 1 and App. E) we use a 90%/10% train/test split. The logistic regression plots (Fig. 2) use 80%/20% train/test split. For the experiments in Fig. 3, we use an 80%/20% train/test split. |
| Dataset Splits | Yes | For the linear regression experiments (Fig. 1 and App. E) we use a 90%/10% train/test split. The logistic regression plots (Fig. 2) use 80%/20% train/test split. For the experiments in Fig. 3, we use an 80%/20% train/test split. (See the split sketch after the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'SciPy' and 'scikit-learn' but does not provide specific version numbers for these software dependencies, which are necessary for full reproducibility. |
| Experiment Setup | Yes | All experiments use a step size of 0.1 for GD, NGD, and EFGD. For EFGD and NGD, a damping parameter of 1.0 is used. (See the optimizer sketch after the table.) |
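The split ratios quoted in the Dataset Splits row are straightforward to reproduce with scikit-learn's `train_test_split`. A minimal sketch follows; the dataset and the random seed are stand-ins chosen for illustration, not values taken from the paper:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Stand-in dataset; the specific datasets are not named in this table.
X, y = load_breast_cancer(return_X_y=True)

# 90%/10% split, as reported for the linear regression experiments (Fig. 1, App. E).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.10, random_state=0)

# 80%/20% split, as reported for the logistic regression plots (Fig. 2) and Fig. 3.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=0)
```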
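The Experiment Setup row fixes a step size of 0.1 for all three methods and a damping of 1.0 for NGD and EFGD. The sketch below shows one way those updates look on a least-squares problem, using the textbook definitions of the Fisher and the empirical Fisher for a Gaussian likelihood; it illustrates the reported hyperparameters, not the authors' implementation:

```python
import numpy as np

def optimize(X, y, w0, steps=100, lr=0.1, damping=1.0, method="gd"):
    """GD / NGD / EFGD on the least-squares loss (1/2N) * ||Xw - y||^2.

    lr=0.1 and damping=1.0 match the values quoted in the table above.
    """
    N, D = X.shape
    w = w0.copy()
    eye = np.eye(D)
    for _ in range(steps):
        r = X @ w - y              # residuals
        g = X.T @ r / N            # mean gradient
        if method == "gd":
            direction = g
        elif method == "ngd":
            fisher = X.T @ X / N   # Fisher for a unit-variance Gaussian likelihood
            direction = np.linalg.solve(fisher + damping * eye, g)
        elif method == "efgd":
            G = r[:, None] * X         # per-example gradients, one per row
            emp_fisher = G.T @ G / N   # empirical Fisher: mean of g_n g_n^T
            direction = np.linalg.solve(emp_fisher + damping * eye, g)
        else:
            raise ValueError(method)
        w = w - lr * direction
    return w
```

This also exposes the contrast the paper studies: in this model the Fisher depends only on the inputs X, while the empirical Fisher weights each x_n x_n^T by its squared residual, so the two preconditioners differ precisely where the residuals carry information.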