The Expected Loss of Preconditioned Langevin Dynamics Reveals the Hessian Rank
Authors: Amitay Bar, Rotem Mulayoff, Tomer Michaeli, Ronen Talmon
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically demonstrate our theory on linear and nonlinear NNs. We show that the expected loss of the networks incorporated into preconditioned LD with specific preconditioning results in an accurate estimation of the Hessian rank. |
| Researcher Affiliation | Academia | Technion – Israel Institute of Technology, Haifa, Israel; amitayb@campus.technion.ac.il |
| Pseudocode | Yes | Algorithm 1: Hessian rank estimation (a hedged sketch of this procedure appears after the table) |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository for the described methodology. |
| Open Datasets | Yes | We train a DnCNN (Zhang et al. 2017) for denoising on the MNIST dataset. |
| Dataset Splits | No | The paper does not explicitly provide details about training/validation/test dataset splits. It mentions 'Ktot = 1.5×10⁴ and Kavg = 10⁴ iterations' for computing the averaged loss, which refers to iterations of the rank estimation algorithm, not dataset splits. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used (e.g., GPU models, CPU models, memory) to run its experiments. |
| Software Dependencies | No | The paper mentions software components like 'SGD' and 'Adam' but does not provide specific version numbers for these or other libraries/frameworks used. |
| Experiment Setup | Yes | We set G = I, η = 10⁻⁴ and σₙ² = 2×10⁻⁵. [...] We follow Algorithm 1 with Ktot = 1.5×10⁴ and Kavg = 10⁴ iterations, noise power of σ² = 2×10⁻⁵, and the stepsize is η = 10⁻⁴. [...] Ktot = 30×10³ iterations with stepsize η = 0.1, and σ² = 10⁻⁵. The last Kavg = 10⁴ iterations are used to compute the averaged loss. |
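
The pseudocode row points to Algorithm 1 (Hessian rank estimation) and the experiment-setup row lists its hyperparameters. Below is a minimal Python sketch of that procedure as described in the quoted text: run preconditioned Langevin dynamics for Ktot iterations, average the loss over the last Kavg iterations, and convert the averaged excess loss into a rank estimate. The function names (`estimate_hessian_rank`, `loss`, `grad`), the noise shaping, and the loss-to-rank conversion constant are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def estimate_hessian_rank(loss, grad, theta0, G,
                          eta=1e-4, sigma2=2e-5,
                          k_tot=15_000, k_avg=10_000,
                          loss_at_min=0.0, seed=0):
    """Hedged sketch of Algorithm 1: preconditioned Langevin dynamics (LD)
    followed by averaging the loss over the last k_avg iterations.
    Defaults mirror the quoted setup (Ktot = 1.5e4, Kavg = 1e4,
    eta = 1e-4, sigma^2 = 2e-5)."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float).copy()
    tail_losses = []
    for k in range(k_tot):
        # Injected Gaussian noise with power sigma2; the exact noise
        # covariance used in the paper (e.g. shaped by G) is assumed here.
        noise = rng.normal(scale=np.sqrt(sigma2), size=theta.shape)
        # Preconditioned gradient step plus noise injection.
        theta = theta - eta * (G @ grad(theta)) + noise
        if k >= k_tot - k_avg:
            tail_losses.append(loss(theta))
    excess = float(np.mean(tail_losses)) - loss_at_min
    # Assumed conversion: with the prescribed preconditioning, the expected
    # excess loss is proportional to the Hessian rank; the constant below
    # is a placeholder for the paper's exact expression.
    return excess / (eta * sigma2 / 2.0)
```

Rounding the returned value to the nearest integer would give the rank estimate; the correct proportionality constant depends on the preconditioner G and follows from the paper's analysis, which the placeholder above does not reproduce.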