The Expected Loss of Preconditioned Langevin Dynamics Reveals the Hessian Rank

Authors: Amitay Bar, Rotem Mulayoff, Tomer Michaeli, Ronen Talmon

AAAI 2024

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We empirically demonstrate our theory on linear and nonlinear NNs. We show that incorporating the networks into preconditioned LD with a specific preconditioning yields an expected loss that gives an accurate estimate of the Hessian rank." |
| Researcher Affiliation | Academia | Technion – Israel Institute of Technology, Haifa, Israel (amitayb@campus.technion.ac.il) |
| Pseudocode | Yes | Algorithm 1: Hessian rank estimation |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | "We train a DnCNN (Zhang et al. 2017) for denoising on the MNIST dataset." |
| Dataset Splits | No | The paper does not explicitly provide training/validation/test dataset splits. It mentions "K_tot = 1.5×10^4 and K_avg = 10^4 iterations" for computing the averaged loss, which refers to iterations of the rank-estimation algorithm, not dataset splits. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments (e.g., GPU or CPU models, memory). |
| Software Dependencies | No | The paper mentions software components such as SGD and Adam but does not provide version numbers for these or for other libraries/frameworks used. |
| Experiment Setup | Yes | "We set G = I, η = 10^-4, and σ_n^2 = 2×10^-5. [...] We follow Algorithm 1 with K_tot = 1.5×10^4 and K_avg = 10^4 iterations, noise power of σ^2 = 2×10^-5, and the stepsize is η = 10^-4. [...] K_tot = 30×10^3 iterations with stepsize η = 0.1, and σ^2 = 10^-5. The last K_avg = 10^4 iterations are used to compute the averaged loss." |
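Algorithm 1 itself is not reproduced in this summary, but the quoted setup (run preconditioned Langevin dynamics, average the loss over the last K_avg iterations, and read the Hessian rank off the averaged loss) can be illustrated on a toy quadratic. The sketch below makes assumptions not stated above: identity preconditioner G = I, per-step Gaussian noise of variance σ² per coordinate, and the small-stepsize approximation that each nonzero-curvature direction contributes about σ²/(4η) to the stationary expected loss. The paper's exact estimator and constants may differ; the parameter values here are chosen for a quick run, not copied from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a synthetic PSD "Hessian" of known rank r in dimension d.
d, r = 20, 5
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
lam = np.zeros(d)
lam[:r] = rng.uniform(0.5, 1.5, r)  # nonzero eigenvalues
H = Q @ np.diag(lam) @ Q.T

def loss(theta):
    return 0.5 * theta @ H @ theta

# Noisy gradient descent (Langevin-style, identity preconditioner G = I).
eta, sigma2 = 0.05, 1e-5       # stepsize and per-coordinate noise variance
K_tot, K_avg = 30_000, 10_000  # total iterations / iterations averaged
theta = rng.standard_normal(d)
losses = []
for _ in range(K_tot):
    noise = np.sqrt(sigma2) * rng.standard_normal(d)
    theta = theta - eta * (H @ theta) + noise
    losses.append(loss(theta))

# In stationarity, each direction with curvature lam_i > 0 contributes
# roughly sigma2 / (4 * eta) to the expected loss (for eta * lam_i << 2),
# while zero-curvature directions contribute nothing. Hence:
avg_loss = np.mean(losses[-K_avg:])
rank_est = 4 * eta * avg_loss / sigma2
print(f"estimated rank ~ {rank_est:.2f} (true rank = {r})")
```

The estimate follows from the stationary variance of the linear recursion x ← (1 − ηλ)x + ξ with ξ ~ N(0, σ²): the per-direction expected loss is σ²/(2η(2 − ηλ)) ≈ σ²/(4η), so summing over directions and dividing out the constant counts the nonzero eigenvalues.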