Gradients of Functions of Large Matrices

Authors: Nicholas Krämer, Pablo Moreno-Muñoz, Hrittik Roy, Søren Hauberg

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Next, we put this code to the test on three challenging machine-learning problems centred around functions of matrices to see how it fares against state-of-the-art differentiable implementations of exact Gaussian processes (Section 5), differential equation solvers (Section 6), and Bayesian neural networks (Section 7).
Researcher Affiliation | Academia | Nicholas Krämer, Pablo Moreno-Muñoz, Hrittik Roy, Søren Hauberg, Technical University of Denmark, Kongens Lyngby, Denmark, {pekra, pabmo, hroy, sohau}@dtu.dk
Pseudocode | Yes | Algorithm E.1 (Arnoldi's forward pass; paraphrased) ... Algorithm E.2 (Arnoldi's adjoint pass; paraphrased) ... Algorithm E.3 (Forward pass) ... Algorithm E.4 (Backward pass). (A generic Arnoldi forward-pass sketch follows the table.)
Open Source Code | Yes | Find the code at https://github.com/pnkraemer/experiments-lanczos-adjoints and install the library with pip install matfree.
Open Datasets | Yes | Data: For the experiments we use the Protein, KEGG (undirected), KEGG (directed), Elevators, and Kin40k datasets (Table 7, adapted from Bartels et al. [95]). All are part of the UCI data repository and accessible through there.
Dataset Splits | No | The paper mentions an "80/20 train/test split" in the Table 3 caption, but does not explicitly describe a validation split or how one was used for hyperparameter tuning.
Hardware Specification | Yes | The Gaussian process and differential equation case studies run on a V100 GPU, the Bayesian neural network one on a P100 GPU.
Software Dependencies | No | The paper mentions software such as JAX, Diffrax, GPyTorch, PyTorch, and KeOps, but it does not provide specific version numbers for these key components or for its self-contained solvers, which reproducibility requires.
Experiment Setup | Yes | We calibrate a Matérn prior with smoothness ν = 1.5, using 10 matrix-vector products per Lanczos iteration, a conjugate-gradients tolerance of ϵ = 1, a rank-15 pivoted Cholesky preconditioner, and 10 Rademacher samples... All parameters are initialised randomly. We use the Adam optimiser with learning rate 0.05 for 75 epochs. All experiments are repeated for three different seeds. (A sketch of the Rademacher trace estimator follows the table.)
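
The Pseudocode row references Algorithms E.1–E.4, which paraphrase Arnoldi's forward and adjoint passes. For orientation, here is a minimal sketch of the textbook Arnoldi forward iteration written with JAX. It is a generic illustration only, not the paper's Algorithm E.1 and not the matfree API; the function and variable names are our own, and breakdown (a zero residual norm) is not handled.

```python
# Generic Arnoldi iteration: builds an orthonormal basis Q of the Krylov space
# and an upper Hessenberg matrix H such that A @ Q[:, :k] ≈ Q @ H.
import jax
import jax.numpy as jnp


def arnoldi(matvec, v0, num_iters):
    """Return Q (n x (k+1)) with orthonormal columns and Hessenberg H ((k+1) x k)."""
    n = v0.shape[0]
    Q = jnp.zeros((n, num_iters + 1)).at[:, 0].set(v0 / jnp.linalg.norm(v0))
    H = jnp.zeros((num_iters + 1, num_iters))
    for k in range(num_iters):
        w = matvec(Q[:, k])
        # Modified Gram-Schmidt orthogonalisation against the previous columns.
        for j in range(k + 1):
            h = Q[:, j] @ w
            H = H.at[j, k].set(h)
            w = w - h * Q[:, j]
        beta = jnp.linalg.norm(w)
        H = H.at[k + 1, k].set(beta)
        Q = Q.at[:, k + 1].set(w / beta)
    return Q, H


# Tiny usage example with a symmetric test matrix; for symmetric inputs the
# Hessenberg matrix H is (numerically) tridiagonal, recovering Lanczos.
key = jax.random.PRNGKey(0)
A = jax.random.normal(key, (50, 50))
A = A + A.T
Q, H = arnoldi(lambda x: A @ x, jnp.ones(50), num_iters=10)
```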
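
The Experiment Setup row mentions conjugate gradients and 10 Rademacher samples, which point to Hutchinson-style stochastic trace estimation of the log-determinant gradient in the Gaussian-process marginal likelihood: d/dθ log det K(θ) = tr(K⁻¹ dK/dθ). The sketch below illustrates that estimator under stated assumptions. The kernel function, its single lengthscale parameter, the jitter, the CG tolerance, and the use of an unpreconditioned jax.scipy.sparse.linalg.cg are stand-ins for illustration; the paper's pipeline instead uses a rank-15 pivoted Cholesky preconditioner and the matfree library.

```python
# Hutchinson estimator with Rademacher probes:
#   tr(K^{-1} dK) ≈ mean_v  v^T K^{-1} dK v,   v_i ∈ {-1, +1}.
import jax
import jax.numpy as jnp
from jax.scipy.sparse.linalg import cg


def kernel_matrix(theta, x):
    # Hypothetical RBF-style kernel matrix with jitter, for illustration only.
    sq_dists = (x[:, None] - x[None, :]) ** 2
    return jnp.exp(-sq_dists / (2.0 * theta**2)) + 1e-3 * jnp.eye(x.shape[0])


def logdet_grad_estimate(theta, x, key, num_probes=10):
    """Estimate d/dtheta log det K(theta) = tr(K^{-1} dK/dtheta)."""
    n = x.shape[0]
    probes = jax.random.rademacher(key, (num_probes, n), dtype=jnp.float32)

    K = kernel_matrix(theta, x)
    dK = jax.jacfwd(kernel_matrix)(theta, x)  # dK/dtheta, an n x n matrix

    def single_probe(v):
        # Solve K u = v matrix-free with (unpreconditioned) conjugate gradients.
        Kinv_v, _ = cg(lambda u: K @ u, v, tol=1e-6)
        return Kinv_v @ (dK @ v)  # v^T K^{-1} dK v, since K is symmetric

    return jnp.mean(jax.vmap(single_probe)(probes))


key = jax.random.PRNGKey(0)
x = jnp.linspace(0.0, 1.0, 100)
estimate = logdet_grad_estimate(1.5, x, key)  # 10 probes by default
```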