Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Sampling from Gaussian Process Posteriors using Stochastic Gradient Descent
Authors: Jihao Andreas Lin, Javier Antorán, Shreyas Padhy, David Janz, José Miguel Hernández-Lobato, Alexander Terenin
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimentally, stochastic gradient descent achieves state-of-the-art performance on sufficiently large-scale or ill-conditioned regression tasks. |
| Researcher Affiliation | Academia | ¹University of Cambridge, ²Max Planck Institute for Intelligent Systems, ³University of Alberta, ⁴Cornell University |
| Pseudocode | No | The paper describes methods mathematically and textually, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code available at: https://github.com/cambridge-mlg/sgd-gp. |
| Open Datasets | Yes | We consider 9 datasets from the UCI repository [16] ranging in size from N = 15k to N ≈ 2M datapoints |
| Dataset Splits | No | We report mean and standard deviation over five 90%-train 10%-test splits for the small and medium datasets, and three splits for the largest dataset. No explicit validation split percentage is provided. |
| Hardware Specification | Yes | on an RTX 2070 GPU, on an A100 GPU, on a single core of a TPUv2 device |
| Software Dependencies | No | The paper mentions software such as the 'JAX SciPy module', 'GPJax', 'optax.clip_by_global_norm', and 'ANNOY', but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | For all regression experiments we use a learning rate of 0.5 to estimate the mean function representer weights, and a learning rate of 0.1 to draw samples. For Thompson sampling, we use a learning rate of 0.3 for the mean and 0.0003 for the samples. In both settings, we perform gradient clipping using optax.clip_by_global_norm with max_norm set to 0.1. |
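
The split protocol and optimizer settings quoted in the table can be sketched in a few lines. The following is a minimal, standard-library-only illustration and is not the authors' code: the function names (`clip_by_global_norm`, `sgd_step`, `train_test_split`) and the use of plain SGD without momentum or iterate averaging are assumptions made for the example. Only the learning rates, the clipping threshold of 0.1, and the 90%/10% split ratio come from the table. The clipping function mimics the behavior of global-norm clipping (rescale all gradients jointly when their combined L2 norm exceeds the threshold); it does not call optax.

```python
import math
import random

# Values quoted in the table above.
REGRESSION_LR = {"mean": 0.5, "samples": 0.1}
THOMPSON_LR = {"mean": 0.3, "samples": 0.0003}
MAX_NORM = 0.1


def clip_by_global_norm(grads, max_norm):
    """Rescale a list of gradient vectors so their joint L2 norm is at most max_norm."""
    global_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    if global_norm <= max_norm:
        # Already within the threshold: return an unchanged copy.
        return [list(vec) for vec in grads]
    scale = max_norm / global_norm
    return [[g * scale for g in vec] for vec in grads]


def sgd_step(params, grads, learning_rate, max_norm=MAX_NORM):
    """One plain SGD update on clipped gradients (assumption: no momentum)."""
    clipped = clip_by_global_norm(grads, max_norm)
    return [
        [p - learning_rate * g for p, g in zip(p_vec, g_vec)]
        for p_vec, g_vec in zip(params, clipped)
    ]


def train_test_split(n, test_fraction=0.1, seed=0):
    """Shuffle indices 0..n-1 and return (train_indices, test_indices)."""
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]


# Five 90%/10% splits, one per seed (the seeds themselves are hypothetical).
splits = [train_test_split(1000, seed=s) for s in range(5)]

# One update of the mean-function weights with the regression learning rate.
new_params = sgd_step([[1.0, 2.0]], [[3.0, 4.0]], REGRESSION_LR["mean"])
```

Because the clipping threshold is small relative to typical gradient magnitudes, most updates in a sketch like this move the parameters by at most `learning_rate * MAX_NORM` in L2 norm, which is one way to read the combination of a large learning rate with aggressive clipping.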