Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Thompson Sampling in Function Spaces via Neural Operators

Authors: Rafael Oliveira, Xuesong Wang, Kian Ming A Chai, Edwin V. Bonilla

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments benchmark our method against other Bayesian optimization baselines on functional optimization tasks involving partial differential equations of physical systems, demonstrating better sample efficiency and significant performance gains.
Researcher Affiliation	Academia	Rafael Oliveira, CSIRO's Data61, Sydney, Australia; Xuesong Wang, CSIRO's Data61, Sydney, Australia; Kian Ming A. Chai, DSO National Laboratories, Singapore; Edwin V. Bonilla, CSIRO's Data61, Sydney, Australia
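Pseudocode	Yes	Algorithm 1: GP-TS. Input: Search space $S$, initial data $D_0$. For $t \in \{1, \dots, T\}$ do: Sample $g_t \sim \mathcal{GP}(\mu_{t-1}, k_{t-1})$; Select $x_t \in \operatorname{argmax}_{x \in X} g_t(x)$; Query $y_t = f(x_t) + \epsilon_t$; Update $D_t = D_{t-1} \cup \{x_t, y_t\}$. Algorithm 2: NOTS (ours). Input: Search space $S$, initial data $D_0$. For $t = 1, \dots, T$ do: $\theta_t = \operatorname{argmin}_{\theta} \ell_t(\theta)$, $\theta_{t,0} \sim \mathcal{N}(0, \Sigma_0)$; $a_t \in \operatorname{argmax}_{a \in S} f(G_{\theta_t}(a))$; $y_t = G_*(a_t) + \xi_t$; $D_t = D_{t-1} \cup \{a_t, y_t\}$.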
Open Source Code	Yes	Code for our experiments will be made available online. (Footnote 4) Code repository: https://github.com/csiro-funml/nots
Open Datasets	Yes	We evaluate our NOTS algorithm on two popular PDE benchmark problems: Darcy flow and a shallow water model. Our results are compared against a series of representative Bayesian optimization and neural Thompson sampling baselines. More details about our implementations and further experiment details can be found in Appendix D. ... Darcy flow models fluid pressure in a porous medium [28] ... Shallow water models capture the time evolution of fluid mass and discharge on a rotating sphere [46].
Dataset Splits	No	The paper describes generating data for training the neural operator surrogate models but does not provide explicit training/validation/test splits for the optimization experiments themselves. For instance, for Darcy flow: "To train $G_\theta$, we generate 1,000 input-output pairs via a finite-difference solver at $16 \times 16$ resolution." For shallow water: "We train $G_\theta$ on 200 random initial conditions on a $32 \times 64$ equiangular grid, using a 1,200 s timestep to simulate up to $\tau = 6$ hours." The optimization process involves sequential querying, not predefined splits.
Hardware Specification	Yes	NOTS was implemented using the Neural Operator library [47] and run on NVIDIA H100 GPUs on CSIRO's high-performance computing cluster.
Software Dependencies	No	The paper mentions "NOTS was implemented using the Neural Operator library [47]" and "found within modern deep learning frameworks, such as PyTorch [58]". However, it does not specify version numbers for these software components.
Experiment Setup	Yes	For all experiments, we trained the model for 10 epochs of mini-batch stochastic gradient descent with an initial learning rate of $10^{-3}$ and a cosine annealing scheduler. The regularization factor for the L2 penalty was set as $\lambda := 10^{-4}$. This same setting for the regularization factor was also applied to our implementation of STO-NTS.
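
The Experiment Setup row names a cosine annealing schedule and an L2 penalty but gives no formulas. As a rough illustration only, the sketch below computes a per-epoch cosine-annealed learning rate and a plain sum-of-squares penalty. The function names, the minimum learning rate of 0, the per-epoch (rather than per-step) update, and the exact form of the penalty are all assumptions and are not taken from the paper or its code repository.

```python
import math


def cosine_annealing_lr(step, total_steps, base_lr=1e-3, min_lr=0.0):
    """Learning rate at `step` on a cosine annealing schedule.

    Assumption: the rate decays from `base_lr` at step 0 to `min_lr`
    at `total_steps`; the paper does not state the minimum rate.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp so that out-of-range steps return the endpoint values.
    step = min(max(step, 0), total_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * step / total_steps))
    return min_lr + (base_lr - min_lr) * cosine


def l2_penalty(params, weight=1e-4):
    """Hypothetical L2 term: weight * sum of squared parameters.

    Assumption: a plain sum of squares; the paper's loss may scale or
    center the penalty differently.
    """
    return weight * sum(p * p for p in params)


if __name__ == "__main__":
    epochs = 10  # matches the 10 epochs quoted in the table
    for epoch in range(epochs):
        print(epoch, cosine_annealing_lr(epoch, epochs))
```

Under these assumptions the rate starts at 1e-3, passes through half that value at epoch 5, and approaches 0 by the end of training. In a PyTorch training loop the same schedule would normally come from the framework's built-in scheduler rather than a hand-written function.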