Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

NUTS: Eddy-Robust Reconstruction of Surface Ocean Nutrients via Two-Scale Modeling

Authors: Hao Zheng, Shiyu Liang, Yuting Zheng, Chaofan Sun, Lei Bai, Enhui Liao

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this section, we answer the following research questions: RQ1. How does NUTS perform in reconstructing global surface ocean nutrient concentrations compared to existing baselines, using both simulated and real-world observations? ... NUTS outperforms all data-driven baselines in global reconstruction and achieves site-wise accuracy comparable to numerical models. On real observations, NUTS reduces NRMSE by 79.9% for phosphate and 19.3% for nitrate over the best baseline. Ablation studies validate the effectiveness of each module. We compare our model against a wide range of baselines grouped into six categories: ... We use Normalized Root Mean Squared Error (NRMSE) to evaluate model performance
Researcher Affiliation	Collaboration	1Shanghai Jiao Tong University, China 2Shanghai Artificial Intelligence Laboratory, China
Pseudocode	No	The paper describes the methodology in text and mathematical formulations (e.g., Lw,η[φ] = φ + (wφ) η 2φ = s, Equation 2). It does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code	Yes	Code and data are available at URL. We provide code and data in this anonymized URL.
Open Datasets	Yes	To support high-quality long-term reconstruction, we release two data products generated by the numerical physical-biogeochemical model MOM6-COBALT2 Liu et al. [2022], referred to as MOM6 (Daily) and MOM6 (Monthly). We use in-situ nutrient measurements from the World Ocean Database (WOD) Mishonov et al. [2024], which contains nitrate and phosphate records from 1959 to 2022. We release two datasets under the CC-BY 4.0 licenses and code implementation under the MIT license. Datasets and code can be found in this anonymized URL.
Dataset Splits	Yes	Table 2: Overview of dataset divisions by year. Daily Avg.: train 2019, 2020; validation 2021; test 2022. Monthly Avg.: train 1959–1998; validation 1999–2010; test 2011–2022.
Hardware Specification	Yes	The simulations were conducted on 1000 CPU cores of AMD EPYC 9654 96-Core Processors over an 11-day period... We provide the compute resources used to generate the simulation data in Section 4, and the compute resources used to conduct experiments in Appendix D.3. (From Appendix D.3: All experiments were conducted on a single NVIDIA A100 GPU (80GB) with 10 Intel Xeon Platinum 8358 CPUs (2.60GHz) and 256GB RAM.)
Software Dependencies	Yes	Our implementation uses PyTorch 2.0 with CUDA 11.8. (From Appendix D.3)
Experiment Setup	Yes	All experiments use the default setting: coarse/refine depth of 12/6, all loss terms and inputs included, and interval length set to 4.
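
The year-based divisions quoted in the Dataset Splits row can be expressed as a small lookup. The following is a minimal sketch, not the authors' code: the names `SPLITS` and `split_for_year` are hypothetical, and it assumes the monthly year pairs (1959 1998, 1999 2010, 2011 2022) are inclusive ranges.

```python
from typing import Optional

# Inclusive (start, end) year ranges taken from the Table 2 excerpt above.
# Assumption: each monthly pair of years denotes an inclusive range.
SPLITS = {
    "daily": {
        "train": (2019, 2020),
        "validation": (2021, 2021),
        "test": (2022, 2022),
    },
    "monthly": {
        "train": (1959, 1998),
        "validation": (1999, 2010),
        "test": (2011, 2022),
    },
}


def split_for_year(task: str, year: int) -> Optional[str]:
    """Return the split whose year range contains `year`, or None if no range does."""
    for name, (start, end) in SPLITS[task].items():
        if start <= year <= end:
            return name
    return None
```

Splitting by calendar year keeps the validation and test periods strictly later than the training period, so a lookup like this is enough to assign any timestamped sample to its split.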