Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Understanding Generalization in Physics Informed Models through Affine Variety Dimensions
Authors: Takeshi Koshizuka, Issei Sato
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also present a method to approximate this dimension and provide experimental validation of our theoretical findings. (Abstract) To evaluate the generalization performance of physics-informed linear regression (PILR) compared to ridge regression (RR) using basis functions B, we conducted experiments on representative differential equations. We varied the data size n and parameter count d, and report test MSE (mean ± standard deviation) across 10 random initial or boundary conditions. (Section 5) |
| Researcher Affiliation | Academia | Department of Computer Science, The University of Tokyo |
| Pseudocode | No | The paper only describes steps in regular paragraph text without structured formatting. There are no explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | We plan to release the full codebase as part of the supplementary material. The repository will include scripts and instructions to reproduce all main experiments. Since all data used in the experiments is synthetically generated, the released code also includes utilities to generate this data, ensuring full reproducibility without reliance on external datasets. (NeurIPS Paper Checklist, Open access to data and code) |
| Open Datasets | No | The analytical solution with added Gaussian noise was used as data (Appendix G.1). The data used are the numerical solutions with added Gaussian noise of variance 0.01. (Appendix G.2). Since all data used in the experiments is synthetically generated, the released code also includes utilities to generate this data, ensuring full reproducibility without reliance on external datasets. (NeurIPS Paper Checklist, Open access to data and code) |
| Dataset Splits | No | We varied the data size n and parameter count d, and report test MSE (mean ± standard deviation) across 10 random initial or boundary conditions. (Section 5). We have a dataset consisting of n observations, denoted as $\{(x_i, y_i)\}_{i=1}^{n}$, where $x_i \in \Omega \cup \partial\Omega$ represents the input within the domain $\Omega \subset \mathbb{R}^m$ or the boundary $\partial\Omega$ and $y_i \in \mathbb{R}$ represents the corresponding output. (Section 3.1). While test MSE is reported, explicit percentages or sample counts for training/validation/test splits are not provided for the synthetically generated data. |
| Hardware Specification | Yes | All experiments were conducted on a MacBook Air equipped with an Apple M3 chip and 64 GB of unified memory. No external GPU or cluster computing resources were used. (Appendix G.1) |
| Software Dependencies | No | The hyperparameters L2 regularization weights and differential equation constraint weights ξ and ν were searched in the range [1e-9, 1e-2] using the Optuna library [4]. (Appendix G.1). While Optuna is mentioned, no version number for this library or any other software component is provided. |
| Experiment Setup | Yes | The hyperparameters L2 regularization weights and differential equation constraint weights ξ and ν were searched in the range [1e-9, 1e-2] using the Optuna library [4]. The configuration with the smallest MSE on the validation data among 100 candidates was selected. (Appendix G.1). For nonlinear equations, we train models by minimizing a soft-constrained loss using the Adam optimizer. Hyperparameters ξ and ν are tuned via validation MSE. ... For the nonlinear equations, we use the Adam optimizer with a learning rate of $1 \times 10^{-2}$, along with an exponential learning rate scheduler. The training is performed for a maximum of 2000 epochs, utilizing an early stopping technique. (Section 5 and Appendix G.2) |
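
The selection rule quoted in the Experiment Setup row (100 candidate weights in [1e-9, 1e-2], keep the one with the smallest validation MSE) can be illustrated with a small sketch. This is not the authors' code: plain log-uniform random search stands in for Optuna, the one-parameter ridge model and the data generator are hypothetical, and the names `fit_ridge_1d`, `make_data`, and `search_weight` are invented for illustration. The log-uniform sampling is also an assumption, since the source only gives the range. Only the search range, the candidate count, and the noise variance of 0.01 come from the quoted text.

```python
import math
import random


def fit_ridge_1d(xs, ys, lam):
    """Closed-form ridge fit for the toy one-parameter model y ≈ w * x."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def mse(w, xs, ys):
    """Mean squared error of the model y ≈ w * x on the given points."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def make_data(n, rng, noise_std=0.1):
    """Hypothetical toy data y = 2x + noise; noise_std=0.1 means variance 0.01."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, noise_std) for x in xs]
    return xs, ys


def search_weight(train, val, n_candidates=100, low=1e-9, high=1e-2, seed=0):
    """Random search (a stand-in for Optuna) over a log-uniform weight range."""
    rng = random.Random(seed)
    best_lam, best_mse = None, math.inf
    for _ in range(n_candidates):
        # Sample the regularization weight uniformly in log10 space.
        lam = 10 ** rng.uniform(math.log10(low), math.log10(high))
        w = fit_ridge_1d(*train, lam)
        val_mse = mse(w, *val)
        # Keep the candidate with the smallest validation MSE.
        if val_mse < best_mse:
            best_lam, best_mse = lam, val_mse
    return best_lam, best_mse


rng = random.Random(42)
train, val = make_data(50, rng), make_data(50, rng)
best_lam, best_val_mse = search_weight(train, val)
print(f"best weight: {best_lam:.2e}, validation MSE: {best_val_mse:.4f}")
```

In this toy setting the chosen weight barely changes the fit, because the regularization range is small relative to the data scale. The sketch only shows the mechanics of validation-based selection, not the behavior reported in the paper.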