Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
FAME: Adaptive Functional Attention with Expert Routing for Function-on-Function Regression
Authors: Yifei Gao, Yong Chen, Chen Zhang
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on synthetic and real-world functional regression benchmarks show that FAME achieves state-of-the-art accuracy and strong robustness to arbitrarily sampled discrete observations of functions. To evaluate the proposed model, we benchmark FAME against a wide range of state-of-the-art methods for FoFR (see Section 2). ... We evaluate FAME and the baselines on both synthetic datasets and several real-world datasets, using mean-squared error (MSE) as the primary evaluation metric. |
| Researcher Affiliation | Academia | Yifei Gao, Department of Industrial Engineering, Tsinghua University, Beijing 100084, China; Yong Chen, Department of Industrial and Systems Engineering, University of Iowa, Iowa City, IA 52242, USA; Chen Zhang, Department of Industrial Engineering, Tsinghua University, Beijing 100084, China |
| Pseudocode | No | The paper describes the methodology using mathematical equations and descriptive text, but it does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | All experiments use publicly available datasets, and every model configuration and training detail is listed in Section 5 and Appendix B.2. Any researcher familiar with PyTorch can reproduce our results from this information alone. If appropriate, we would be happy to share a GitHub link to our implementation alongside the camera-ready version of the paper. |
| Open Datasets | Yes | Real-world data. We evaluate the same set of models on three public datasets: 1) Hawaii Ocean, which contains five hydrographic depth profiles (temperature, salinity, oxygen, chloropigment, and density), among which different variables are treated as regression targets in turn, with the remaining serving as input functions; 2) Human3.6M, a human motion capture dataset consisting of 3-D joint trajectories, where we define three action-specific regression tasks (Walking, Eating, and Sitting); and 3) ETT-small, a monthly electricity-transformer dataset used to forecast oil temperature from transformer load curves. Full preprocessing details and task definitions are provided in Appendix B. |
| Dataset Splits | Yes | Each dataset is randomly split into 80% training and 20% test instances; for the synthetic benchmark we repeat this split five times and report the average performance. |
| Hardware Specification | Yes | Appendix B.2 specifies that all runs were executed on a laptop with an AMD R9-7940HS CPU, 16 GB RAM, and a single NVIDIA RTX 3090 GPU; since both our model and datasets are relatively lightweight, runtime was not a limiting factor and thus not a primary concern in our evaluation. |
| Software Dependencies | No | Any researcher familiar with PyTorch can reproduce our results from this information alone. Unless otherwise stated, every model is trained for 100 epochs with the Adam optimiser, an initial learning rate of $1 \times 10^{-3}$, and a dropout rate of 0.2. No specific version numbers for software libraries or dependencies are provided. |
| Experiment Setup | Yes | Unless otherwise stated, every model is trained for 100 epochs with the Adam optimiser, an initial learning rate of $1 \times 10^{-3}$, and a dropout rate of 0.2. Each dataset is randomly split into 80% training and 20% test instances... The input lifting network $\xi_\theta$ and the neural vector field $f_\theta$ are both implemented as two-layer MLPs with hidden widths 32 and 64, respectively, followed by Tanh activations. The same MLP configuration is used for the MoE expert fields and for the decoder vector field $f_\psi$, ensuring architectural consistency throughout the model. |
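
The evaluation protocol quoted in the Dataset Splits and Experiment Setup rows (a random 80/20 split, repeated five times on the synthetic benchmark, scored by MSE) could be sketched roughly as follows. This is a minimal illustration and not the authors' code: the names `CONFIG`, `split_indices`, `mse`, and `repeated_evaluation` are hypothetical, and using the repeat index as the random seed is an assumption, since the paper excerpt does not state how splits were seeded.

```python
import random
from statistics import mean

# Values quoted in the table above; the key names are illustrative only.
CONFIG = {
    "epochs": 100,
    "optimizer": "Adam",
    "learning_rate": 1e-3,
    "dropout": 0.2,
    "train_fraction": 0.8,
    "n_repeats": 5,  # stated for the synthetic benchmark only
}


def split_indices(n_samples, train_fraction, seed):
    """Randomly partition range(n_samples) into train and test index lists."""
    rng = random.Random(seed)  # assumption: one seed per repeat
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = int(round(train_fraction * n_samples))
    return indices[:n_train], indices[n_train:]


def mse(y_true, y_pred):
    """Mean-squared error between two equal-length sequences of numbers."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def repeated_evaluation(n_samples, fit_and_score, config=CONFIG):
    """Average a test score over several independent random splits.

    `fit_and_score(train_idx, test_idx)` is a caller-supplied function
    that trains a model on the training indices and returns its test score.
    """
    scores = []
    for seed in range(config["n_repeats"]):
        train_idx, test_idx = split_indices(
            n_samples, config["train_fraction"], seed
        )
        scores.append(fit_and_score(train_idx, test_idx))
    return mean(scores)
```

A caller would pass a function that trains the model on `train_idx` with the hyperparameters in `CONFIG` and returns the test MSE; the sketch deliberately leaves the model itself out, because the source lists no library versions.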