Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Axial Neural Networks for Dimension-Free Foundation Models
Authors: Hyunsu Kim, Jonggeon Park, Joan Bruna, Hongseok Yang, Juho Lee
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the resulting models in three different settings: training a single PDE from scratch, pretraining with multiple PDEs, and fine-tuning on a single PDE. We show that our variants perform competitively with their original counterparts. We also conduct experiments to demonstrate the unseen-dimension generalization ability of XNNs, which plays an important role in such a dimension-agnostic strategy. |
| Researcher Affiliation | Academia | KAIST, New York University |
| Pseudocode | No | The paper describes mathematical formulations and architectural components with equations and figures, but it does not contain a clearly labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | Yes | The implemented architectures are summarized in https://github.com/kim-hyunsu/XNN. |
| Open Datasets | Yes | The data are sourced from widely used PDE solution benchmark datasets: PDEBench [33] and PDEArena [16]. |
| Dataset Splits | Yes | The dataset was then divided into training and validation subsets with an 80/20 split. Train/Val/Test Splits: X%/10%/10% split on each dataset at the trajectory level, where X denotes a subsample from 80% of the total dataset |
| Hardware Specification | Yes | All experiments were conducted on NVIDIA GPUs: RTX 3090, RTX A6000, and RTX 5090. |
| Software Dependencies | Yes | Training and evaluation were implemented in JAX 0.4.30 [5] and Flax 0.8.5 [20], with PyTorch 2.7.0+cu118 [29] used primarily for data preprocessing and batching. We implemented CViT and X-CViT using JAX 0.4.30 [5] and FLAX 0.8.5 [20], while MPP and X-MPP were implemented using PyTorch 2.1.0+cu121 [29]. For the RTX 5090 machine, we used PyTorch 2.8.0+cu128 due to CUDA driver compatibility. |
| Experiment Setup | Yes | All models were trained using the Adam optimizer, with a fixed learning rate of 0.001 and a batch size of 64. Training was conducted for 10 epochs for each model. The loss function used was binary cross-entropy computed from the sigmoid of the output logits. Table 4: Hyperparameters for each model architecture. Table 5: Hyperparameters in the CViT training. Table 6: Hyperparameters in the MPP from-scratch training and pretraining. Table 7: Hyperparameters in the MPP finetuning. |
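
The Experiment Setup row quotes a loss described as binary cross-entropy computed from the sigmoid of the output logits. The sketch below illustrates one common, numerically stable way to compute that quantity in plain Python. It is not taken from the paper or its repository: the function names `bce_with_logits` and `mean_bce` are hypothetical, and the stable formulation is an assumption about how such a loss is typically written, not a statement of how the authors implemented it.

```python
import math


def bce_with_logits(logit: float, target: float) -> float:
    """Binary cross-entropy of sigmoid(logit) against a target in [0, 1].

    Hypothetical helper for illustration only. It uses the stable identity
        max(z, 0) - z * y + log(1 + exp(-|z|))
    which equals -[y * log(s) + (1 - y) * log(1 - s)] with s = sigmoid(z),
    but never evaluates exp() on a large positive argument.
    """
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def mean_bce(logits, targets):
    """Average the per-example loss over one batch (e.g. 64 examples)."""
    losses = [bce_with_logits(z, y) for z, y in zip(logits, targets)]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # A logit of 0 means sigmoid = 0.5, so the loss is log(2) for either label.
    print(bce_with_logits(0.0, 1.0))
    # A tiny batch: confident-correct, confident-wrong, and uncertain predictions.
    print(mean_bce([4.0, -4.0, 0.0], [1.0, 1.0, 0.0]))
```

Computing the loss directly from logits, rather than applying the sigmoid first and then taking logarithms, avoids `log(0)` when a prediction saturates; that is the only design choice this sketch is meant to show.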