Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
KAN: Kolmogorov–Arnold Networks
Authors: Ziming Liu, Yixuan Wang, Sachin Vaidya, Fabian Ruehle, James Halverson, Marin Soljačić, Thomas Hou, Max Tegmark
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through two examples in mathematics and physics, KANs are shown to be useful collaborators helping scientists (re)discover mathematical and physical laws. Moreover, KANs are shown to be more accurate and have faster scaling laws than MLPs in function fitting and PDE solving, both theoretically and empirically. |
| Researcher Affiliation | Academia | 1 Massachusetts Institute of Technology 2 California Institute of Technology 3 Northeastern University 4 The NSF Institute for Artificial Intelligence and Fundamental Interactions |
| Pseudocode | No | The paper describes methods textually and with equations, but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions PyTorch as the framework used for building codes but does not explicitly state that the code for this specific work is open-sourced or provide a repository link. |
| Open Datasets | Yes | The paper mentions using specific public datasets and benchmarks: "PDEBench Takamoto et al. (2022)", "MNIST", "Feynman datasets Udrescu & Tegmark (2020); Udrescu et al. (2020)". |
| Dataset Splits | Yes | The paper provides specific dataset split information, for example: "randomly generated 1000 training and test samples from U[-1, 1]^2" for toy function fitting, and for MNIST: "The whole training dataset (60000) and test dataset (10000) are used to evaluate train/test loss/acc." |
| Hardware Specification | Yes | All models are trained with the Adam optimizer for 15000 steps with learning rate decay (5000 steps each at learning rates 10^-3, 10^-4, and 10^-5), with batch size 1024, on a V100 GPU. |
| Software Dependencies | No | The paper mentions "Codes are built based on pytorch Paszke et al. (2019)" and "Sympy is used to compute the symbolic formula", but does not provide specific version numbers for these or other key software dependencies. |
| Experiment Setup | Yes | All models are trained with the Adam optimizer for 15000 steps with learning rate decay (5000 steps each at learning rates 10^-3, 10^-4, and 10^-5), with batch size 1024, on a V100 GPU. For PDE solving, specific parameters are mentioned: "Adam optimizers with a learning rate 10^-3 for 1000 steps except for 10000 steps for MLP (10x training)." and "α = 0.01" for loss balancing. |
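The training setup quoted above (15000 Adam steps, with the learning rate dropped every 5000 steps through 10^-3, 10^-4, and 10^-5) implies a piecewise-constant schedule. A minimal sketch, assuming the three rates are applied for 5000 steps each in order; the function name `lr_at_step` is illustrative, not taken from the paper's code:

```python
# Sketch of the reported learning-rate decay: 15000 total steps,
# 5000 steps each at 1e-3, 1e-4, and 1e-5 (an assumption based on
# the phrasing quoted in the table, not the authors' released code).

def lr_at_step(step: int, total_steps: int = 15000) -> float:
    """Return the learning rate in effect at a given optimizer step (0-indexed)."""
    if not 0 <= step < total_steps:
        raise ValueError(f"step must be in [0, {total_steps})")
    phase = step // 5000          # 0, 1, or 2
    return 10.0 ** (-3 - phase)   # 1e-3 -> 1e-4 -> 1e-5

# Learning rates at the phase boundaries
schedule = [lr_at_step(s) for s in (0, 4999, 5000, 10000, 14999)]
```

In PyTorch (which the paper says its code is built on), the same schedule could be expressed with `torch.optim.lr_scheduler.StepLR(optimizer, step_size=5000, gamma=0.1)` stepped once per optimizer update.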