Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

PhysPDE: Rethinking PDE Discovery and a Physical HYpothesis Selection Benchmark

Authors: Mingquan Feng, Yixin Huang, Yizhou Liu, Bofang Jiang, Junchi Yan

ICLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The experimental results on our newly designed Fluid Mechanics and Laser Fusion datasets demonstrate the interpretability and feasibility of our method.
Researcher Affiliation | Academia | 1 Sch. of Computer Science & Sch. of Artificial Intelligence, Shanghai Jiao Tong University; 2 Sch. of Physics and Astronomy, Shanghai Jiao Tong University
Pseudocode | Yes | The main algorithm is summarized in Alg. 1, and we will elaborate on its details below.
Algorithm 1: HSTS+STOP Algorithm
  Input: decision forest, operator library, observed data
  Param (HSTS): loss bounds ℓmax, ℓmin; risk fraction ϵ; max episodes ME; number of rollouts NR; exploration weight
  Param (STOP): regularization λ; threshold T; max iterations
  Output: optimal decision h, coefficients θh; (optionally) new expression m, coefficients θm
  1. Initialize h as an empty vector; initialize an empty cache
  2. for each episode in ME do
  3.   if h has no child then
  5.     for each rollout in NR do
  6.       Selection: h0 ← UCT selection by objective J (Eq. 14) among descendants of h
  7.       Expansion: h0 ← repeatedly select a child of h0 until no child is available
  8.       Simulation: calculate the reward (Eq. 12) of h0 by solving Eq. 15 via STOP
  9.       Backpropagation: propagate and cache the reward
  10.  Action: h ← the child maximizing J (Eq. 14)
  11. Return the best solution in the cache
Open Source Code | Yes | Source code and benchmarks are publicly available. https://github.com/FengMingquan-sjtu/PhysPDE
Open Datasets | No | The experimental results on our newly designed Fluid Mechanics and Laser Fusion datasets demonstrate the interpretability and feasibility of our method. Source code and benchmarks are publicly available.
Dataset Splits | Yes | For each scenario, datasets for training and testing are generated under varying boundary conditions, ensuring identical sizes. ... divided the dataset along the temporal axis into training, validation, and test sets. ... We perform 5-fold cross-validation on the training set for performance estimation.
Hardware Specification | No | Computations are executed on a remote server using 10 CPU cores concurrently.
Software Dependencies | Yes | The optimization algorithm is SLSQP implemented in SciPy (Virtanen et al., 2020).
Experiment Setup | Yes | Loss: the pooling size is 5; the coefficients are λEq = 10^7, λθh = 10^5, λh = 10^5, λm = 10^5. HSTS: for S1 and S2, the maximum number of rollouts is 30 and the ϵ of the risk-seeking objective is 0.5; in S3, the maximum number of rollouts is 60 and ϵ = 0.05. The bounds are ℓmin = 0.0001, ℓmax = 1. The exploration weight in the UCT is 1. STOP: we set the maximum iteration depth to 3, the l2 regularization λθm = 0.01, and the weight tolerance T = 0.005.
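The HSTS procedure quoted under Pseudocode follows the standard UCT/Monte-Carlo tree search pattern: selection by a UCT score, simulation of a reward, and backpropagation along the visited path. The sketch below illustrates that pattern only; the Node class, the deterministic toy payoffs, and the search wrapper are invented stand-ins, not the paper's HSTS implementation (which computes rewards by solving a fitting problem via STOP).

```python
import math

# Toy UCT-style tree search: the selection/simulation/backpropagation loop
# mirrors the quoted Algorithm 1 in shape only. Rewards here are fixed toy
# payoffs rather than STOP-fitted losses.

class Node:
    def __init__(self, value, children=None):
        self.value = value              # toy payoff returned by "simulation"
        self.children = children or []
        self.visits = 0
        self.total = 0.0

def uct_score(child, parent, c=1.0):
    # Exploitation (mean reward) plus exploration bonus; c is the
    # exploration weight (the quoted setup uses 1).
    if child.visits == 0:
        return float("inf")
    return child.total / child.visits + c * math.sqrt(
        math.log(parent.visits) / child.visits)

def search(root, n_rollouts=30):
    root.visits = 1  # seed so the log term is defined on the first pass
    for _ in range(n_rollouts):
        # Selection: descend by UCT score until a leaf is reached.
        path, node = [root], root
        while node.children:
            node = max(node.children, key=lambda ch: uct_score(ch, node))
            path.append(node)
        reward = node.value             # Simulation (toy: deterministic payoff)
        for n in path:                  # Backpropagation: accumulate statistics
            n.visits += 1
            n.total += reward
    # Action: return the most-visited child of the root.
    return max(root.children, key=lambda ch: ch.visits)

best = search(Node(0.0, [Node(0.2), Node(0.9), Node(0.5)]))
print(best.value)  # the highest-payoff child ends up most visited
```

With deterministic payoffs the exploration bonus still forces occasional visits to the weaker children, but the visit count concentrates on the best one, which is the behavior the UCT rule is designed to give.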
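The split protocol quoted under Dataset Splits (a division along the temporal axis into training, validation, and test sets, plus 5-fold cross-validation on the training portion) can be sketched as follows. The dataset size and the 70/15/15 fractions are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Illustrative stand-in: 100 time-ordered snapshots for one scenario.
data = np.arange(100)
n = len(data)

# Divide along the temporal axis; earliest portion trains, latest tests.
n_train, n_val = n * 70 // 100, n * 15 // 100
train = data[:n_train]
val = data[n_train:n_train + n_val]
test = data[n_train + n_val:]

# 5-fold cross-validation indices over the training set, as quoted
# ("We perform 5-fold cross-validation on the training set").
folds = np.array_split(np.arange(len(train)), 5)
for i, held_out in enumerate(folds):
    fit_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
    # ...fit on train[fit_idx], score on train[held_out]...

print(len(train), len(val), len(test))
```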
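The only optimizer named under Software Dependencies is SLSQP from SciPy. As a hedged illustration of that dependency, here is a minimal call through `scipy.optimize.minimize` on an invented bounded least-squares coefficient fit; the objective, data, and bounds are stand-ins, not the paper's actual loss.

```python
import numpy as np
from scipy.optimize import minimize

# Invented stand-in problem: recover coefficients theta from linear
# measurements, with box bounds, using the SLSQP method named in the paper.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
theta_true = np.array([0.5, -0.2, 0.8])
y = X @ theta_true

def loss(theta):
    # Sum-of-squares residual, analogous in spirit to an equation-fitting loss.
    return np.sum((X @ theta - y) ** 2)

res = minimize(loss, x0=np.zeros(3), method="SLSQP", bounds=[(-1.0, 1.0)] * 3)
print(res.x)  # recovered coefficients, close to theta_true
```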
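For quick reference, the hyperparameters quoted under Experiment Setup can be gathered into one configuration sketch. The dictionary layout and key names are illustrative inventions; the λ loss weights are deliberately left out because the sign of their exponents is ambiguous in the quoted text.

```python
# Hyperparameters quoted in the Experiment Setup row, gathered in one place.
CONFIG = {
    "loss": {"pooling_size": 5},
    "hsts": {
        "S1_S2": {"max_rollouts": 30, "risk_fraction_eps": 0.5},
        "S3": {"max_rollouts": 60, "risk_fraction_eps": 0.05},
        "loss_bound_min": 1e-4,
        "loss_bound_max": 1.0,
        "uct_exploration_weight": 1.0,
    },
    "stop": {
        "max_iteration_depth": 3,
        "l2_reg_theta_m": 0.01,
        "weight_tolerance_T": 0.005,
    },
}
print(CONFIG["hsts"]["S3"])
```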