Online Symbolic Regression with Informative Query

Authors: Pengwei Jin, Di Huang, Rui Zhang, Xing Hu, Ziyuan Nan, Zidong Du, Qi Guo, Yunji Chen

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through comprehensive experiments, we show that QUOSR can facilitate modern symbolic regression methods by generating informative data.
Researcher Affiliation | Collaboration | 1 State Key Lab of Processors, Institute of Computing Technology, CAS; 2 University of Chinese Academy of Sciences; 3 Cambricon Technologies
Pseudocode | Yes | Algorithm 1: The query-based framework
Open Source Code | No | The paper does not include an explicit statement about releasing source code or a direct link to a code repository for the described methodology.
Open Datasets | Yes | We train QUOSR using expressions in the dataset of Valipour et al. (2021), which contains about 500k one-variable expressions.
Dataset Splits | Yes | For each expression, 30 data points sampled uniformly in the range of [-3.0, 3.0] are used to generate expressions, and another 30 data points sampled uniformly in the range of [-5.0, -3.0] ∪ [3.0, 5.0] are used to evaluate the predicted expression.
Hardware Specification | No | The paper does not specify the exact hardware (e.g., GPU/CPU models, memory, or specific computing clusters) used for running the experiments.
Software Dependencies | No | The paper mentions software components like 'SymbolicGPT' and 'PointNet' but does not provide specific version numbers for any software, libraries, or frameworks used in the experiments.
Experiment Setup | Yes | The max query times K is set to 9, thus we first uniformly sample 3 points and then generate the other 27 data points in the following 9 query steps. We limit QUOSR to generate values of x ∈ [-3.0, 3.0] to fit the range of the original dataset. For each expression, 30 data points sampled uniformly in the range of [-3.0, 3.0] are used to generate expressions, and another 30 data points sampled uniformly in the range of [-5.0, -3.0] ∪ [3.0, 5.0] are used to evaluate the predicted expression.
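
The data budget quoted in the Dataset Splits and Experiment Setup rows can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: `target` is a made-up expression, `placeholder_query` is a uniform-random stand-in for QUOSR's learned query policy (which the paper does not release), and the value of 3 points per query step is inferred from "27 data points in the following 9 query steps".

```python
import random

X_MIN, X_MAX = -3.0, 3.0   # training range quoted in the table
N_INITIAL = 3              # points sampled uniformly before any query
K = 9                      # max query times
POINTS_PER_QUERY = 3       # inferred: 27 queried points / 9 query steps


def target(x):
    # Hypothetical ground-truth expression, for illustration only.
    return x * x + 1.0


def placeholder_query(observed, n, rng):
    # Stand-in for the learned query policy (assumption): ignores the
    # observed data and simply draws n new x values uniformly at random.
    return [rng.uniform(X_MIN, X_MAX) for _ in range(n)]


def collect_training_points(f, rng):
    # Start with 3 uniform points, then add 3 points in each of K steps.
    data = [(x, f(x)) for x in
            (rng.uniform(X_MIN, X_MAX) for _ in range(N_INITIAL))]
    for _ in range(K):
        new_xs = placeholder_query(data, POINTS_PER_QUERY, rng)
        data.extend((x, f(x)) for x in new_xs)
    return data


def sample_eval_points(f, n, rng):
    # Uniform over [-5, -3] U [3, 5]: draw a magnitude in [3, 5],
    # then flip the sign with probability 1/2 (both halves have equal length).
    data = []
    for _ in range(n):
        x = rng.uniform(3.0, 5.0)
        if rng.random() < 0.5:
            x = -x
        data.append((x, f(x)))
    return data


rng = random.Random(0)
train_points = collect_training_points(target, rng)
eval_points = sample_eval_points(target, 30, rng)
print(len(train_points), len(eval_points))
```

The training set totals 3 + 9 × 3 = 30 points inside [-3.0, 3.0], while the evaluation points lie strictly outside that interval's interior, so the evaluation measures extrapolation rather than interpolation.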