Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

End-to-End Learning of LTLf Formulae by Faithful LTLf Encoding

Authors: Hai Wan, Pingjia Liang, Jianfeng Du, Weilin Luo, Rongzhen Ye, Bo Peng

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results demonstrate that our approach achieves state-of-the-art performance with up to 7% improvement in accuracy, highlighting the benefits of introducing the faithful LTLf encoding."
Researcher Affiliation | Collaboration | "1 School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, P.R. China; 2 Guangdong University of Foreign Studies, Guangzhou 510006, P.R. China; 3 Bigmath Technology, Shenzhen 518063, P.R. China"
Pseudocode | Yes | "Algorithm 1: Interpreting LTLf Formula"
Open Source Code | Yes | "All the proofs of lemmas/theorems are provided in the technical report available at https://github.com/a79461378945/TLTLf.git."
Open Datasets | Yes | "We reused the datasets that are provided by (Luo et al. 2022)."
Dataset Splits | No | "For each dataset, there is a formula with kf operators, and there are 250/250 positive/negative traces for this formula constituting the training set and 500/500 positive/negative traces for this formula constituting the test set." The paper specifies training and test sets but does not mention a separate validation set.
Hardware Specification | Yes | "All experiments were conducted on a Linux system equipped with an Intel(R) Xeon(R) Gold 6248R processor with 3.0 GHz and 126 GB RAM."
Software Dependencies | No | The paper mentions using "Adam (Kingma and Ba 2015) to optimize the parameters in our model" but does not specify version numbers for Adam or any other software or libraries used in the implementation.
Experiment Setup | Yes | "Settings. All experiments were conducted on a Linux system equipped with an Intel(R) Xeon(R) Gold 6248R processor with 3.0 GHz and 126 GB RAM. The time limit is set to 1 hour and the memory limit set to 10 GB for each instance."