Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
End-to-End Learning of LTLf Formulae by Faithful LTLf Encoding
Authors: Hai Wan, Pingjia Liang, Jianfeng Du, Weilin Luo, Rongzhen Ye, Bo Peng
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that our approach achieves state-of-the-art performance with up to 7% improvement in accuracy, highlighting the benefits of introducing the faithful LTLf encoding. |
| Researcher Affiliation | Collaboration | ¹School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, P.R. China; ²Guangdong University of Foreign Studies, Guangzhou 510006, P.R. China; ³Bigmath Technology, Shenzhen 518063, P.R. China |
| Pseudocode | Yes | Algorithm 1: Interpreting LTLf Formula |
| Open Source Code | Yes | All the proofs of lemmas/theorems are provided in the technical report available at https://github.com/a79461378945/TLTLf.git. |
| Open Datasets | Yes | We reused the datasets that are provided by (Luo et al. 2022). |
| Dataset Splits | No | For each dataset, there is a formula with kf operators, and there are 250/250 positive/negative traces for this formula constituting the training set and 500/500 positive/negative traces for this formula constituting the test set. The paper specifies training and test sets but does not mention a separate validation set. |
| Hardware Specification | Yes | All experiments were conducted on a Linux system equipped with an Intel(R) Xeon(R) Gold 6248R processor with 3.0 GHz and 126 GB RAM. |
| Software Dependencies | No | The paper mentions using 'Adam (Kingma and Ba 2015) to optimize the parameters in our model' but does not specify version numbers for Adam or any other software or libraries used in the implementation. |
| Experiment Setup | Yes | Settings. All experiments were conducted on a Linux system equipped with an Intel(R) Xeon(R) Gold 6248R processor with 3.0 GHz and 126 GB RAM. The time limit is set to 1 hour and the memory limit set to 10 GB for each instance. |
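The per-instance limits reported above (1 hour of time, 10 GB of memory) can be enforced on Linux with the standard `resource` module. This is a minimal sketch, not the authors' harness; the function name and the choice of capping CPU time and address space are assumptions.

```python
import resource

# Limits as stated in the paper's settings: 1 hour and 10 GB per instance.
TIME_LIMIT_S = 3600
MEMORY_LIMIT_B = 10 * 1024 ** 3

def apply_limits(time_s=TIME_LIMIT_S, mem_b=MEMORY_LIMIT_B):
    """Cap CPU time and address space for the current process.

    Hypothetical helper: call it in a child process (e.g. via
    subprocess preexec_fn) before running one benchmark instance.
    """
    resource.setrlimit(resource.RLIMIT_CPU, (time_s, time_s))
    resource.setrlimit(resource.RLIMIT_AS, (mem_b, mem_b))
```

Exceeding the CPU limit delivers SIGXCPU, and allocations beyond the address-space cap fail with `MemoryError`, so a run that blows either budget is terminated rather than skewing the measurements.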