Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Logic Distillation: Learning from Code Function by Function for Decision-making Tasks
Authors: Dong Chen, Shilin Zhang, Fei Gao, Yueting Zhuang, Siliang Tang, Qidong Liu, Mingliang Xu
IJCAI 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments demonstrate that with the assistance of LD, S-LLMs can achieve outstanding results in continuous decision-making tasks, comparable to, or even surpassing, those of L-LLMs. The code and data for the proposed method are provided for research purposes https://github.com/Anfeather/Logic-Distillation. |
| Researcher Affiliation | Academia | 1 The School of Computer and Artificial Intelligence of Zhengzhou University; 2 Engineering Research Center of Intelligent Swarm Systems, Ministry of Education; 3 National Supercomputing Center In Zhengzhou; 4 Zhejiang University |
| Pseudocode | Yes | Algorithm 1 Logic Distillation. Input: rules (instructions) x. Parameter: L-LLMs pθL, S-LLMs pθS, retriever pθR. Output: the decision-making outcome [o1, o2, …]. 1: Generate functions f and corresponding user manual u with L-LLMs by Equation 1. 2: Building function base Df with f and u. 3: Initialize O, s. 4: while Decision-making output O of one step does not meet the task requirements do 5: while j in 1, 2, …, J do 6: Retrieve top-K functions [f1, …, fK] with pθR, x and s by Equation 2. 7: S-LLMs select the most suitable function fj from [f1, …, fK] for stage j. 8: Obtaining intermediate results oj by Equation 4. 9: end while 10: O, s = oJ 11: if emergencies then 12: Generate functions fE by Equation 6 and add fE into [f1, …, fK]. 13: end if 14: end while |
| Open Source Code | Yes | The code and data for the proposed method are provided for research purposes https://github.com/Anfeather/Logic-Distillation. |
| Open Datasets | Yes | The code and data for the proposed method are provided for research purposes https://github.com/Anfeather/Logic-Distillation. |
| Dataset Splits | No | To perform KD, we initialize 221 sets of starting positions randomly and produce 103,355 sets of outputs with the L-LLM. Subsequently, we fine-tune the S-LLM with LoRA [Hu et al., 2021] based on these outputs. ... All methods are tested on 200 sets of starting positions. |
| Hardware Specification | No | On the other hand, numerous companies have attempted to develop relatively smaller open-source LLMs, including GLM4-9B [GLM et al., 2024] and LLaMA-7B [Touvron et al., 2023], which are compatible with consumer-grade GPUs like RTX 3090 Ti. In this paper, we refer to LLMs that cannot be deployed on most devices and require invocation through a paid interface as larger LLMs (L-LLMs), in contrast to smaller LLMs (S-LLMs) deployable on consumer-grade GPUs. |
| Software Dependencies | No | To perform KD, we initialize 221 sets of starting positions randomly and produce 103,355 sets of outputs with the L-LLM. Subsequently, we fine-tune the S-LLM with LoRA [Hu et al., 2021] based on these outputs. |
| Experiment Setup | No | More specifically, the pursuit game involves two sides, each controlled by a different LLM. One LLM manages three blue dots, while the other one controls an orange dot. Each interaction between the two sides constitutes a step. In each iteration, the blue dots are constrained to move by two units, while the orange dot is restricted to a single unit of movement. The game concludes when the Manhattan distance between all three blue dots and the orange dot is less than 2 units. ... if LLMs make more than seven illegal choices, the game is considered a failure. The upper limit for the number of moves in the game is capped at 100. |
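
The termination rules quoted in the Experiment Setup row can be read as a small status check. The following is a minimal sketch, not the authors' implementation: it assumes integer grid coordinates, and the names `manhattan`, `is_captured`, `game_status`, the status labels, and the order in which the conditions are tested are all hypothetical choices made for illustration. Only the thresholds (distance less than 2, more than seven illegal choices, a cap of 100 moves) come from the quoted text.

```python
from typing import List, Tuple

Point = Tuple[int, int]  # assumption: positions are integer grid coordinates

CAPTURE_DISTANCE = 2      # quoted rule: Manhattan distance must be less than 2
MAX_ILLEGAL_CHOICES = 7   # quoted rule: more than seven illegal choices is a failure
MAX_MOVES = 100           # quoted rule: the number of moves is capped at 100


def manhattan(a: Point, b: Point) -> int:
    """Manhattan (L1) distance between two grid points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def is_captured(blue: List[Point], orange: Point) -> bool:
    """True when every blue dot is within the capture distance of the orange dot."""
    return all(manhattan(b, orange) < CAPTURE_DISTANCE for b in blue)


def game_status(blue: List[Point], orange: Point,
                moves: int, illegal_choices: int) -> str:
    """Return a hypothetical status label for the current game state.

    The order of the checks is an assumption; the quoted text does not say
    which rule takes precedence when several apply at once.
    """
    if illegal_choices > MAX_ILLEGAL_CHOICES:
        return "failure"
    if is_captured(blue, orange):
        return "captured"
    if moves >= MAX_MOVES:
        return "move_limit"
    return "ongoing"
```

For example, under these assumptions `game_status([(0, 1), (1, 0), (0, -1)], (0, 0), moves=12, illegal_choices=0)` reports a capture, because each blue dot is one unit from the orange dot. A blue dot at exactly two units away would not count, since the quoted rule requires a distance strictly less than 2.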