Confucius: Iterative Tool Learning from Introspection Feedback by Easy-to-Difficult Curriculum
Authors: Shen Gao, Zhengliang Shi, Minghang Zhu, Bowen Fang, Xin Xin, Pengjie Ren, Zhumin Chen, Jun Ma, Zhaochun Ren
AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments conducted on both controlled and real-world settings demonstrate the superiority of our tool learning framework in real-world application scenarios compared to both tuning-free (e.g., ChatGPT, Claude) and tuning-based baselines (e.g., GPT4Tools). |
| Researcher Affiliation | Academia | Shen Gao¹*, Zhengliang Shi¹*, Minghang Zhu¹, Bowen Fang¹, Xin Xin¹, Pengjie Ren¹, Zhumin Chen¹, Jun Ma¹, Zhaochun Ren². ¹Shandong University, Qingdao, China; ²Leiden University, Leiden, The Netherlands |
| Pseudocode | No | The paper describes its methods verbally and through equations but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide a direct link to a source code repository or an explicit statement about the release of their implementation code. |
| Open Datasets | No | The paper describes constructing its own tool-use dataset by prompting ChatGPT and manually building a seed instance pool, but it does not provide concrete access information (link, DOI, or specific citation) indicating that this dataset is publicly available. |
| Dataset Splits | No | The paper mentions using a 'training set' and 'test sets' ('Seen' and 'Unseen' each with 2,000 instances), but it does not specify the overall dataset size or explicit percentages/counts for training, validation, and testing splits, nor does it refer to predefined splits with citations. |
| Hardware Specification | Yes | The training of our model can be done within 20 hours with 4 NVIDIA A100-PCIE-80GB GPUs. |
| Software Dependencies | No | The paper mentions the 'DeepSpeed ZeRO strategy' and 'LLaMA-7B' as the base model, but does not provide specific version numbers for software libraries or dependencies like Python, PyTorch, or DeepSpeed itself. |
| Experiment Setup | Yes | We optimize the model using the DeepSpeed ZeRO strategy (Rasley et al. 2020) with a learning rate of 5e-5 and a weight decay coefficient of 0.01. *(See the config sketch below the table.)* |
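
For concreteness, the reported setup maps onto a DeepSpeed configuration roughly as follows. This is a minimal sketch, not the authors' actual config: only the learning rate (5e-5), the weight decay (0.01), and the use of DeepSpeed ZeRO come from the paper; the ZeRO stage, batch sizes, and precision settings are assumptions made for illustration.

```python
# Minimal sketch of a DeepSpeed config matching the reported hyperparameters.
# Only lr=5e-5 and weight_decay=0.01 are from the paper; everything else
# (ZeRO stage, batch sizes, precision) is an assumption for illustration.
ds_config = {
    "train_micro_batch_size_per_gpu": 4,   # assumption: not reported
    "gradient_accumulation_steps": 8,      # assumption: not reported
    "optimizer": {
        "type": "AdamW",
        "params": {
            "lr": 5e-5,                    # reported learning rate
            "weight_decay": 0.01,          # reported weight decay
        },
    },
    "zero_optimization": {"stage": 2},     # paper says "ZeRO" without a stage
    "bf16": {"enabled": True},             # assumption: A100 GPUs support bf16
}

# Typical usage (requires the `deepspeed` package and a torch model):
#   model_engine, optimizer, _, _ = deepspeed.initialize(
#       model=model, model_parameters=model.parameters(), config=ds_config
#   )
```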