Star-Agents: Automatic Data Optimization with LLM Agents for Instruction Tuning
Authors: Hang Zhou, Yehui Tang, Haochen Qin, Yujie Yang, Renren Jin, Deyi Xiong, Kai Han, Yunhe Wang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical studies, including instruction tuning experiments with models such as Pythia and LLaMA, demonstrate the effectiveness of the proposed framework. |
| Researcher Affiliation | Collaboration | Hang Zhou1,2, Yehui Tang2, Haochen Qin2, Yujie Yang2, Renren Jin1, Deyi Xiong1, Kai Han2, Yunhe Wang2. 1College of Intelligence and Computing, Tianjin University, Tianjin, China. 2Huawei Noah's Ark Lab. |
| Pseudocode | No | The paper includes a diagram (Figure 1) but does not provide any pseudocode or algorithm blocks. |
| Open Source Code | No | Codes will be released soon |
| Open Datasets | Yes | In alignment with WizardLM [44], we adopted the Supervised Fine-Tuning (SFT) dataset, designated as the Evol-Instruct dataset, which consists of 70,000 instruction-response pairs. [...] For further enriching our comparative analysis, we employed the Alpaca dataset [32], comprising 52,000 instruction-following samples. |
| Dataset Splits | No | The paper mentions using the Evol-Instruct and Alpaca datasets for fine-tuning but does not explicitly provide details on how these datasets were split into training, validation, and test sets for their experiments, or if specific predefined splits were used for validation. |
| Hardware Specification | No | The paper does not specify the exact hardware used for running the experiments (e.g., specific GPU models, CPU types, or memory sizes). Appendix A.5 discusses computational load of LLM agents, not the experimental hardware. |
| Software Dependencies | No | The paper mentions 'Adam optimizer' and 'Fast-Chat [54]' and 'GPT-4' but does not provide specific version numbers for these or any other software libraries or dependencies used in the experiments. |
| Experiment Setup | Yes | We fine-tuned our models (Pythia-1B and Llama-2-7B) over three epochs using the Adam optimizer, with an initial learning rate of 2 × 10⁻⁵, a maximum token count of 2048, and a batch size of 64. (A hedged configuration sketch follows the table.) |
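
The reported hyperparameters are enough to sketch a plausible fine-tuning configuration with the Hugging Face Trainer. The snippet below is a hedged reconstruction, not the authors' code (which is unreleased): the hub model and dataset IDs, the prompt formatting, the AdamW optimizer variant, and the per-device batch / gradient-accumulation split are all assumptions; only the three epochs, the 2 × 10⁻⁵ learning rate, the 2048-token limit, and the effective batch size of 64 come from the paper.

```python
# Hypothetical reproduction sketch -- not the authors' released code.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

MODEL_NAME = "EleutherAI/pythia-1b"  # or "meta-llama/Llama-2-7b-hf"
MAX_LENGTH = 2048                    # maximum token count reported in the paper

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # Pythia/Llama tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

# Placeholder SFT data: the public Alpaca release (hub ID assumed). The paper's
# optimized Evol-Instruct data cannot be used here while the code is unreleased.
raw = load_dataset("tatsu-lab/alpaca", split="train")

def tokenize(example):
    # Prompt formatting is not described in the paper; plain concatenation is an assumption.
    text = example["instruction"] + "\n" + example["input"] + "\n" + example["output"]
    return tokenizer(text, truncation=True, max_length=MAX_LENGTH)

train_dataset = raw.map(tokenize, remove_columns=raw.column_names)

# Reported hyperparameters: 3 epochs, Adam, lr 2e-5, batch size 64.
# The per-device/accumulation split and the AdamW variant are assumptions.
args = TrainingArguments(
    output_dir="star-agents-sft",
    num_train_epochs=3,
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=8,   # effective batch size of 64
    optim="adamw_torch",
    logging_steps=10,
    save_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Swapping `MODEL_NAME` to the Llama-2-7B checkpoint covers the second reported setting under the same assumptions; the paper does not state the training framework, scheduler, warmup, or precision, so those are left at Trainer defaults here.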