Magicoder: Empowering Code Generation with OSS-Instruct
Authors: Yuxiang Wei, Zhe Wang, Jiawei Liu, Yifeng Ding, Lingming Zhang
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate Magicoder and MagicoderS on a wide range of coding tasks, including HumanEval (Chen et al., 2021) and MBPP (Austin et al., 2021) for Python text-to-code generation, MultiPL-E (Cassano et al., 2022) for multilingual code completion, and DS-1000 (Lai et al., 2022) for solving data science problems. |
| Researcher Affiliation | Academia | University of Illinois at Urbana-Champaign, USA; Tsinghua University, China. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. It shows examples of generated code and seed snippets, but these are not presented as pseudocode or formal algorithms. |
| Open Source Code | Yes | We fully open source the model weights, training data, and source code at https://github.com/ise-uiuc/magicoder to facilitate future research. |
| Open Datasets | Yes | We directly adopt starcoderdata as our seed corpus, a filtered version of The Stack (Kocetkov et al., 2022) dataset that StarCoder is trained on. |
| Dataset Splits | No | The paper does not explicitly specify a validation set split for its main model training. It mentions finetuning on the 75K-sample and 110K-sample instruction datasets but does not describe how data was partitioned into training, validation, and test sets for its own model development. |
| Hardware Specification | Yes | We finetune the base models for 2 epochs using two NVIDIA A100-80GB GPUs through the Distributed Data Parallel (DDP) module from PyTorch. |
| Software Dependencies | No | The paper mentions software like the "transformers library from Hugging Face" and "PyTorch" but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | We set the initial learning rate at 5e-5 with 15 warmup steps and a linear scheduler. We use Adafactor (Shazeer & Stern, 2018) as our optimizer and choose a batch size of 512 with a sequence truncation length of 1216. To obtain MagicoderS, we continue to finetune Magicoder models with the evol-codealpaca-v1 dataset, an open-source Evol-Instruct implementation containing about 110K samples. We use the same hyperparameters except for 15 warmup steps and a 1024 maximum sequence length. |
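
For readers who want a concrete picture of the quoted setup, the sketch below maps the stated hyperparameters (2 epochs, learning rate 5e-5 with 15 warmup steps and a linear scheduler, Adafactor, effective batch size 512, 1216-token truncation, DDP over two A100-80GB GPUs) onto a Hugging Face `Trainer` configuration. It is a minimal reconstruction, not the authors' released script: the base-model identifier, the dataset field names, the per-device batch-size split, and the bf16 setting are assumptions.

```python
# Hypothetical reconstruction of the finetuning configuration quoted above.
# Only the totals (batch size 512, lr 5e-5, 15 warmup steps, linear schedule,
# Adafactor, 1216-token truncation, 2 epochs) come from the paper.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

BASE_MODEL = "bigcode/starcoderbase"  # assumed; the paper finetunes several base models

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # common workaround for code LMs without a pad token
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

# The paper open-sources its 75K OSS-Instruct data; this identifier and the
# "problem"/"solution" field names are assumptions about the released schema.
raw = load_dataset("ise-uiuc/Magicoder-OSS-Instruct-75K", split="train")

def tokenize(example):
    text = example["problem"] + "\n" + example["solution"]
    # Sequence truncation length of 1216 tokens, per the quoted setup.
    return tokenizer(text, truncation=True, max_length=1216)

train_dataset = raw.map(tokenize, remove_columns=raw.column_names)

# Effective batch size 512 across two GPUs: 16 per device x 16 accumulation
# steps x 2 DDP processes (this split is an assumption; only 512 is stated).
args = TrainingArguments(
    output_dir="magicoder-oss-instruct",
    num_train_epochs=2,
    learning_rate=5e-5,
    warmup_steps=15,
    lr_scheduler_type="linear",
    optim="adafactor",
    per_device_train_batch_size=16,
    gradient_accumulation_steps=16,
    bf16=True,  # assumption; typical on A100-80GB
    logging_steps=10,
    save_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

# DDP across the two A100s would be launched with something like:
#   torchrun --nproc_per_node=2 finetune.py
```

Per the quoted setup, the MagicoderS stage would reuse the same configuration with the ~110K evol-codealpaca-v1 samples and a 1024-token maximum sequence length.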