Magicoder: Empowering Code Generation with OSS-Instruct

Authors: Yuxiang Wei, Zhe Wang, Jiawei Liu, Yifeng Ding, Lingming Zhang

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate Magicoder and MagicoderS on a wide range of coding tasks, including HumanEval (Chen et al., 2021) and MBPP (Austin et al., 2021) for Python text-to-code generation, MultiPL-E (Cassano et al., 2022) for multilingual code completion, and DS-1000 (Lai et al., 2022) for solving data science problems.
Researcher Affiliation | Academia | University of Illinois at Urbana-Champaign, USA; Tsinghua University, China.
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. It shows examples of generated code and seed snippets, but these are not presented as pseudocode or formal algorithms.
Open Source Code | Yes | We fully open source the model weights, training data, and source code at https://github.com/ise-uiuc/magicoder to facilitate future research.
Open Datasets | Yes | We directly adopt starcoderdata as our seed corpus, a filtered version of The Stack (Kocetkov et al., 2022) dataset that StarCoder is trained on.
Dataset Splits | No | The paper does not explicitly specify a validation split for its main model training. It mentions finetuning on the 75K and 110K instruction datasets but does not describe how data was partitioned into training, validation, and test sets for its own model development.
Hardware Specification | Yes | We finetune the base models for 2 epochs using two NVIDIA A100-80GB GPUs through the Distributed Data Parallel (DDP) module from PyTorch.
Software Dependencies | No | The paper mentions software such as the "transformers library from Hugging Face" and "PyTorch" but does not provide specific version numbers for these dependencies.
Experiment Setup | Yes | We set the initial learning rate at 5e-5 with 15 warmup steps and a linear scheduler. We use Adafactor (Shazeer & Stern, 2018) as our optimizer and choose a batch size of 512 with a sequence truncation length of 1216. To obtain MagicoderS, we continue to finetune Magicoder models with the evol-codealpaca-v1 dataset, an open-source Evol-Instruct implementation containing about 110K samples. We use the same hyperparameters except for 15 warmup steps and a 1024 maximum sequence length.
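
As context for the Research Type row: HumanEval and MBPP results are reported with the pass@k metric, whose unbiased estimator comes from Chen et al. (2021). Below is a minimal sketch of that estimator; the function name and the toy numbers are illustrative, not taken from the paper.

```python
import math


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator from Chen et al. (2021).

    n: total samples generated per problem
    c: number of those samples that pass the unit tests
    k: sampling budget being evaluated
    """
    if n - c < k:
        return 1.0
    # 1 - C(n - c, k) / C(n, k), computed as a numerically stable product
    return 1.0 - math.prod(1.0 - k / i for i in range(n - c + 1, n + 1))


# Toy example (illustrative numbers): 200 samples per problem, 37 passing
print(pass_at_k(n=200, c=37, k=1))   # 0.185 -> pass@1
print(pass_at_k(n=200, c=37, k=10))  # pass@10
```

For the Open Datasets row, here is a minimal sketch of pulling seed snippets from bigcode/starcoderdata on the Hugging Face Hub. The data_dir, the "content" field, and the 15-line seed window are assumptions about the Hub layout and OSS-Instruct's seed extraction; the paper's actual extraction code lives in the linked repository, and the dataset may require accepting its terms of use before download.

```python
from datasets import load_dataset

# Stream the Python subset so nothing is downloaded in full.
# (data_dir="python" and the "content" field are assumptions about
# the Hub layout of bigcode/starcoderdata.)
stream = load_dataset(
    "bigcode/starcoderdata",
    data_dir="python",
    split="train",
    streaming=True,
)

seed_snippets = []
for row in stream:
    lines = row["content"].splitlines()
    if len(lines) >= 15:                              # skip trivially short files
        seed_snippets.append("\n".join(lines[:15]))   # illustrative 15-line seed window
    if len(seed_snippets) == 3:
        break

print(seed_snippets[0][:200])
```

For the Hardware Specification row, a minimal sketch of the quoted two-GPU DDP setup, assuming a torchrun launch; the script name is a placeholder, and codellama/CodeLlama-7b-Python-hf stands in for whichever base model is being finetuned.

```python
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from transformers import AutoModelForCausalLM

# Assumed launch command:  torchrun --nproc_per_node=2 finetune.py
# ("finetune.py" is a placeholder, not a file from the paper's repository.)


def setup_ddp_model(model_name: str = "codellama/CodeLlama-7b-Python-hf"):
    # torchrun sets LOCAL_RANK/WORLD_SIZE; initialize NCCL across the two GPUs.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)
    model.to(local_rank)
    # Each process holds a full replica; gradients are all-reduced across the 2 GPUs.
    return DDP(model, device_ids=[local_rank])
```

For the Experiment Setup row, a minimal sketch of the quoted hyperparameters expressed as Hugging Face TrainingArguments. How the 512 effective batch size is split into per-device batch size and gradient accumulation across the two GPUs is an assumption, as are the output directory and bf16 precision.

```python
from transformers import TrainingArguments

# Hyperparameters quoted in the Experiment Setup row; the per-device /
# accumulation split of the 512 effective batch size over 2 GPUs is an assumption.
args = TrainingArguments(
    output_dir="magicoder-finetune",   # placeholder path
    num_train_epochs=2,
    learning_rate=5e-5,
    warmup_steps=15,
    lr_scheduler_type="linear",
    optim="adafactor",                 # Adafactor (Shazeer & Stern, 2018)
    per_device_train_batch_size=2,
    gradient_accumulation_steps=128,   # 2 GPUs * 2 per device * 128 steps = 512 effective
    bf16=True,                         # assumption; precision is not stated in the quote
)
# The sequence truncation length (1216, or 1024 for the MagicoderS stage) is
# applied during tokenization/collation rather than through TrainingArguments.
```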
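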
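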
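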