A Bayesian Approach to Data Point Selection
Authors: Xinnuo Xu, Minyoung Kim, Royson Lee, Brais Martinez, Timothy Hospedales
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through controlled experiments in both the vision and language domains, we present a proof-of-concept. Additionally, we demonstrate that our method scales effectively to large language models and facilitates automated per-task optimization for instruction fine-tuning datasets. |
| Researcher Affiliation | Collaboration | Xinnuo Xu (Microsoft Research Cambridge, xinnuoxu@microsoft.com); Minyoung Kim (Samsung AI Center Cambridge, UK, mikim21@gmail.com); Royson Lee (Samsung AI Center Cambridge, UK, royson.lee@samsung.com); Brais Martinez (Samsung AI Center Cambridge, UK, brais.mart@samsung.com); Timothy Hospedales (Samsung AI Center Cambridge, UK; University of Edinburgh, UK, t.hospedales@ed.ac.uk) |
| Pseudocode | No | While the paper presents update equations (6) and (7) that describe an algorithmic process, these are not formatted as a distinct 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | Yes | The code for this paper is available at https://github.com/XinnuoXu/BADS. |
| Open Datasets | Yes | Following the setup in [41], we use the standard MNIST handwritten digit classification dataset [33] to create a class-imbalanced binary classification task. ... Our experiment utilizes the standard CIFAR 10-class classification dataset [30]. ... The English benchmark introduced in WebNLG 2020 [6]... We use the same IFT data as [57, 51] as our train set Dt, which is a mix of FLAN V2 [35], COT [54], DOLLY [10], and OPEN ASSISTANT 1 [31]. Following [57, 5], we focus on four downstream tasks: MMLU [24], which consists of multiple-choice questions across 57 sub-tasks, ARC-challenge/-easy [9], and HellaSwag [61]. |
| Dataset Splits | Yes | A total of 5,000 images from classes 4 and 9 were selected as the train set Dt... A balanced meta set Dm is created by selecting another 25 examples from each of these two classes, ensuring no overlap between Dt and Dm. ... we first create a clean and balanced meta set Dm by randomly sampling 1000 examples from each class in the training data. ... create a single clean and balanced meta set Dm by randomly sampling 30 examples from the WebNLG 2020 validation set in each test domain. ... 5 examples were selected from each sub-task to create the meta set Dm for MMLU, totaling 285 examples. Additionally, following [5], for the other tasks, we randomly chose 25 examples from their validation set to create the respective meta sets. (A minimal construction sketch for the MNIST split appears below the table.) |
| Hardware Specification | Yes | The training is conducted on a single GPU... Training is performed on a single GPU... Training is on a single GPU... Training uses one A40 GPU... The offline scoring in AskLLM-O takes around four hours in our setup on a single NVIDIA A40 GPU. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies such as deep learning frameworks (e.g., PyTorch, TensorFlow) or programming languages (e.g., Python). |
| Experiment Setup | Yes | The training is conducted on a single GPU, using SGD with a fixed learning rate of 1e-3 and a mini-batch size of 100, over a total of 15,000 steps. ... The learning rate for the weight network is 1e-3 and the target sparsity level β is 0.005. Other hyperparameters, including those in the baselines, are detailed in Table 3 (Appendix E). (See the training-loop sketch below the table.) |
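
The "Dataset Splits" row quotes the construction of the class-imbalanced MNIST 4-vs-9 task: 5,000 training images for Dt and a balanced, disjoint 25-per-class meta set Dm. Below is a minimal sketch of that construction, not the authors' code: it assumes torchvision's MNIST loader, and the imbalance ratio `pos_fraction` is illustrative, since the exact ratio is not quoted above.

```python
# Sketch of the class-imbalanced MNIST 4-vs-9 split (D_t) and the balanced,
# disjoint meta set (D_m) described in the "Dataset Splits" row.
# Assumption: torchvision MNIST; `pos_fraction` is illustrative, not quoted.
import torch
from torchvision import datasets, transforms

def build_splits(root="./data", n_train=5_000, n_meta_per_class=25,
                 pos_fraction=0.995, seed=0):
    mnist = datasets.MNIST(root, train=True, download=True,
                           transform=transforms.ToTensor())
    g = torch.Generator().manual_seed(seed)

    # Indices of the two classes (4 vs. 9), shuffled.
    idx4 = (mnist.targets == 4).nonzero(as_tuple=True)[0]
    idx9 = (mnist.targets == 9).nonzero(as_tuple=True)[0]
    idx4 = idx4[torch.randperm(len(idx4), generator=g)]
    idx9 = idx9[torch.randperm(len(idx9), generator=g)]

    # Class-imbalanced train set D_t: 5,000 images in total.
    n4 = int(n_train * pos_fraction)
    train_idx = torch.cat([idx4[:n4], idx9[:n_train - n4]])

    # Balanced meta set D_m: 25 per class, drawn past the train slices
    # so it cannot overlap with D_t.
    meta_idx = torch.cat([idx4[n4:n4 + n_meta_per_class],
                          idx9[n_train - n4:n_train - n4 + n_meta_per_class]])

    return (torch.utils.data.Subset(mnist, train_idx.tolist()),
            torch.utils.data.Subset(mnist, meta_idx.tolist()))
```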
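The "Experiment Setup" row quotes a concrete MNIST training configuration: plain SGD, fixed learning rate 1e-3, mini-batch size 100, 15,000 steps. The sketch below runs exactly that configuration for a generic classifier; it deliberately omits the paper's Bayesian data-weighting updates (its Equations 6 and 7), which are not reproduced here.

```python
# Sketch of the quoted training configuration only: SGD, lr 1e-3,
# batch size 100, 15,000 steps. The paper's data-point-selection
# machinery (weight network, sparsity target beta) is NOT included.
import torch
import torch.nn as nn
import torch.nn.functional as F

def train(model: nn.Module, train_set, steps=15_000, batch_size=100, lr=1e-3):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loader = torch.utils.data.DataLoader(train_set, batch_size=batch_size,
                                         shuffle=True, drop_last=True)
    it = iter(loader)
    for _ in range(steps):
        try:
            x, y = next(it)
        except StopIteration:   # loader exhausted: start a new epoch
            it = iter(loader)
            x, y = next(it)
        opt.zero_grad()
        loss = F.cross_entropy(model(x.flatten(1)), y)
        loss.backward()
        opt.step()
    return model
```

With the splits from the previous sketch, `train(nn.Linear(784, 10), d_t)` runs the quoted configuration end to end; the 10-way head keeps the original MNIST labels (4 and 9) valid as class indices without relabeling.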