LIFT: Language-Interfaced Fine-Tuning for Non-language Machine Learning Tasks
Authors: Tuan Dinh, Yuchen Zeng, Ruisu Zhang, Ziqian Lin, Michael Gira, Shashank Rajput, Jy-yong Sohn, Dimitris Papailiopoulos, Kangwook Lee
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To answer this, we propose Language-Interfaced Fine-Tuning (LIFT) and study its efficacy and limitations by conducting an extensive empirical study on a suite of non-language classification and regression tasks. |
| Researcher Affiliation | Academia | University of Wisconsin-Madison, USA |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/UW-Madison-Lee-Lab/LanguageInterfacedFineTuning. |
| Open Datasets | Yes | For classification, we use three types of non-language data: low-dimensional synthetic datasets, real tabular datasets in OpenML [36], and vision datasets (MNIST [37], Fashion-MNIST [38], and their permuted variants [39]). ... We also use four real datasets: Medical Insurance (Insurance) [41], Combined Cycle Power Plant (CCPP) [42], Servo [43], and Student Performance (Student) [44]. |
| Dataset Splits | Yes | For hyperparameter selection, we apply grid search on a set of parameter values and use cross-validation on the training set (see details in Appendix C.2); a minimal sketch of this procedure appears after the table. |
| Hardware Specification | Yes | For experiments on GPT-J, we used p3.8xlarge and p3.2xlarge instances from AWS and RTX 3090 GPUs on a local server. |
| Software Dependencies | No | The paper mentions using 'LoRA' and the 'OpenAI API' but does not specify version numbers for these or any other software dependencies in the main text. |
| Experiment Setup | Yes | We use the default cross-entropy loss for token prediction in LMs. Our generic template (without feature names and task description) for a sample r with p attributes is: 'When we have x1=r.x1, x2=r.x2, ..., xp=r.xp, what should be y?' (the question), followed by '###' (the question/answer separator), then 'y = r.y' (the answer), ending with '@@@' (the end-of-answer marker); see the sketch after the table. ... For hyperparameter selection, we apply grid search on a set of parameter values and use cross-validation on the training set (see details in Appendix C.2). ... we adjust the generation randomness by increasing the decoding temperature [33, 34, 35] from 0 (deterministic mode) to 0.75 (random mode). |
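The prompt/completion template quoted in the Experiment Setup row can be made concrete with a short sketch. This is a minimal illustration, not the authors' released code: the function names and the model ID are hypothetical, and the query assumes the legacy OpenAI completions endpoint that was current when the paper was published.

```python
import openai

# Minimal sketch of the LIFT template quoted above. Function names, the
# model ID, and the parsing step are illustrative assumptions.

def lift_prompt(x):
    """Serialize a feature vector into the LIFT question, ending with the
    '###' question/answer separator."""
    features = ", ".join(f"x{i + 1}={v}" for i, v in enumerate(x))
    return f"When we have {features}, what should be y?###"

def lift_completion(y):
    """Serialize the target into the LIFT answer, ending with the '@@@'
    end-of-answer marker (also usable as a stop sequence)."""
    return f" y = {y}@@@"

# One fine-tuning pair for a sample with p = 3 attributes:
pair = {"prompt": lift_prompt([0.25, 1.3, -0.7]),
        "completion": lift_completion(42)}

# Querying a fine-tuned model; temperature 0 is the deterministic mode and
# 0.75 the random mode mentioned in the Experiment Setup row.
response = openai.Completion.create(
    model="ft-lift-placeholder",   # hypothetical fine-tuned model ID
    prompt=lift_prompt([0.25, 1.3, -0.7]),
    max_tokens=10,
    temperature=0.0,
    stop="@@@",                    # halt generation at the end-of-answer marker
)
prediction = response["choices"][0]["text"]  # e.g. " y = 41.8"; parse the value after "y ="
```

Using '@@@' as the stop sequence keeps the completion to a single answer, so the returned text only needs light parsing to recover the predicted value.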
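Similarly, the grid-search-with-cross-validation selection described in the Dataset Splits row could be organized as below. The hyperparameter grid and the `fine_tune_and_score` stub are assumptions for illustration; the actual search space is given in the paper's Appendix C.2.

```python
from itertools import product
import numpy as np
from sklearn.model_selection import KFold

# Hypothetical grid; the paper's actual search space is in its Appendix C.2.
GRID = {"n_epochs": [2, 4, 8], "lr_multiplier": [0.05, 0.1, 0.2]}

def fine_tune_and_score(X, y, train_idx, val_idx, params):
    """Stub: fine-tune the LM on X[train_idx] with `params` and return a
    validation score on X[val_idx]. Left unimplemented on purpose."""
    raise NotImplementedError

def grid_search_cv(X, y, n_splits=5):
    """Pick hyperparameters by k-fold cross-validation on the training set
    only, as described in the table above."""
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=0)
    best_params, best_score = None, -np.inf
    for values in product(*GRID.values()):
        params = dict(zip(GRID.keys(), values))
        scores = [fine_tune_and_score(X, y, tr, va, params)
                  for tr, va in kf.split(X)]
        if np.mean(scores) > best_score:
            best_params, best_score = params, float(np.mean(scores))
    return best_params, best_score
```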