Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
LIFT: Language-Interfaced Fine-Tuning for Non-language Machine Learning Tasks
Authors: Tuan Dinh, Yuchen Zeng, Ruisu Zhang, Ziqian Lin, Michael Gira, Shashank Rajput, Jy-yong Sohn, Dimitris Papailiopoulos, Kangwook Lee
NeurIPS 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To answer this, we propose Language-Interfaced Fine-Tuning (LIFT) and study its efficacy and limitations by conducting an extensive empirical study on a suite of non-language classification and regression tasks. |
| Researcher Affiliation | Academia | University of Wisconsin-Madison, USA |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/UW-Madison-Lee-Lab/LanguageInterfacedFineTuning. |
| Open Datasets | Yes | For classification, we use three types of non-language data: low-dimensional synthetic datasets, real tabular datasets in OpenML [36], and vision datasets (MNIST [37], Fashion-MNIST [38] and their permuted variants [39]). ... We also use four real datasets: Medical Insurance (Insurance) [41], Combined Cycle Power Plant (CCPP) [42], Servo [43], and Student Performance (Student) [44]. |
| Dataset Splits | Yes | For hyperparameter selection, we apply the grid search on a set of parameters values and use cross-validation on the training set (see details in Appendix C.2). |
| Hardware Specification | Yes | For experiments on GPT-J, we used p3.8xlarge and p3.2xlarge instances from AWS and RTX3090 GPUs in the local server. |
| Software Dependencies | No | The paper mentions using 'LoRA' and the 'OpenAI API' but does not specify version numbers for these or any other software dependencies in the main text. |
| Experiment Setup | Yes | We use the default cross-entropy loss for token prediction in LMs. Our generic template (without feature names and task description) for sample r is `When we have x1=r.x1, x2=r.x2, ..., xp=r.xp, what should be y?` (question) `###` (q/a separator) `y = r.y` (answer) `@@@` (end of answer) if r has p attributes. ... For hyperparameter selection, we apply the grid search on a set of parameters values and use cross-validation on the training set (see details in Appendix C.2). ... we adjust the generation randomness by increasing the decoding temperature [33, 34, 35] from 0 (deterministic mode) to 0.75 (random mode). |
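
The generic template quoted in the Experiment Setup row can be illustrated with a minimal Python sketch. This is not the authors' code: the function names (`format_prompt`, `format_completion`, `parse_completion`) are hypothetical, and the exact whitespace around the `###` and `@@@` markers is an assumption, since the quoted excerpt does not specify it.

```python
from typing import Sequence

QA_SEPARATOR = "###"   # q/a separator quoted in the table
END_OF_ANSWER = "@@@"  # end-of-answer marker quoted in the table


def format_prompt(features: Sequence[float]) -> str:
    """Build the question part of the template for one sample with p features."""
    # Feature names follow the generic x1..xp naming from the quoted template.
    assignments = ", ".join(f"x{i}={v}" for i, v in enumerate(features, start=1))
    # Assumption: no whitespace between the question and the separator.
    return f"When we have {assignments}, what should be y?{QA_SEPARATOR}"


def format_completion(target: object) -> str:
    """Build the answer part of the template, closed by the end-of-answer marker."""
    return f"y = {target}{END_OF_ANSWER}"


def parse_completion(text: str) -> str:
    """Recover the raw answer string from generated text (hypothetical helper)."""
    # Keep only what precedes the first end-of-answer marker.
    answer = text.split(END_OF_ANSWER, 1)[0]
    # Drop the leading "y =" if present and trim surrounding whitespace.
    return answer.replace("y =", "", 1).strip()
```

For a sample with features `[1.5, 2]` and target `0`, `format_prompt` and `format_completion` together give one prompt/completion pair for fine-tuning, and `parse_completion` turns generated text back into a value that can be cast to a class label or a number.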