Discovering Latent Knowledge in Language Models Without Supervision
Authors: Collin Burns, Haotian Ye, Dan Klein, Jacob Steinhardt
ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Evaluated across 6 models and 10 question-answering datasets, CCS outperforms strong zero-shot baselines in accuracy by 4% on average (Section 3.2.1). We systematically analyze CCS to understand the features it discovers. |
| Researcher Affiliation | Academia | Collin Burns (UC Berkeley), Haotian Ye (Peking University), Dan Klein (UC Berkeley), Jacob Steinhardt (UC Berkeley) |
| Pseudocode | Yes | Algorithm 1 Pseudocode for Getting Contrast Features (a hedged sketch of this step follows the table). |
| Open Source Code | Yes | We provide code at https://www.github.com/collin-burns/discovering_latent_knowledge. |
| Open Datasets | Yes | We test models on 10 datasets: sentiment classification (IMDB (Maas et al., 2011) and Amazon (McAuley & Leskovec, 2013)), topic classification (AG-News (Zhang et al., 2015) and DBpedia-14 (Lehmann et al., 2015)), NLI (RTE (Wang et al., 2018) and QNLI (Rajpurkar et al., 2016)), story completion (COPA (Roemmele et al., 2011) and Story-Cloze (Mostafazadeh et al., 2017)), question answering (BoolQ (Clark et al., 2019)), and common sense reasoning (PIQA (Bisk et al., 2020)). |
| Dataset Splits | No | We randomly split each dataset into an unsupervised training set (60% of the data) and a test set (40%). The paper does not mention a separate validation split. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory used for running experiments. |
| Software Dependencies | No | The paper mentions the Hugging Face library and the AdamW optimizer but does not provide version numbers for these or for other key dependencies such as Python or PyTorch. |
| Experiment Setup | Yes | When testing CCS, we optimize it 10 times using AdamW (Loshchilov & Hutter, 2017) with learning rate 0.01, then take the run with the lowest unsupervised loss. ... In practice, we train each time for E = 1000 epochs with a learning rate η = 0.01 (which we found was good for consistently achieving low unsupervised loss) in each run. (A training-loop sketch under these settings follows the table.) |
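
The Pseudocode row above points to the paper's Algorithm 1 for building contrast features. The sketch below illustrates the idea with PyTorch and Hugging Face Transformers: for each question, form a "Yes" completion and a "No" completion and extract a hidden state for each. The model name (`roberta-large`), prompt template, and choice of layer/token are illustrative assumptions, not the paper's exact configuration.

```python
# Hedged sketch of contrast-feature extraction (cf. Algorithm 1 in the paper).
# Model name, prompt template, and layer/token choice are assumptions.
import torch
from transformers import AutoTokenizer, AutoModel

MODEL_NAME = "roberta-large"  # assumption: any model exposing hidden states
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME).eval()

def hidden_state(text: str) -> torch.Tensor:
    """Return the last-layer hidden state of the final token."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
    return out.hidden_states[-1][0, -1]  # shape: (hidden_dim,)

def contrast_pair(question: str) -> tuple[torch.Tensor, torch.Tensor]:
    """Build the 'Yes' / 'No' contrast features for one example."""
    x_pos = f"{question}\nAnswer: Yes"   # assumed template for illustration
    x_neg = f"{question}\nAnswer: No"
    return hidden_state(x_pos), hidden_state(x_neg)
```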
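The Experiment Setup row gives the optimization details (AdamW, learning rate 0.01, 1000 epochs, 10 random restarts, keep the run with the lowest unsupervised loss). Below is a minimal sketch of a CCS-style training loop under those settings; the linear sigmoid probe, per-contrast-set normalization, and consistency-plus-confidence loss follow the paper's description of CCS, while the function and variable names are assumptions for illustration.

```python
# Hedged sketch of CCS probe training matching the setup row above:
# AdamW, lr 0.01, 1000 epochs, 10 restarts, keep the lowest-loss run.
import torch
import torch.nn as nn

def normalize(x: torch.Tensor) -> torch.Tensor:
    """Normalize each contrast set independently (zero mean, unit std per dim)."""
    return (x - x.mean(dim=0)) / (x.std(dim=0) + 1e-8)

def ccs_loss(p_pos: torch.Tensor, p_neg: torch.Tensor) -> torch.Tensor:
    consistency = (p_pos - (1 - p_neg)) ** 2   # a statement and its negation should sum to 1
    confidence = torch.min(p_pos, p_neg) ** 2  # discourage the degenerate p_pos = p_neg = 0.5
    return (consistency + confidence).mean()

def train_ccs(h_pos: torch.Tensor, h_neg: torch.Tensor,
              n_tries: int = 10, epochs: int = 1000, lr: float = 0.01):
    """Train a linear probe on contrast features; return the lowest-loss probe."""
    h_pos, h_neg = normalize(h_pos), normalize(h_neg)
    best_probe, best_loss = None, float("inf")
    for _ in range(n_tries):
        probe = nn.Sequential(nn.Linear(h_pos.shape[1], 1), nn.Sigmoid())
        opt = torch.optim.AdamW(probe.parameters(), lr=lr)
        for _ in range(epochs):
            opt.zero_grad()
            loss = ccs_loss(probe(h_pos).squeeze(-1), probe(h_neg).squeeze(-1))
            loss.backward()
            opt.step()
        with torch.no_grad():
            final_loss = ccs_loss(probe(h_pos).squeeze(-1),
                                  probe(h_neg).squeeze(-1)).item()
        if final_loss < best_loss:
            best_probe, best_loss = probe, final_loss
    return best_probe, best_loss
```

At test time, the paper averages p(x⁺) and 1 − p(x⁻) as the prediction for an example; because the loss is symmetric in the two contrast features, the probe's direction is only identified up to negation.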