Closing the Curious Case of Neural Text Degeneration
Authors: Matthew Finlayson, John Hewitt, Alexander Koller, Swabha Swayamdipta, Ashish Sabharwal
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct a pilot investigation (§5) to empirically evaluate this basis-aware truncation sampling approach. Our results show improvements on an open-ended generation task via both automatic and human evaluation metrics under low-entropy generation (i.e., close to greedy). |
| Researcher Affiliation | Collaboration | Matthew Finlayson (University of Southern California); John Hewitt (Stanford University); Alexander Koller (Saarland University); Swabha Swayamdipta (University of Southern California); Ashish Sabharwal (The Allen Institute for AI) |
| Pseudocode | Yes | Algorithm 1 gives the procedure for BAT sampling. Algorithm 1 BAT sampling |
| Open Source Code | Yes | Code for experiments: https://github.com/mattf1n/basis-aware-threshold. |
| Open Datasets | Yes | We generate completions for 5000 35-token prefixes taken from the Open Web Text (OWT) (Gokaslan et al., 2019). |
| Dataset Splits | Yes | We perform a parameter sweep for nucleus, η, and ϵ sampling and select the parameter that gives the highest MAUVE score on the OWT validation set (see Table 3 in the appendix). |
| Hardware Specification | No | The paper mentions running experiments with 'GPT-2' and notes that 'No open-source solver we tried was able to solve a single problem in a reasonable amount of time... Proprietary solvers do better in some cases, but only the MOSEK solver (ApS, 2023) was able to solve the full problem in under 1 minute.' However, it does not specify any CPU, GPU, or TPU models, or any other hardware configuration used to run the experiments. |
| Software Dependencies | Yes | Proprietary solvers do better in some cases, but only the MOSEK solver (ApS, 2023) was able to solve the full problem in under 1 minute. ... MOSEK ApS. MOSEK Optimizer API for Python 9.3.22. Version 10.0., 2023. |
| Experiment Setup | Yes | We perform a parameter sweep for nucleus, η, and ϵ sampling and select the parameter that gives the highest MAUVE score on the OWT validation set (see Table 3 in the appendix). We control for the parameter choice in comparisons between BAT methods and their vanilla counterparts by selecting the BAT parameter that rejects the same proportion of tokens from a corpus of human text as the vanilla method; see Appendix F for more details. ... We expose δ as a parameter to tune the restrictiveness of the sampling method. |
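
For readers unfamiliar with the vanilla baselines named in the table, the following is a minimal sketch of ϵ (threshold) truncation and nucleus truncation over a single next-token distribution. It is not the paper's BAT procedure (Algorithm 1) and is not taken from the linked repository; the function names, the toy distribution, and the keep-the-top-token fallback are assumptions made for illustration.

```python
import random


def epsilon_truncate(probs, epsilon):
    """Zero out tokens with probability below epsilon, then renormalize."""
    kept = [p if p >= epsilon else 0.0 for p in probs]
    total = sum(kept)
    if total == 0.0:
        # Fallback (an assumption of this sketch): keep only the top token.
        best = max(range(len(probs)), key=lambda i: probs[i])
        kept = [0.0] * len(probs)
        kept[best] = 1.0
        return kept
    return [p / total for p in kept]


def nucleus_truncate(probs, top_p):
    """Keep the smallest set of most likely tokens whose mass reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = [0.0] * len(probs)
    mass = 0.0
    for i in order:
        kept[i] = probs[i]
        mass += probs[i]
        if mass >= top_p:
            break
    return [p / mass for p in kept]


def sample(probs, rng):
    """Draw one token index from a (truncated) distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Toy next-token distribution over a 4-token vocabulary (hypothetical values).
toy_probs = [0.5, 0.3, 0.15, 0.05]
eps_probs = epsilon_truncate(toy_probs, epsilon=0.1)
nuc_probs = nucleus_truncate(toy_probs, top_p=0.7)
token = sample(nuc_probs, random.Random(0))
```

In a sweep like the one described above, `epsilon` or `top_p` would be varied and the resulting generations scored; the scoring itself (MAUVE in the paper) is outside this sketch.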