Principled Gradient-Based MCMC for Conditional Sampling of Text
Authors: Li Du, Afra Amini, Lucas Torroba Hennigen, Xinyan Velocity Yu, Holden Lee, Jason Eisner, Ryan Cotterell
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through experiments on various forms of text generation, we demonstrate that our unbiased samplers are able to generate more fluent text while better adhering to the control objectives. |
| Researcher Affiliation | Academia | 1Johns Hopkins University 2ETH Zürich 3MIT CSAIL 4University of Southern California. |
| Pseudocode | No | The paper does not contain any clearly labeled "Pseudocode" or "Algorithm" blocks. |
| Open Source Code | No | The paper does not provide any statements about code release or links to a source code repository. |
| Open Datasets | Yes | For topic controlled task...E2E dataset (Novikova et al., 2017). For sentiment controlled task...SST2 dataset of movie reviews (Socher et al., 2013). For position constrained task...COLLIE (Yao et al., 2024). |
| Dataset Splits | No | The paper mentions evaluating classifiers on a "test set" but does not provide specific training, validation, or test dataset splits (e.g., percentages or sample counts) for the primary datasets used in the MCMC experiments or for training the language models/classifiers from scratch. |
| Hardware Specification | No | The paper states only that "the experiments in this work were carried out at the Advanced Research Computing at Hopkins (ARCH) core facility, which is supported by the National Science Foundation (NSF) grant number OAC 1920103," without specifying GPU/CPU models, memory, or node counts. |
| Software Dependencies | No | The paper mentions using "GPT-2 checkpoint from the Huggingface library" but does not specify version numbers for Python, PyTorch, or other key software dependencies. |
| Experiment Setup | Yes | All step sizes are tuned with grid search with a grid resolution of 0.1. For the Toy Example, the inverse temperature β = 0.42 and the sequence length (i.e., the number of spins in the Ising model) is N = 5. The step size for MUCOLA is 1.5, the trajectory length of SVS is 2π, and the step sizes of p-NCG and GwL are both 1.0. |
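To make the quoted toy setting concrete, the sketch below enumerates a 1D Ising model with N = 5 spins at inverse temperature β = 0.42, the values reported in the setup row. The nearest-neighbour coupling form is an assumption for illustration; the paper's exact energy function is not reproduced here. At this size the state space (2^5 = 32 configurations) is small enough to normalize exactly, which is what makes the toy example useful for checking MCMC samplers against the true distribution.

```python
import itertools
import math

# Values quoted from the paper's experiment setup; the energy form is assumed.
BETA = 0.42  # inverse temperature
N = 5        # number of spins

def energy(spins):
    # Assumed nearest-neighbour ferromagnetic coupling over a 1D chain.
    return -sum(spins[i] * spins[i + 1] for i in range(len(spins) - 1))

# Enumerate all 2^N spin configurations and compute the exact partition function.
states = list(itertools.product([-1, 1], repeat=N))
Z = sum(math.exp(-BETA * energy(s)) for s in states)

def prob(spins):
    # Exact target probability, usable as ground truth for sampler diagnostics.
    return math.exp(-BETA * energy(spins)) / Z
```

Because the normalizer Z is exact here, empirical frequencies from any sampler (e.g., the paper's p-NCG or GwL) can be compared directly against `prob`.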