Probabilistic Inference in Language Models via Twisted Sequential Monte Carlo

Authors: Stephen Zhao, Rob Brekelmans, Alireza Makhzani, Roger Baker Grosse

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We now illustrate empirically how our framework can be used to evaluate inference through log Zσ bounds and KL divergences between the sampling and target distributions, providing meaningful quantitative comparison between various learning methods. We consider a range of tasks throughout this section, including toxic story generation (as an example of uncovering rare undesirable behavior), generating reviews with varied sentiment, and infilling. For the toxicity and infilling tasks, we consider the Tiny Stories model (Eldan & Li, 2023) as a small-scale model where the generation is coherent, and use the prompt of "Once upon a time, there was a". For the toxicity task, we elicit responses judged to be toxic by the classifier from Corrêa (2023). For the sentiment task, we consider the GPT2-Medium model (Radford et al., 2019) and a classifier trained on Amazon reviews (Li, 2023). (A hedged sketch of estimating these log Z bounds and KL gaps from log importance weights follows the table.)
Researcher Affiliation | Academia | 1University of Toronto, 2Vector Institute. Correspondence to: {stephenzhao, makhzani, rgrosse}@cs.toronto.edu, rob.brekelmans@vectorinstitute.ai.
Pseudocode | Yes | Algorithm 1: (Twisted) SMC Sampling (q^SMC). (A schematic Python sketch of this sampling loop appears after the table.)
Open Source Code | Yes | Our code is available at https://github.com/Silent-Zebra/twisted-smc-lm.
Open Datasets | Yes | For the toxicity and infilling tasks, we consider the Tiny Stories model (Eldan & Li, 2023)... For the toxicity task, we elicit responses judged to be toxic by the classifier from Corrêa (2023). For the sentiment task, we consider the GPT2-Medium model (Radford et al., 2019) and a classifier trained on Amazon reviews (Li, 2023).
Dataset Splits | No | The paper mentions batch sizes and training steps but does not provide specific train/validation/test dataset splits with percentages or counts for its experiments.
Hardware Specification | Yes | All of our experiments were run on a single GPU, usually on an NVIDIA A40 with 48GB memory.
Software Dependencies | No | The paper mentions the use of the Adam optimizer, Hugging Face TRL PPO Trainer, Optax (Flax), and Hugging Face models, but does not provide specific version numbers for these software components.
Experiment Setup | Yes | We use a batch size (number of SMC particles/samples) of 1000, with a learning rate of 0.0001, and train using CTL for a total of 5000 gradient updates. (An illustrative configuration sketch appears after the table.)
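
The log Zσ bounds and KL divergences quoted in the Research Type row can be estimated from log importance weights alone. The snippet below is a minimal sketch under stated assumptions: the helper name `log_z_bounds_and_kls`, the synthetic weights, and the plain-NumPy setting are illustrative choices, not the authors' implementation. The mean log-weight over proposal samples lower-bounds log Zσ, the mean log-weight over exact target samples upper-bounds it, and the gaps to a log Zσ estimate recover KL(q||σ) and KL(σ||q), respectively.

```python
import numpy as np

def log_z_bounds_and_kls(logw_q, logw_sigma, log_z_est):
    """Hypothetical helper (not the authors' code).

    logw_q     : log sigma_tilde(s) - log q(s) on samples s ~ q (proposal)
    logw_sigma : the same log-weight evaluated on exact target samples s ~ sigma
    log_z_est  : an estimate of log Z_sigma (e.g. from an SMC run)
    """
    lower = np.mean(logw_q)          # lower bound on log Z_sigma (Jensen / ELBO-style)
    upper = np.mean(logw_sigma)      # upper bound on log Z_sigma (EUBO-style)
    kl_q_sigma = log_z_est - lower   # estimates KL(q || sigma)
    kl_sigma_q = upper - log_z_est   # estimates KL(sigma || q)
    return lower, upper, kl_q_sigma, kl_sigma_q

# Toy usage with synthetic log-weights (illustration only)
rng = np.random.default_rng(0)
print(log_z_bounds_and_kls(rng.normal(-2, 1, 1000), rng.normal(-1, 1, 1000), -1.5))
```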
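The Pseudocode row refers to Algorithm 1 in the paper; below is a schematic Python rendering of a twisted SMC loop, assuming hypothetical `propose_probs`, `base_logprob`, and `log_twist` callables that stand in for the proposal, the base language model, and the learned twist functions (the authors' actual implementation is the JAX code in the linked repository). Particles are extended one token at a time, reweighted by twist-corrected incremental weights, and resampled; the accumulated log-mean weights give a log Z_sigma estimate of the kind used for the bounds above.

```python
import numpy as np

def logsumexp(x):
    """Numerically stable log-sum-exp."""
    m = np.max(x)
    return m + np.log(np.sum(np.exp(x - m)))

def twisted_smc(propose_probs, base_logprob, log_twist, vocab_size, T, K, rng):
    """Schematic twisted SMC sampler (illustrative, not the authors' implementation).

    propose_probs(prefixes)      -> (K, vocab_size) next-token proposal probabilities
    base_logprob(prefixes, toks) -> (K,) log p0(tok | prefix) under the base LM
    log_twist(seqs, t)           -> (K,) log psi_t(seq); at the final step this would
                                    typically be the true terminal log-potential
    Returns the K final particles and a log Z_sigma estimate.
    """
    seqs = np.zeros((K, 0), dtype=int)
    prev_log_twist = np.zeros(K)   # log psi_0 taken as 0
    log_z_hat = 0.0
    for t in range(T):
        probs = propose_probs(seqs)                        # (K, V)
        toks = np.array([rng.choice(vocab_size, p=p) for p in probs])
        new_seqs = np.concatenate([seqs, toks[:, None]], axis=1)
        # incremental weight: p0 * (psi_t / psi_{t-1}) / q
        log_w = (base_logprob(seqs, toks)
                 + log_twist(new_seqs, t) - prev_log_twist
                 - np.log(probs[np.arange(K), toks]))
        log_z_hat += logsumexp(log_w) - np.log(K)          # running log Z estimate
        # multinomial resampling proportional to the incremental weights
        w = np.exp(log_w - logsumexp(log_w))
        idx = rng.choice(K, size=K, p=w)
        seqs = new_seqs[idx]
        prev_log_twist = log_twist(seqs, t)
    return seqs, log_z_hat

# Toy sanity check: uniform base model, uniform proposal, trivial twists -> log Z ~ 0
V, T, K = 5, 4, 8
rng = np.random.default_rng(0)
particles, log_z = twisted_smc(
    propose_probs=lambda prefixes: np.full((K, V), 1.0 / V),
    base_logprob=lambda prefixes, toks: np.full(K, -np.log(V)),
    log_twist=lambda seqs, t: np.zeros(K),
    vocab_size=V, T=T, K=K, rng=rng,
)
print(particles.shape, round(log_z, 6))   # (8, 4) ~0.0
```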
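Finally, the hyperparameters quoted in the Experiment Setup row can be collected into a small run configuration. The sketch below is illustrative only: the field names are assumptions, and only the numeric values (batch size 1000, learning rate 0.0001, 5000 CTL gradient updates) come from the quoted text.

```python
# Illustrative configuration; field names are assumptions, values come from the quoted setup.
config = {
    "n_particles": 1000,             # batch size = number of SMC particles/samples
    "learning_rate": 1e-4,
    "twist_learning_method": "CTL",  # contrastive twist learning
    "n_gradient_updates": 5000,
}
```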