Out-of-Distribution Detection and Selective Generation for Conditional Language Models

Authors: Jie Ren, Jiaming Luo, Yao Zhao, Kundan Krishna, Mohammad Saleh, Balaji Lakshminarayanan, Peter J Liu

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present a highly accurate and lightweight OOD detection method for CLMs, and demonstrate its effectiveness on abstractive summarization and translation.
Researcher Affiliation | Collaboration | 1) Google Research; 2) Carnegie Mellon University (work done while at Google Research)
Pseudocode | Yes | Algorithm 1: Fitting Gaussians for input and output embeddings (a minimal sketch of this step follows the table).
Open Source Code | No | Not found. The paper neither states that source code for the described method is released nor links to a code repository.
Open Datasets | Yes | We fine-tuned PEGASUS-LARGE (Zhang et al., 2020) on the xsum (Narayan et al., 2018) dataset, consisting of BBC News articles with short, abstractive summaries.
Dataset Splits | No | Not found. The paper describes training and test datasets but does not state the validation splits, their sizes, or the splitting strategy needed for reproducibility.
Hardware Specification | No | Not found. The paper does not specify the hardware used for the experiments (e.g., GPU/TPU models, CPU types, or cloud instances).
Software Dependencies | No | Not found. The paper mentions the Adafactor optimizer but does not list software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | The model is trained with the Adafactor optimizer (Shazeer & Stern, 2018) for 2M steps with 0.1 dropout and a batch size of 1024. Decoding uses beam search with beam size 10 and α = 0.6 length normalization (Wu et al., 2016b). (A configuration sketch mapping these settings to code follows the table.)