Learning to Reject with a Fixed Predictor: Application to Decontextualization

Authors: Christopher Mohri, Daniel Andor, Eunsol Choi, Michael Collins, Anqi Mao, Yutao Zhong

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "For evaluation, we choose the decontextualization task, and provide a manually-labelled dataset of 2,000 examples. Our algorithm significantly outperforms the baselines considered, with a 25% improvement in coverage when halving the error rate, which is only 3% away from the theoretical limit."
Researcher Affiliation | Collaboration | Christopher Mohri (1), Daniel Andor (2), Eunsol Choi (3), Michael Collins (2), Anqi Mao (4), Yutao Zhong (4); (1) Stanford University, (2) Google, (3) The University of Texas at Austin, (4) Courant Institute
Pseudocode | No | The paper does not contain any explicitly labeled pseudocode or algorithm blocks. Methods are described in prose.
Open Source Code | No | The paper does not provide an explicit statement about releasing source code for their methodology or a link to a code repository.
Open Datasets | Yes | "For our experiments, we labeled 2,000 decontextualizations of a fixed MT5 XXL model (Xue et al., 2020) ourselves... We randomly split our 2,000 annotation examples into 1,500 train/500 validation examples and perform 4-fold cross-validation... We provide additional empirical evaluation on two simpler image classification datasets: Fashion-MNIST (Xiao et al., 2017) and KMNIST (Clanuwat et al., 2018)."
Dataset Splits | Yes | "We randomly split our 2,000 annotation examples into 1,500 train/500 validation examples and perform 4-fold cross-validation."
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU specifications, or memory amounts.
Software Dependencies | Yes | "We further fine-tune a T5X 1.1 XXL decontextualization model (Roberts et al., 2022)..."
Experiment Setup | Yes | "We perform a hyper-parameter search over {1e-4, 1e-3, 1e-2} for the learning rate, and {0, 0.05, ..., 0.2} for the dropout rate."
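As a concrete illustration of the Dataset Splits row, the sketch below partitions 2,000 example indices into 4 folds of 500, so that each fold serves once as the 500-example validation set against the remaining 1,500 training examples. This is one plausible reading of the paper's description; the random seed, the use of scikit-learn, and all variable names are assumptions for illustration, not the authors' code.

```python
# Minimal sketch of a 1,500/500 split with 4-fold cross-validation over
# 2,000 labeled decontextualization examples. Seed, library choice, and
# names are illustrative assumptions, not taken from the paper.
import numpy as np
from sklearn.model_selection import KFold

rng = np.random.default_rng(seed=0)      # fixed seed so folds are reproducible
indices = rng.permutation(2000)          # the 2,000 annotated examples, shuffled once

kfold = KFold(n_splits=4, shuffle=False) # 4 folds of 500 examples each
for fold, (train_idx, val_idx) in enumerate(kfold.split(indices)):
    train_ids = indices[train_idx]       # 1,500 training examples
    val_ids = indices[val_idx]           # 500 validation examples
    print(f"fold {fold}: {len(train_ids)} train / {len(val_ids)} validation")
```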
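The Experiment Setup row specifies a grid of three learning rates and a dropout range written as {0, 0.05, ..., 0.2}, which we read here as steps of 0.05 (five values). The sketch below enumerates that 15-point grid; train_and_evaluate is a hypothetical placeholder standing in for the authors' fine-tuning and validation loop, which the paper does not release.

```python
# Minimal sketch of the reported hyper-parameter grid search. Only the two
# grids come from the paper; the training/evaluation function is a stub.
from itertools import product

learning_rates = [1e-4, 1e-3, 1e-2]
dropout_rates = [0.0, 0.05, 0.10, 0.15, 0.20]  # reading "{0, 0.05, ..., 0.2}" as steps of 0.05


def train_and_evaluate(lr: float, dropout: float) -> float:
    """Placeholder: train the model with these settings and return a
    validation metric. Not from the paper; replace with the real loop."""
    return 0.0


best = None
for lr, dropout in product(learning_rates, dropout_rates):
    score = train_and_evaluate(lr=lr, dropout=dropout)
    if best is None or score > best[0]:
        best = (score, lr, dropout)

print(f"best score {best[0]:.4f} at lr={best[1]}, dropout={best[2]}")
```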