Learning to Select from Multiple Options
Authors: Jiangshu Du, Wenpeng Yin, Congying Xia, Philip S. Yu
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our methods are evaluated on three tasks (ultra-fine entity typing, intent detection, and multiple-choice QA) that are typical selection problems with different sizes of options. Experiments show our models set new SOTA performance; in particular, Parallel-TE is k times faster than the pairwise TE in inference. |
| Researcher Affiliation | Collaboration | Jiangshu Du1, Wenpeng Yin2, Congying Xia3, Philip S. Yu1 1University of Illinois at Chicago, Chicago, IL, USA 2Penn State University, State College, PA, USA 3Salesforce Research, Palo Alto, CA, USA |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | No | The paper does not provide an explicit statement or link for the release of open-source code for the described methodology. |
| Open Datasets | Yes | Our experiments are conducted on three different tasks: ultrafine entity typing, few-shot intent detection and multiple-choice QA. We choose the three tasks since they represent different selection problems in NLP. Ultra-fine entity typing is a multi-label task with a large option space: over 10,000 entity types. The few-shot intent detection task evaluates our proposed models under few-shot selection scene. Multiple-choice QA is a selection problem which requires the model to understand long paragraphs. |
| Dataset Splits | Yes | The annotated examples are equally split into train, dev, and test. From the training data, we randomly sample 5-shot and 10-shot instances per intent as our respective training sets. We also sample a small portion of the training dataset as our dev set, following the previous setting (Zhang et al. 2021; Mehri, Eric, and Hakkani-Tür 2020). |
| Hardware Specification | Yes | Experiments run on an NVIDIA TITAN RTX. The inference speed is measured on an NVIDIA GeForce RTX 3090 with an evaluation batch size of 256. |
| Software Dependencies | No | The paper mentions using RoBERTa, BERT, and Sentence-BERT models but does not provide specific version numbers for underlying software dependencies such as Python, PyTorch, or TensorFlow. |
| Experiment Setup | Yes | The hyperparameters, threshold τ, and k are searched on the dev set for each task. The inference speed is measured on an NVIDIA GeForce RTX 3090 with an evaluation batch size of 256. We train both models 5 epochs and report the test accuracy at the end of each epoch. |
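The Dataset Splits row describes sampling 5-shot and 10-shot instances per intent from the training data. The paper does not release code for this step, so the following is only a minimal sketch of per-class k-shot sampling under assumed names (`sample_k_shot`, `examples` as `(text, intent)` pairs); it is not the authors' implementation.

```python
import random
from collections import defaultdict

def sample_k_shot(examples, k, seed=0):
    """Randomly sample up to k examples per intent label.

    `examples` is a list of (text, intent) pairs; the function name
    and data layout are illustrative assumptions, not from the paper.
    """
    rng = random.Random(seed)  # fixed seed for a reproducible subset
    by_intent = defaultdict(list)
    for text, intent in examples:
        by_intent[intent].append((text, intent))
    subset = []
    for intent, items in by_intent.items():
        # guard against intents with fewer than k training examples
        subset.extend(rng.sample(items, min(k, len(items))))
    return subset

# e.g. sample_k_shot(train_examples, 5) for the 5-shot setting,
#      sample_k_shot(train_examples, 10) for the 10-shot setting
```

A seeded `random.Random` instance keeps the sampled few-shot split stable across runs, which matters when comparing models on the same subset.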