Learning Beam Search Policies via Imitation Learning
Authors: Renato Negrinho, Matthew Gormley, Geoffrey J. Gordon
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Our contributions are: an algorithm for learning beam search policies (Section 4.2) with accompanying regret guarantees (Section 5), a meta-algorithm that captures much of the existing literature (Section 4), and new theoretical results for the early update [6] and LaSO [7] algorithms (Section 5.3). |
| Researcher Affiliation | Collaboration | ¹Machine Learning Department, Carnegie Mellon University; ²Microsoft Research |
| Pseudocode | Yes | Algorithm 1 (Beam Search); Algorithm 2 (Meta-algorithm) |
| Open Source Code | No | The paper does not include any statements about releasing code or links to a code repository. |
| Open Datasets | No | The paper defines input-output training pairs D = {(x1, y1), . . . , (xm, ym)} as part of its theoretical framework, but does not name any publicly available dataset used for training. |
| Dataset Splits | No | The paper mentions 'return best θt on validation' within its meta-algorithm (Algorithm 2) but, consistent with its theoretical focus, does not report any empirical validation split or split methodology. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models, memory, or cloud instance types used for experiments. |
| Software Dependencies | No | The paper mentions 'Adam [18]' as an online optimization algorithm but does not specify its version number or any other software dependencies with their versions. |
| Experiment Setup | No | The paper does not include specific experimental setup details such as hyperparameter values, training configurations, or system-level settings, consistent with its theoretical focus. |
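Since the table notes that the paper provides pseudocode for beam search (Algorithm 1), a minimal generic sketch of beam search may help orient readers. This is an illustrative implementation under assumed interfaces (the `expand`, `score`, and `is_goal` callbacks are hypothetical), not a reproduction of the paper's Algorithm 1 or its learned policies.

```python
def beam_search(start, expand, score, is_goal, beam_width=3, max_steps=10):
    """Generic beam search: keep the `beam_width` highest-scoring
    hypotheses at each step, expanding incomplete ones."""
    beam = [start]
    for _ in range(max_steps):
        candidates = []
        for hyp in beam:
            if is_goal(hyp):
                candidates.append(hyp)      # carry completed hypotheses forward
            else:
                candidates.extend(expand(hyp))
        # Prune: retain only the top-scoring hypotheses.
        beam = sorted(candidates, key=score, reverse=True)[:beam_width]
        if all(is_goal(h) for h in beam):
            break
    return max(beam, key=score)
```

For example, searching over strings built from `"abc"` with a score counting positional matches against a target recovers the target with a narrow beam.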