Stochastic Structured Prediction under Bandit Feedback
Authors: Artem Sokolov, Julia Kreutzer, Stefan Riezler, Christopher Lo
NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present an experimental evaluation on problems of natural language processing over exponential output spaces, and compare convergence speed across different objectives under the practical criterion of optimal task performance on development data and the optimization-theoretic criterion of minimal squared gradient norm. |
| Researcher Affiliation | Collaboration | Computational Linguistics & IWR, Heidelberg University, Germany {sokolov,kreutzer,riezler}@cl.uni-heidelberg.de Department of Mathematics, Tufts University, Boston, MA, USA chris.aa.lo@gmail.com Amazon Development Center, Berlin, Germany |
| Pseudocode | Yes | Algorithm 1 Bandit Structured Prediction |
| Open Source Code | No | The paper does not provide explicit access to the source code for the methodology described. |
| Open Datasets | Yes | Domain adaptation from Europarl to News Commentary domains using the data provided at the WMT 2007 shared task is performed for French-to-English translation. (http://www.statmt.org/wmt07/shared-task.html) ... conditional random fields (CRF) are applied to the noun phrase chunking task on the CoNLL-2000 dataset. (http://www.cnts.ua.ac.be/conll2000/chunking/) |
| Dataset Splits | Yes | This instantiates the selection criterion in line (8) of Algorithm 1 to an evaluation of the respective task loss function Δ(ŷ_{w_t}(x)) under MAP prediction ŷ_w(x) = arg max_{y ∈ Y(x)} p_w(y|x) on the development data. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions software like 'cdec' [5] but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | The meta-parameter settings were determined on dev sets for the constant learning rate γ, the clipping constant k, and the ℓ2 regularization constant λ. (Table 1 specifies values for γ, λ, and k for each algorithm and task.) |
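The pseudocode row refers to the paper's Algorithm 1 (Bandit Structured Prediction): sample an output from the model's current distribution, observe only the task loss of that single sampled output, and take a stochastic gradient step on the expected-loss objective. A minimal toy sketch of that loop is given below; the feature map, toy data, learning rate, and loss function are all illustrative assumptions, not the paper's actual experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)

N_X, N_Y = 3, 4      # toy sizes for the input and output spaces (assumption)
gold = [1, 3, 0]     # toy "reference" outputs, used only inside the loss oracle
gamma = 0.5          # constant learning rate, as in the paper's tuning setup

def features(x, y):
    # toy joint feature map: one-hot on the (input, output) pair
    f = np.zeros(N_X * N_Y)
    f[x * N_Y + y] = 1.0
    return f

def bandit_loss(x, y):
    # bandit feedback: a scalar task loss for the sampled output only;
    # the learner never sees the gold structure itself
    return 0.0 if y == gold[x] else 1.0

w = np.zeros(N_X * N_Y)
for t in range(2000):
    x = rng.integers(N_X)
    scores = np.array([w @ features(x, y) for y in range(N_Y)])
    p = np.exp(scores - scores.max())
    p /= p.sum()                                  # log-linear model p_w(y|x)
    y = rng.choice(N_Y, p=p)                      # sample, don't argmax
    delta = bandit_loss(x, y)                     # observed scalar feedback
    exp_feat = sum(p[yy] * features(x, yy) for yy in range(N_Y))
    # stochastic gradient step on the expected-loss objective
    w -= gamma * delta * (features(x, y) - exp_feat)

# after training, MAP predictions should recover the toy references
pred = [int(np.argmax([w @ features(x, y) for y in range(N_Y)]))
        for x in range(N_X)]
```

The key design point, mirrored from the paper's setting, is that the update uses only the loss of the sampled output `Δ(ỹ)` times a score-function term, giving an unbiased gradient of the expected loss without ever enumerating losses over the full output space.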