Stochastic Structured Prediction under Bandit Feedback

Authors: Artem Sokolov, Julia Kreutzer, Stefan Riezler, Christopher Lo

NeurIPS 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present an experimental evaluation on problems of natural language processing over exponential output spaces, and compare convergence speed across different objectives under the practical criterion of optimal task performance on development data and the optimization-theoretic criterion of minimal squared gradient norm.
Researcher Affiliation | Collaboration | Computational Linguistics & IWR, Heidelberg University, Germany ({sokolov,kreutzer,riezler}@cl.uni-heidelberg.de); Department of Mathematics, Tufts University, Boston, MA, USA (chris.aa.lo@gmail.com); Amazon Development Center, Berlin, Germany
Pseudocode | Yes | Algorithm 1: Bandit Structured Prediction
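The pseudocode named above (sample an output from a log-linear model, observe only the task loss of that single output, and take a score-function gradient step) can be sketched as follows. This is a minimal illustrative sketch over a tiny enumerable output space, not the authors' implementation; the function name, the toy feature vectors, and all hyperparameter values below are assumptions made for the example.

```python
import numpy as np

def bandit_structured_prediction(features, env_loss, T=2000, gamma=0.05, seed=0):
    """Sketch of expected-loss bandit learning with a Gibbs model.

    features[i]  -- feature vector phi(x, y_i) of candidate output y_i
    env_loss[i]  -- bandit feedback Delta(y_i) the environment would return
    All arguments and defaults are illustrative, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    Phi = np.asarray(features, dtype=float)   # shape (K, d)
    loss = np.asarray(env_loss, dtype=float)  # shape (K,)
    w = np.zeros(Phi.shape[1])
    for _ in range(T):
        scores = Phi @ w
        p = np.exp(scores - scores.max())
        p /= p.sum()                          # p_w(y|x) proportional to exp(w.phi)
        i = rng.choice(len(p), p=p)           # sample y_hat ~ p_w(.|x)
        feedback = loss[i]                    # bandit feedback: loss of y_hat only
        grad_log_p = Phi[i] - p @ Phi         # phi(x, y_hat) - E_{p_w}[phi]
        w -= gamma * feedback * grad_log_p    # stochastic gradient step
    return w

# Toy problem: 3 candidate outputs, 2 features; y_0 has the lowest loss,
# so learning should drive the model toward predicting y_0 under MAP.
Phi = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
losses = [0.0, 1.0, 0.6]
w = bandit_structured_prediction(Phi, losses)
best = int(np.argmax(np.asarray(Phi) @ w))
```

The update uses the score-function identity ∇_w E_{p_w}[Δ(y)] = E_{p_w}[Δ(y) ∇_w log p_w(y|x)], so only the sampled output's loss is needed per step, matching the bandit feedback setting.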
Open Source Code | No | The paper does not provide explicit access to the source code for the methodology described.
Open Datasets | Yes | Domain adaptation from Europarl to News Commentary domains using the data provided at the WMT 2007 shared task is performed for French-to-English translation. (http://www.statmt.org/wmt07/shared-task.html) ... conditional random fields (CRFs) are applied to the noun phrase chunking task on the CoNLL-2000 dataset. (http://www.cnts.ua.ac.be/conll2000/chunking/)
Dataset Splits | Yes | This instantiates the selection criterion in line (8) of Algorithm 1 to an evaluation of the respective task loss function Δ(ŷ_{w_t}(x)) under MAP prediction ŷ_w(x) = arg max_{y ∈ Y(x)} p_w(y|x) on the development data.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions software such as 'cdec' [5] but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | The meta-parameter settings were determined on dev sets: constant learning rate γ, clipping constant k, and ℓ2 regularization constant λ. (Table 1 specifies the values of γ, λ, and k for each algorithm and task.)