On Structured Prediction Theory with Calibrated Convex Surrogate Losses

Authors: Anton Osokin, Francis Bach, Simon Lacoste-Julien

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We provide novel theoretical insights on structured prediction in the context of efficient convex surrogate loss minimization with consistency guarantees. For any task loss, we construct a convex surrogate that can be optimized via stochastic gradient descent, and we prove tight bounds on the so-called calibration function relating the excess surrogate risk to the actual risk. We propose a theoretical framework that jointly tackles these two aspects and allows us to judge the feasibility of efficient learning. (A sketch of the calibration function is given after this table.)
Researcher Affiliation | Academia | Anton Osokin (INRIA/ENS, Paris, France; HSE, Moscow, Russia); Francis Bach (INRIA/ENS, Paris, France); Simon Lacoste-Julien (MILA and DIRO, Université de Montréal, Canada). Here ENS stands for DI École normale supérieure, CNRS, PSL Research University, and HSE for the National Research University Higher School of Economics.
Pseudocode | No | The paper describes the ASGD update rule in text (equation 9) but does not provide a formally labeled 'Pseudocode' or 'Algorithm' block. (An illustrative ASGD sketch follows this table.)
Open Source Code | No | The paper does not provide any concrete access to source code, such as a repository link or an explicit statement about code release.
Open Datasets | No | The paper is theoretical and does not mention using or providing access to any specific dataset, public or otherwise.
Dataset Splits | No | The paper is theoretical and conducts no experiments on specific datasets, so no training/validation/test split information is provided.
Hardware Specification | No | The paper is purely theoretical and does not describe an experimental setup or the hardware used for computations.
Software Dependencies | No | The paper is purely theoretical and does not specify any software dependencies with version numbers.
Experiment Setup | No | The paper is purely theoretical and does not provide details of an experimental setup, such as hyperparameters or system-level training settings.
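
Note on the calibration function (see the 'Research Type' row). The calibration function ties the excess of the conditional surrogate risk to the excess of the conditional actual risk. A sketch of the standard definition, with notation paraphrased from the paper's setup (the exact symbols may differ), is:

H_{\Phi, L}(\varepsilon) \;=\; \inf_{f \in \mathbb{R}^k,\; q \in \Delta_k} \delta\phi(f, q) \quad \text{subject to} \quad \delta\ell(f, q) \ge \varepsilon,

where \delta\ell(f, q) = \mathbb{E}_{c \sim q}\, L(\mathrm{pred}(f), c) - \min_{c'} \mathbb{E}_{c \sim q}\, L(c', c) is the excess conditional actual risk and \delta\phi(f, q) = \mathbb{E}_{c \sim q}\, \Phi(f, c) - \inf_{g} \mathbb{E}_{c \sim q}\, \Phi(g, c) the excess conditional surrogate risk. A surrogate \Phi is consistent for the task loss L when H_{\Phi, L}(\varepsilon) > 0 for all \varepsilon > 0; the paper's contribution is tight bounds on how fast H_{\Phi, L} grows.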
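
Note on the ASGD update (see the 'Pseudocode' row). Below is a minimal Python sketch of averaged stochastic gradient descent on a quadratic surrogate phi(f, y) = 0.5 * ||f + L(:, y)||^2 with a linear score model. The function name, the linear parameterization f(x) = W x, and the 1/sqrt(t) step-size schedule are illustrative assumptions in the spirit of the paper's equation (9), not the authors' exact construction.

import numpy as np

def asgd_quadratic_surrogate(xs, ys, loss_matrix, n_classes, dim,
                             step=1.0, n_epochs=1, seed=0):
    """Averaged SGD on the quadratic surrogate phi(f, y) = 0.5 * ||f + L[:, y]||^2.

    Illustrative only: the linear scores f(x) = W @ x and the step-size
    schedule are assumptions, not the paper's exact setup (eq. 9).
    """
    rng = np.random.default_rng(seed)
    W = np.zeros((n_classes, dim))      # parameters of the score function
    W_avg = np.zeros_like(W)            # running average of the iterates
    t = 0
    for _ in range(n_epochs):
        for i in rng.permutation(len(xs)):
            x, y = xs[i], ys[i]
            f = W @ x                              # scores for all outputs
            g = f + loss_matrix[:, y]              # gradient of phi w.r.t. f
            W = W - (step / np.sqrt(t + 1)) * np.outer(g, x)
            t += 1
            W_avg += (W - W_avg) / t               # incremental Polyak-Ruppert averaging
    return W_avg

At the surrogate minimizer the scores approximate the negative conditional expected loss, so prediction amounts to y_hat = np.argmax(W_avg @ x_new); the iterate averaging is what gives ASGD its stochastic convergence guarantees.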