A Smoother Way to Train Structured Prediction Models
Authors: Venkata Krishna Pillutla, Vincent Roulet, Sham M. Kakade, Zaid Harchaoui
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present experimental results on two real-world problems, namely named entity recognition and visual object localization. The experimental results show that the proposed framework allows us to build upon efficient inference algorithms to develop large-scale optimization algorithms for structured prediction which can achieve competitive performance on the two real-world problems. |
| Researcher Affiliation | Academia | Krishna Pillutla, Vincent Roulet, Sham M. Kakade, Zaid Harchaoui; Paul G. Allen School of Computer Science & Engineering and Department of Statistics, University of Washington; name@uw.edu |
| Pseudocode | Yes | Algorithm 1 Catalyst with smoothing |
| Open Source Code | Yes | The code is publicly available on the authors' websites. |
| Open Datasets | Yes | We consider the CoNLL 2003 dataset with n = 14987 [63]. We consider the PASCAL VOC 2007 [13] dataset. |
| Dataset Splits | No | The paper mentions using a 'held-out set' and 'validation F1 score' for tuning, but does not provide specific details on the dataset splits (e.g., percentages or sample counts for train/validation/test). |
| Hardware Specification | No | No specific hardware details (such as GPU/CPU models, memory, or cloud instance types) used for running experiments were provided in the paper. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names with versions) needed to replicate the experiment. |
| Experiment Setup | Yes | BCFW requires no tuning, while SGD requires the tuning of γ0 and t0. The SVRG-based methods require the tuning of a fixed learning rate. Moreover, SVRG and SC-SVRG-const also require tuning the amount of smoothing µ. [...] A fixed budget Tinner = n is used as the stopping criterion in Algorithm 1. [...] We use the value κk = λ for SC-SVRG-adapt. All smooth optimization methods turned out to be robust to the choice of K for the top-K oracle (Fig. 3); we use K = 5 for named entity recognition and K = 10 for visual object localization. |
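
The Experiment Setup row quotes the per-method hyperparameters and the top-K oracle sizes as running prose. The sketch below restates that reported configuration as a small Python structure and illustrates one generic way a top-K oracle output can feed an entropy-smoothed max. The names `experiment_setup` and `smoothed_max_topk`, the dictionary layout, and the log-sum-exp form are illustrative assumptions, not the authors' released code or the paper's exact smoothing oracle.

```python
import math

# Hypothetical summary of the tuning setup reported in the paper's
# experiment section; names and layout are assumptions for illustration.
experiment_setup = {
    "BCFW":          {"tuned": []},                       # no tuning reported
    "SGD":           {"tuned": ["gamma_0", "t_0"]},       # step-size schedule
    "SVRG":          {"tuned": ["learning_rate", "mu"]},  # fixed LR + smoothing
    "SC-SVRG-const": {"tuned": ["learning_rate", "mu"]},
    "SC-SVRG-adapt": {"tuned": ["learning_rate"], "kappa_k": "lambda"},
    "algorithm_1":   {"inner_budget_T_inner": "n"},       # fixed inner-loop budget
    "top_K":         {"named_entity_recognition": 5, "visual_object_localization": 10},
}

def smoothed_max_topk(scores, mu, K):
    """Entropy-smoothed max computed from the K largest scores only.

    A generic log-sum-exp smoothing restricted to a top-K oracle's output;
    it shows the roles of K and mu but may differ in detail from the
    paper's smoothed inference oracles.
    """
    top = sorted(scores, reverse=True)[:K]
    m = top[0]  # subtract the max for numerical stability
    return m + mu * math.log(sum(math.exp((s - m) / mu) for s in top))
```

As quoted above, the smooth methods were reported to be robust to the choice of K, with K = 5 used for named entity recognition and K = 10 for visual object localization.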