A Smoother Way to Train Structured Prediction Models

Authors: Venkata Krishna Pillutla, Vincent Roulet, Sham M. Kakade, Zaid Harchaoui

NeurIPS 2018

Reproducibility Variable | Result | LLM Response

Research Type | Experimental
  "We present experimental results on two real-world problems, namely named entity recognition and visual object localization. The experimental results show that the proposed framework allows us to build upon efficient inference algorithms to develop large-scale optimization algorithms for structured prediction which can achieve competitive performance on the two real-world problems."

Researcher Affiliation | Academia
  "Krishna Pillutla, Vincent Roulet, Sham M. Kakade, Zaid Harchaoui. Paul G. Allen School of Computer Science & Engineering and Department of Statistics, University of Washington. name@uw.edu"

Pseudocode | Yes
  "Algorithm 1 Catalyst with smoothing"

Open Source Code | Yes
  "The code is publicly available on the authors' websites."

Open Datasets | Yes
  "We consider the CoNLL 2003 dataset with n = 14987 [63]. We consider the PASCAL VOC 2007 [13] dataset."

Dataset Splits | No
  The paper mentions using a "held-out set" and "validation F1 score" for tuning, but does not provide specific details on the dataset splits (e.g., percentages or sample counts for train/validation/test).

Hardware Specification | No
  No specific hardware details (such as GPU/CPU models, memory, or cloud instance types) used for running the experiments are provided in the paper.

Software Dependencies | No
  The paper does not provide ancillary software details with version numbers (e.g., library or solver names with versions) needed to replicate the experiments.

Experiment Setup | Yes
  "BCFW requires no tuning, while SGD requires the tuning of γ0 and t0. The SVRG-based methods require the tuning of a fixed learning rate. Moreover, SVRG and SC-SVRG-const also require tuning the amount of smoothing µ. [...] A fixed budget Tinner = n is used as the stopping criterion in Algorithm 1. [...] We use the value κk = λ for SC-SVRG-adapt. All smooth optimization methods turned out to be robust to the choice of K for the top-K oracle (Fig. 3); we use K = 5 for named entity recognition and K = 10 for visual object localization."
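The report above notes that the paper's pseudocode is "Algorithm 1 Catalyst with smoothing" and that its setup fixes an inner budget of Tinner = n. The following is a minimal, hypothetical sketch of that outer-loop structure on a scalar toy problem: the nonsmooth hinge is replaced by a smoothed surrogate, each outer step approximately solves a prox-regularized subproblem with n inner gradient steps, and the prox center is updated by Nesterov-style extrapolation. The toy data, the plain-gradient inner solver (standing in for the paper's SVRG-style solver), and the µ and β schedules are all assumptions for illustration, not the authors' actual implementation, which smooths the structured max-margin objective via a top-K inference oracle.

```python
import math

# Hypothetical toy instance: four scalar (x, y) pairs. The real method
# operates on structured outputs; this is only a shape-of-the-algorithm demo.
DATA = [(1.0, 1), (2.0, -1), (0.5, 1), (1.5, -1)]

def smoothed_grad(w, mu):
    """Gradient of the entropy-smoothed hinge, mean over DATA.

    mu * log(1 + exp((1 - y*x*w)/mu)) smooths max(0, 1 - y*x*w); its
    derivative w.r.t. w is sigmoid((1 - y*x*w)/mu) * (-y*x).
    """
    g = 0.0
    for x, y in DATA:
        s = (1.0 - y * x * w) / mu
        sig = 1.0 / (1.0 + math.exp(-s))
        g += sig * (-y * x)
    return g / len(DATA)

def catalyst_with_smoothing(w0, n_outer=20, kappa=1.0, lr=0.1):
    """Sketch of the Catalyst-with-smoothing outer loop.

    Outer step k approximately minimizes
        f_{mu_k}(w) + (kappa/2) * (w - z)^2
    using a fixed inner budget T_inner = n (mirroring the paper's
    stopping criterion), then extrapolates the prox center z.
    """
    w_prev = w = z = w0
    for k in range(1, n_outer + 1):
        mu = 1.0 / k                      # assumed decreasing smoothing schedule
        for _ in range(len(DATA)):        # T_inner = n inner steps
            grad = smoothed_grad(w, mu) + kappa * (w - z)
            w -= lr * grad
        beta = k / (k + 3.0)              # Nesterov-style extrapolation weight
        z = w + beta * (w - w_prev)
        w_prev = w
    return w
```

Running `catalyst_with_smoothing(0.0)` drives the (nonsmooth) hinge objective on the toy data below its value at the starting point; the decreasing µ schedule means later outer iterations work with a tighter approximation of the original nonsmooth loss.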