Structure Regularization for Structured Prediction

Authors: Xu Sun

NeurIPS 2014

Reproducibility assessment (variable, result, and supporting excerpt from the LLM response):

Research Type: Experimental
"We show both theoretically and empirically that structure regularization can effectively control overfitting risk and lead to better accuracy. Experiments on well-known tasks demonstrate that our method can easily beat the benchmark systems on those highly-competitive tasks, achieving record-breaking accuracies yet with substantially faster training speed."

Researcher Affiliation: Academia
"Xu Sun, MOE Key Laboratory of Computational Linguistics, Peking University; School of Electronics Engineering and Computer Science, Peking University; xusun@pku.edu.cn"

Pseudocode: Yes
"Algorithm 1: Training with structure regularization"

Open Source Code: Yes
"See the code at http://klcl.pku.edu.cn/member/sunxu/code.htm"

Open Datasets: Yes
"Part-of-Speech Tagging (POS-Tagging). We use the standard benchmark dataset in prior work [3], with 38,219 training samples and 5,462 test samples. ... Biomedical Named Entity Recognition (Bio-NER). This task is from the BioNLP-2004 shared task [19]. There are 17,484 training samples and 3,856 test samples. ... Word Segmentation (Word-Seg). We use the MSR data provided by the SIGHAN-2004 contest [4]. There are 86,918 training samples and 3,985 test samples. ... Sensor-based Human Activity Recognition (Act-Recog). ... with the data extracted from the Bao04 activity recognition dataset [15]. ... There are 16,000 training samples and 4,000 test samples."

Dataset Splits: Yes
"For Weight Reg, the L2 regularization strengths (i.e., λ/2 in Eq.(8)) are tuned among values 0.1, 0.5, 1, 2, 5, and are determined on the development data (POS-Tagging) or simply via 4-fold cross validation on the training set (Bio-NER, Word-Seg, and Act-Recog)."

Hardware Specification: No
The paper does not describe the hardware used for the experiments (e.g., CPU or GPU models, memory, or cloud instances).

Software Dependencies: No
The paper mentions software components such as CRFs, structured perceptrons, and SGD, but it does not list version numbers for any programming languages, libraries, or frameworks used in the implementation or experiments.

Experiment Setup: Yes
"For Weight Reg, the L2 regularization strengths (i.e., λ/2 in Eq.(8)) are tuned among values 0.1, 0.5, 1, 2, 5, and are determined on the development data (POS-Tagging) or simply via 4-fold cross validation on the training set (Bio-NER, Word-Seg, and Act-Recog). With this automatic tuning for Weight Reg, we set 2, 5, 1 and 5 for POS-Tagging, Bio-NER, Word-Seg, and Act-Recog tasks, respectively. ... in experiments we use the SGD with decaying learning rate."
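The core idea behind the paper's Algorithm 1 ("Training with structure regularization") is to decompose each structured training sample into shorter mini-samples before training. A minimal Python sketch of that decomposition step, assuming random contiguous splits; the function name, interface, and split strategy are illustrative, not the paper's implementation:

```python
import random

def structure_regularize(sample, alpha, rng=random):
    """Randomly split one structured training sample (e.g., a list of
    tagged tokens) into `alpha` shorter contiguous mini-samples.
    alpha acts as the structure regularization strength: alpha = 1
    leaves the sample intact; larger alpha yields simpler structures."""
    n = len(sample)
    k = min(alpha, n)  # cannot produce more chunks than tokens
    if k <= 1:
        return [list(sample)]
    cuts = sorted(rng.sample(range(1, n), k - 1))  # k - 1 distinct cut points
    bounds = [0] + cuts + [n]
    return [sample[bounds[i]:bounds[i + 1]] for i in range(k)]
```

Training then proceeds on the resulting mini-samples in place of the full sequences, which is what the paper credits for both the reduced overfitting risk and the faster training.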
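The tuning procedure quoted above (selecting the L2 strength among 0.1, 0.5, 1, 2, 5 via 4-fold cross validation) can be sketched as follows. The `train_and_eval` callable is a hypothetical placeholder for training a model with strength `lam` and scoring it on held-out data:

```python
def tune_l2_by_cv(samples, train_and_eval, candidates=(0.1, 0.5, 1, 2, 5), k=4):
    """Return the candidate L2 strength with the best mean held-out
    score over k folds.  `train_and_eval(train, dev, lam)` must return
    an accuracy-like score (higher is better)."""
    best_lam, best_score = None, float("-inf")
    for lam in candidates:
        scores = []
        for i in range(k):
            dev = samples[i::k]  # every k-th sample as the held-out fold
            train = [s for j, s in enumerate(samples) if j % k != i]
            scores.append(train_and_eval(train, dev, lam))
        mean = sum(scores) / k
        if mean > best_score:
            best_lam, best_score = lam, mean
    return best_lam
```

This mirrors the paper's automatic tuning, which selected strengths 2, 5, 1, and 5 for the four tasks.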
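The paper states only that "SGD with decaying learning rate" is used, without naming a schedule. A common 1/t-style decay, shown here as a hedged sketch rather than the paper's actual schedule:

```python
def sgd_with_decay(grad, w, eta0, decay, steps):
    """Plain SGD with the decaying step size eta_t = eta0 / (1 + decay * t).
    This particular schedule is one common choice; the paper does not
    specify which decay it uses."""
    for t in range(steps):
        eta = eta0 / (1.0 + decay * t)  # learning rate shrinks over time
        w = w - eta * grad(w)
    return w
```

For example, minimizing the quadratic f(w) = (w - 3)^2 from w = 0 converges to w ≈ 3 under this schedule.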