Improving the Generalization Performance of Multi-class SVM via Angular Regularization

Authors: Jianxin Li, Haoyi Zhou, Pengtao Xie, Yingchun Zhang

IJCAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | On various datasets, we demonstrate the efficacy of the regularizer in reducing overfitting. Table 1 shows the classification accuracy of ℓ2-regularized MSVM on several datasets, where the gap between training and testing accuracy is still substantial. In this paper, we study a new type of regularizer that encourages the coefficient vectors (equivalently, the hyperplanes parameterized by them) in MSVM to have large angles, in order to control overfitting. Fig. 1 illustrates the idea. Section 4, Experiments: In this section, we present experimental results. We evaluated our method on ten datasets. Table 2 summarizes their statistics. Table 3 shows the classification results on six datasets. (A code sketch of such an angular penalty appears after this table.)
Researcher Affiliation | Collaboration | Jianxin Li (1), Haoyi Zhou (1), Pengtao Xie (2,3), Yingchun Zhang (1). Affiliations: (1) School of Computer Science and Engineering, Beihang University; (2) Machine Learning Department, Carnegie Mellon University; (3) Petuum Inc., USA. Emails: {lijx, zhouhy, zhangyc}@act.buaa.edu.cn, pengtaox@cs.cmu.edu
Pseudocode | No | The paper describes algorithms and methods (e.g., a stochastic sub-gradient method) but does not provide a formal pseudocode or algorithm block.
Open Source Code | No | The paper does not provide an explicit statement about releasing the source code or a link to a code repository.
Open Datasets | Yes | Table 2: Statistics of Datasets.
    Dataset           #Classes  #Train  #Test  #Features
    Yale B            38        1500    914    1024
    ImageNet-50       50        30K     10K    128
    Covtype           7         100K    40K    54
    Shuttle           7         30450   14500  9
    New-thyroid       3         108     107    5
    Yeast             10        1134    350    8
    Dermatology       6         323     35     33
    Page-Blocks       5         4924    548    10
    Wine-Quality-Red  6         1439    160    11
    Zoo               7         89      12     16
Dataset Splits | Yes | The regularization parameters λ and β are tuned in the range [2^{-20}, 2^{-19}, ..., 2^{20}] via 5-fold cross-validation. All the experiments are conducted over 10 random train/test splits, and the results are averaged over the 10 runs. (A cross-validation sketch appears after this table.)
Hardware Specification | Yes | The algorithms were implemented in MATLAB and the experiments were run on a Linux machine with a 2.00 GHz Xeon CPU and 256 GB of memory.
Software Dependencies | No | The paper mentions "implemented in MATLAB" but does not specify a version number for MATLAB or any other libraries or software dependencies with their versions.
Experiment Setup | Yes | The regularization parameters λ and β are tuned in the range [2^{-20}, 2^{-19}, ..., 2^{20}] via 5-fold cross-validation. In the stochastic sub-gradient descent algorithm, the mini-batch size and number of epochs are set to 20 and 50, respectively. The learning rate is set according to ADADELTA [Zeiler, 2012] in all methods. (A training-loop sketch appears after this table.)
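
To make the method description in the Research Type row concrete, here is a minimal sketch of an angle-promoting penalty for MSVM, assuming the regularizer penalizes squared pairwise cosine similarity between the class hyperplanes. The function name and exact functional form are illustrative assumptions, not the paper's definition (the paper's regularizer is not reproduced here).

import numpy as np

def angular_regularizer(W, eps=1e-12):
    """Illustrative angle-promoting penalty (not the paper's exact form).

    W is a (K, d) matrix whose rows are the K class hyperplane coefficient
    vectors; the penalty is small when the pairwise angles are large.
    """
    # Normalize rows so that dot products become cosines of pairwise angles.
    U = W / np.maximum(np.linalg.norm(W, axis=1, keepdims=True), eps)
    C = U @ U.T  # (K, K) pairwise cosine similarities
    K = W.shape[0]
    off_diag = C[~np.eye(K, dtype=bool)]
    # Each unordered pair appears twice in off_diag, hence the factor 0.5.
    return 0.5 * np.sum(off_diag ** 2)

# Example: penalty for 7 random hyperplanes over 54 features (Covtype-sized).
# print(angular_regularizer(np.random.randn(7, 54)))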
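
The tuning protocol in the Dataset Splits row can be sketched as follows, using scikit-learn's GridSearchCV with a plain LinearSVC as a stand-in for the paper's MSVM. The 80/20 split proportion and the use of C as the searched parameter are assumptions; the paper tunes its own λ and β inside the angular-regularized objective.

import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import LinearSVC

# Search over {2^-20, ..., 2^20}, matching the quoted range.
param_grid = {"C": [2.0 ** p for p in range(-20, 21)]}

def tune_and_evaluate(X, y, seed):
    # One of the 10 random train/test splits (split proportion assumed).
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, random_state=seed)
    search = GridSearchCV(LinearSVC(), param_grid, cv=5)  # 5-fold CV
    search.fit(X_tr, y_tr)
    return search.score(X_te, y_te)

# Average over 10 random splits, as described in the paper:
# accs = [tune_and_evaluate(X, y, s) for s in range(10)]
# print(np.mean(accs))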
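
The Experiment Setup row describes a stochastic sub-gradient algorithm with ADADELTA learning rates. Below is a hedged sketch assuming a Crammer-Singer multi-class hinge loss with ℓ2 regularization; the mini-batch size (20), epoch count (50), and ADADELTA update follow the quoted setup, while the loss form, the λ value, and all names are assumptions. The paper's angular penalty would contribute an additional sub-gradient term not shown here.

import numpy as np

def adadelta_update(g, state, rho=0.95, eps=1e-6):
    """One ADADELTA step [Zeiler, 2012]: no global learning rate needed."""
    state["Eg2"] = rho * state["Eg2"] + (1 - rho) * g * g
    delta = -np.sqrt(state["Edx2"] + eps) / np.sqrt(state["Eg2"] + eps) * g
    state["Edx2"] = rho * state["Edx2"] + (1 - rho) * delta * delta
    return delta

def train_msvm(X, y, n_classes, lam=1e-3, epochs=50, batch=20):
    """Assumed mini-batch sub-gradient loop for a multi-class hinge loss."""
    n, d = X.shape
    W = np.zeros((n_classes, d))
    state = {"Eg2": np.zeros_like(W), "Edx2": np.zeros_like(W)}
    rng = np.random.default_rng(0)
    for _ in range(epochs):
        for idx in np.array_split(rng.permutation(n), max(1, n // batch)):
            Xb, yb = X[idx], y[idx]
            scores = Xb @ W.T  # (b, K) class scores
            margins = scores - scores[np.arange(len(yb)), yb][:, None] + 1.0
            margins[np.arange(len(yb)), yb] = 0.0
            r = margins.argmax(axis=1)  # most-violating class per example
            G = lam * W  # sub-gradient of the l2 term
            for i in np.flatnonzero(margins[np.arange(len(yb)), r] > 0):
                G[r[i]] += Xb[i] / len(yb)   # hinge sub-gradient
                G[yb[i]] -= Xb[i] / len(yb)
            W += adadelta_update(G, state)
    return W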