Improving the Generalization Performance of Multi-class SVM via Angular Regularization
Authors: Jianxin Li, Haoyi Zhou, Pengtao Xie, Yingchun Zhang
IJCAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On various datasets, we demonstrate the efficacy of the regularizer in reducing overfitting. Table 1 shows the classification accuracy of ℓ2-regularized MSVM on several datasets, where the gap between training and testing accuracy is still substantial. In this paper, we study a new type of regularizer that encourages the coefficient vectors (equivalently, the hyperplanes parameterized by them) in MSVM to have large angles, for the purpose of controlling overfitting. Fig. 1 illustrates the idea. Section 4 (Experiments): In this section, we present experimental results. We evaluated our method on ten datasets. Table 2 summarizes their statistics. Table 3 shows the classification results on six datasets. |
| Researcher Affiliation | Collaboration | Jianxin Li1, Haoyi Zhou1, Pengtao Xie2,3, Yingchun Zhang1 1 School of Computer Science and Engineering, Beihang University 2 Machine Learning Department, Carnegie Mellon University 3 Petuum Inc, USA {lijx, zhouhy, zhangyc}@act.buaa.edu.cn, pengtaox@cs.cmu.edu |
| Pseudocode | No | The paper describes algorithms and methods (e.g., stochastic sub-gradient method) but does not provide a formal pseudocode or algorithm block. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing the source code or a link to a code repository. |
| Open Datasets | Yes | Table 2: Statistics of Datasets (dataset: #classes / #train / #test / #features). Yale B: 38 / 1500 / 914 / 1024; ImageNet-50: 50 / 30K / 10K / 128; Covtype: 7 / 100K / 40K / 54; Shuttle: 7 / 30450 / 14500 / 9; New-thyroid: 3 / 108 / 107 / 5; Yeast: 10 / 1134 / 350 / 8; Dermatology: 6 / 323 / 35 / 33; Page-Blocks: 5 / 4924 / 548 / 10; Wine-Quality-Red: 6 / 1439 / 160 / 11; Zoo: 7 / 89 / 12 / 16. |
| Dataset Splits | Yes | The regularization parameters λ and β are tuned in the range [2−20, 2−19, . . . , 220] via 5-fold cross validations. All the experiments are conducted over 10 random train/test splits and the results are averaged over the 10 runs. |
| Hardware Specification | Yes | The algorithms were implemented in MATLAB and the experiments were run on a Linux machine with a 2.00GHz Xeon CPU and 256G memory. |
| Software Dependencies | No | The paper mentions "implemented in MATLAB" but does not specify a version number for MATLAB or any other libraries or software dependencies with their versions. |
| Experiment Setup | Yes | The regularization parameters λ and β are tuned in the range [2−20, 2−19, . . . , 220] via 5-fold cross validations. In the stochastic sub-gradient descent algorithm, the mini-batch size and number of epochs are set to 20 and 50 respectively. The learning rate is set according to ADADELTA [Zeiler, 2012] in all methods. |
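The paper's exact regularizer is not spelled out in the excerpts above, and the paper provides no pseudocode. As a minimal numpy sketch of the stated idea (penalizing small pairwise angles between the class hyperplanes of a multi-class SVM), one plausible form is the sum of squared pairwise cosine similarities between coefficient vectors; this is a hypothetical illustration, not the paper's formulation:

```python
import numpy as np

def angular_regularizer(W):
    """Penalize small pairwise angles between class hyperplanes.

    W: (K, d) matrix whose rows are the coefficient vectors of a
    K-class linear SVM. Returns the sum of squared cosine
    similarities over all class pairs, which shrinks toward zero
    as the hyperplanes approach mutual orthogonality.
    """
    # Normalize each row to unit length (eps avoids division by zero).
    norms = np.linalg.norm(W, axis=1, keepdims=True) + 1e-12
    U = W / norms
    G = U @ U.T                       # pairwise cosine similarities
    off_diag = G - np.eye(W.shape[0]) # drop self-similarity (cos = 1)
    return np.sum(off_diag ** 2) / 2  # each unordered pair counted once

# Orthogonal hyperplanes incur (near-)zero penalty; parallel ones do not.
print(angular_regularizer(np.eye(3)))       # prints a value near 0
print(angular_regularizer(np.ones((2, 3)))) # prints 1.0 (cos = 1 pair)
```

In training, such a term would be added to the hinge loss alongside the ℓ2 penalty and weighted by a coefficient (the paper's β plays this kind of role).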
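The reported tuning protocol (λ and β searched over [2⁻²⁰, 2⁻¹⁹, …, 2²⁰] via 5-fold cross validation) can be sketched with plain numpy; `kfold_indices` is a hypothetical helper, not code from the paper:

```python
import numpy as np

# The paper's search range for lambda and beta: powers of two
# from 2^-20 up to 2^20 inclusive (41 candidate values).
grid = [2.0 ** k for k in range(-20, 21)]
assert len(grid) == 41

def kfold_indices(n, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross validation."""
    idx = np.random.RandomState(seed).permutation(n)
    folds = np.array_split(idx, k)
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, val

# Example: 5 folds over 100 samples; every sample serves as a
# validation point exactly once across the folds.
counts = np.zeros(100, dtype=int)
for tr, va in kfold_indices(100):
    counts[va] += 1
print(counts.sum())  # 100
```

For each (λ, β) pair in the grid, one would average validation accuracy over the 5 folds and keep the best pair, then retrain on the full training split; the paper additionally averages final results over 10 random train/test splits.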