Scalable Semi-Supervised SVM via Triply Stochastic Gradients
Authors: Xiang Geng, Bin Gu, Xiang Li, Wanli Shi, Guansheng Zheng, Heng Huang
IJCAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results on a variety of datasets demonstrate that TSGS3VM is much more efficient and scalable than existing S3VM algorithms. |
| Researcher Affiliation | Collaboration | 1School of Computer & Software, Nanjing University of Information Science & Technology, P.R.China 2JD Finance America Corporation 3Department of Electrical & Computer Engineering, University of Pittsburgh, USA 4Computer Science Department, University of Western Ontario, Canada |
| Pseudocode | Yes | Algorithm 1 TSGS3VM Train [...] Algorithm 2 TSGS3VM Predict |
| Open Source Code | No | The paper states, "We implemented the TSGS3VM algorithm in MATLAB." However, it does not provide a link to the code or explicitly state that the code is publicly available. |
| Open Datasets | Yes | Table 3 summarizes the 8 datasets used in our experiments. They are from LIBSVM3 and UCI4 repositories. [Footnote 3: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/] [Footnote 4: http://archive.ics.uci.edu/ml/datasets.html] |
| Dataset Splits | Yes | 5-fold cross-validation was used to determine the optimal settings (by test error) of the model parameters (the regularization factor C and the Gaussian kernel parameter σ); the parameter C* was set to C·nl/nu. Specifically, the unlabeled dataset was divided evenly into 5 subsets, where one of the subsets and all the labeled data are used for training, while the other 4 subsets are used for testing. |
| Hardware Specification | Yes | We perform experiments on an Intel Xeon E5-2696 machine with 48GB RAM. |
| Software Dependencies | No | The paper states, "We implemented the TSGS3VM algorithm in MATLAB." However, it does not specify a version number for MATLAB or any other software dependencies, which is required for reproducibility. |
| Experiment Setup | Yes | The Gaussian RBF kernel k(x, x') = exp(-σ‖x - x'‖²) and the loss function ℓ(r) = max{0, 1 - \|r\|} were used for all algorithms. 5-fold cross-validation was used to determine the optimal settings (by test error) of the model parameters (the regularization factor C and the Gaussian kernel parameter σ); the parameter C* was set to C·nl/nu. Parameter search was done on a 7 × 7 coarse grid linearly spaced in the region {(log10 C, log10 σ) \| -3 ≤ log10 C ≤ 3, -3 ≤ log10 σ ≤ 3} for all methods. For TSGS3VM, the step size γ equals 1/η, where 0 ≤ log10 η ≤ 3 is searched after C and σ. Besides, the number of random features is set to n and the batch size is set to 256. The test error was obtained by using these optimal model parameters for all the methods. To achieve a comparable accuracy to our TSGS3VM, we set the minimum budget sizes Bl and Bu as 100 and 0.2·nu respectively for BGS3VM. We stop TSGS3VM and BGS3VM after one pass over the entire dataset. We stop FRS3VM after 10 passes over the entire dataset to achieve a comparable accuracy. |
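
The experiment setup pairs a Gaussian RBF kernel with a fixed number of random features, which points to the random Fourier feature approximation that triply stochastic gradient methods build on. Below is a minimal sketch of that approximation for the paper's kernel k(x, x') = exp(-σ‖x - x'‖²); the function name `make_rff` and the feature count `D` are illustrative choices, not identifiers from the paper.

```python
import numpy as np

def make_rff(d, D, sigma, rng):
    """Draw D random Fourier features approximating exp(-sigma * ||x - x'||^2)."""
    # For k(x, y) = exp(-sigma * ||x - y||^2), the spectral density is
    # Gaussian with covariance 2 * sigma * I, so frequencies are drawn as:
    W = rng.normal(0.0, np.sqrt(2.0 * sigma), size=(D, d))
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)

    def z(X):
        # Feature map z(x) = sqrt(2/D) * cos(Wx + b), so z(x)^T z(y) ~ k(x, y).
        return np.sqrt(2.0 / D) * np.cos(X @ W.T + b)

    return z

rng = np.random.default_rng(0)
x = rng.normal(size=3)
y = rng.normal(size=3)
sigma = 0.5

exact = np.exp(-sigma * np.sum((x - y) ** 2))          # true kernel value
z = make_rff(d=3, D=20000, sigma=sigma, rng=rng)
approx = (z(x[None]) @ z(y[None]).T).item()            # random-feature estimate
print(exact, approx)  # the two values converge as D grows
```

Replacing the kernel with this explicit feature map is what lets such methods take stochastic gradients over random features (in addition to labeled and unlabeled samples), avoiding the kernel matrix entirely.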