Dual Space Gradient Descent for Online Learning

Authors: Trung Le, Tu Nguyen, Vu Nguyen, Dinh Phung

NeurIPS 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We further provide convergence analysis and extensively conduct experiments on five real-world datasets to demonstrate the predictive performance and scalability of our proposed method in comparison with the state-of-the-art baselines." "In this section, we conduct comprehensive experiments to quantitatively evaluate the performance of our proposed Dual Space Gradient Descent (Dual SGD) on binary classification, multiclass classification and regression tasks under online settings."
Researcher Affiliation | Academia | "Trung Le, Tu Dinh Nguyen, Vu Nguyen, Dinh Phung, Centre for Pattern Recognition and Data Analytics, Deakin University, Australia"
Pseudocode | Yes | "Algorithm 1: The learning of Dual Space Gradient Descent." "Algorithm 2: k-merging Budget Maintenance Procedure." (A hedged sketch of how these two procedures could fit together is given after the table.)
Open Source Code | No | The paper mentions that baseline implementations are 'published as a part of LIBSVM, Budgeted SVM and LSOKL toolboxes.' However, there is no explicit statement or link providing the source code for their proposed Dual SGD method.
Open Datasets | Yes | "We use 5 datasets which are ijcnn1, cod-rna, poker, year, and airlines. The datasets were purposely selected with various sizes in order to clearly expose the differences among scalable capabilities of the models. ... These datasets can be downloaded from LIBSVM and UCI websites, except the airlines which was obtained from American Statistical Association (ASA)."
Dataset Splits | Yes | "For each dataset, we perform 10 runs on each algorithm with different random permutations of the training data samples. In each run, the model is trained in a single pass through the data. Its prediction result and time spent are then reported by taking the average together with the standard deviation over all runs. For comparison, we employ 11 state-of-the-art online kernel learning methods... Hyperparameters setting. There are a number of different hyperparameters for all methods. Each method requires a different set of hyperparameters, e.g., the regularization parameters (λ in Dual SGD), the learning rates (η in FOGD and NOGD), and the RBF kernel width (γ in all methods). Thus, for a fair comparison, these hyperparameters are specified using cross-validation on a subset of data. In particular, we further partition the training set into 80% for learning and 20% for validation." (See the evaluation-protocol sketch after the table.)
Hardware Specification | Yes | "We use a Windows machine with 3.46GHz Xeon processor and 96GB RAM to conduct our experiments."
Software Dependencies | No | The paper mentions using 'LIBSVM', 'Budgeted SVM', and 'LSOKL' toolboxes but does not specify any version numbers for these or any other software dependencies.
Experiment Setup | Yes | "Hyperparameters setting. There are a number of different hyperparameters for all methods. Each method requires a different set of hyperparameters, e.g., the regularization parameters (λ in Dual SGD), the learning rates (η in FOGD and NOGD), and the RBF kernel width (γ in all methods). ... The ranges are given as follows: C ∈ {2^-5, 2^-3, ..., 2^15}, λ ∈ {2^-4/N, 2^-2/N, ..., 2^16/N}, γ ∈ {2^-8, 2^-4, 2^-2, 2^0, 2^2, 2^4, 2^8}, and η ∈ {2^-4, 2^-3, ..., 2^-1, 2^1, 2^2, ..., 2^4}, where N is the number of data points. The budget size B, merging size k and random feature dimension D of Dual SGD are selected following the approach described in Section 3.2. For a good trade-off between classification performance and computational cost, we select B = 100 and D = 200 which achieves fairly comparable classification result and running time." (See the hyperparameter-selection sketch after the table.)
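
The paper's Algorithms 1 and 2 are not reproduced in this report. As a rough illustration only, the following Python sketch shows how a dual-space online learner with a k-merging budget step could be organised: a budgeted kernel expansion (the "dual" part) plus a random Fourier-feature component, with budget overflow handled by merging support vectors into the random-feature weights rather than discarding them. The class name, the hinge loss, the RBF kernel, the fixed learning rate, and the choice to merge the k oldest support vectors are all assumptions made for the sketch, not details taken from the paper.

```python
import numpy as np

class DualSGDSketch:
    """Hedged sketch (not the paper's algorithm): a budgeted kernel expansion
    plus a random Fourier-feature (RFF) component, trained online."""

    def __init__(self, dim, gamma=1.0, eta=0.1, lam=1e-4, B=100, D=200, k=10, seed=0):
        rng = np.random.RandomState(seed)
        self.gamma, self.eta, self.lam = gamma, eta, lam
        self.B, self.k = B, k                      # budget size and merge size
        # RFF parameters approximating the RBF kernel exp(-gamma * ||x - x'||^2)
        self.omega = rng.normal(scale=np.sqrt(2 * gamma), size=(D, dim))
        self.b = rng.uniform(0, 2 * np.pi, size=D)
        self.w = np.zeros(D)                       # weights in the random-feature space
        self.sv_x, self.sv_alpha = [], []          # budgeted support set (dual space)

    def _z(self, x):
        # Random Fourier features z(x) with E[z(x) . z(x')] ~ k(x, x')
        D = len(self.b)
        return np.sqrt(2.0 / D) * np.cos(self.omega @ x + self.b)

    def _kernel(self, x1, x2):
        return np.exp(-self.gamma * np.linalg.norm(x1 - x2) ** 2)

    def decision(self, x):
        f = self.w @ self._z(x)
        for sv, a in zip(self.sv_x, self.sv_alpha):
            f += a * self._kernel(sv, x)
        return f

    def partial_fit(self, x, y):
        """One online step with hinge loss; y in {-1, +1}."""
        f = self.decision(x)
        # Shrink both model parts (effect of the L2 regulariser lambda)
        self.w *= (1.0 - self.eta * self.lam)
        self.sv_alpha = [a * (1.0 - self.eta * self.lam) for a in self.sv_alpha]
        if y * f < 1.0:                            # margin violation -> new support vector
            self.sv_x.append(np.asarray(x, dtype=float))
            self.sv_alpha.append(self.eta * y)
        if len(self.sv_x) > self.B:                # budget exceeded -> k-merging step
            self._merge_k_oldest()

    def _merge_k_oldest(self):
        # Assumption: merge the k oldest support vectors by moving their
        # contribution into the random-feature space instead of discarding it.
        for _ in range(min(self.k, len(self.sv_x))):
            x_old = self.sv_x.pop(0)
            a_old = self.sv_alpha.pop(0)
            self.w += a_old * self._z(x_old)
```

The intended design point, as described in the paper's abstract, is that information is kept in two spaces at once: exact kernel evaluations for the vectors still inside the budget, and an approximate random-feature representation for those that have been merged out.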
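The single-pass evaluation protocol quoted in the "Dataset Splits" row (10 runs over random permutations of the training stream, reporting the mean and standard deviation of the result and the time spent) can be expressed in a few lines. The sketch below reuses the DualSGDSketch class defined above; the file name ijcnn1.txt, the use of scikit-learn's LIBSVM-format loader, the mistake-rate metric, and the particular hyperparameter values are illustrative assumptions, not settings confirmed by the paper.

```python
import time
import numpy as np
from sklearn.datasets import load_svmlight_file

# Hypothetical path: ijcnn1 in LIBSVM format, as distributed on the LIBSVM website.
X, y = load_svmlight_file("ijcnn1.txt")
X = X.toarray()

mistake_rates, run_times = [], []
for run in range(10):                                   # 10 random permutations
    rng = np.random.RandomState(run)
    order = rng.permutation(len(y))
    model = DualSGDSketch(dim=X.shape[1], gamma=2.0 ** -4, eta=2.0 ** -2,
                          lam=2.0 ** 2 / len(y), B=100, D=200, k=10)
    mistakes, start = 0, time.time()
    for i in order:                                     # single pass over the stream
        x_i, y_i = X[i], y[i]
        mistakes += int(np.sign(model.decision(x_i)) != y_i)   # predict, then update
        model.partial_fit(x_i, y_i)
    mistake_rates.append(mistakes / len(y))
    run_times.append(time.time() - start)

print(f"mistake rate: {np.mean(mistake_rates):.4f} +/- {np.std(mistake_rates):.4f}")
print(f"time (s):     {np.mean(run_times):.2f} +/- {np.std(run_times):.2f}")
```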
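Finally, the hyperparameter grids quoted in the "Experiment Setup" row and the 80%/20% learning/validation partition translate directly into a small model-selection loop. The sketch below reuses X, y and DualSGDSketch from the blocks above; the random split, the fixed learning rate, and the decision to search only over λ and γ (the parameters the quote attributes to Dual SGD) are assumptions made for illustration.

```python
import numpy as np

N = len(y)  # number of training points, using X, y from the previous snippet

# Hyperparameter grids quoted in the paper (all values are powers of two).
grid = {
    "C":      [2.0 ** p for p in range(-5, 16, 2)],       # 2^-5, 2^-3, ..., 2^15
    "lambda": [2.0 ** p / N for p in range(-4, 17, 2)],    # 2^-4/N, ..., 2^16/N
    "gamma":  [2.0 ** p for p in (-8, -4, -2, 0, 2, 4, 8)],
    "eta":    [2.0 ** p for p in (-4, -3, -2, -1, 1, 2, 3, 4)],
}

# 80% of the training set for learning, 20% for validation (assumed random split).
rng = np.random.RandomState(0)
perm = rng.permutation(N)
learn_idx, valid_idx = perm[: int(0.8 * N)], perm[int(0.8 * N):]

best, best_err = None, np.inf
for lam in grid["lambda"]:
    for gamma in grid["gamma"]:
        model = DualSGDSketch(dim=X.shape[1], gamma=gamma, eta=0.25,
                              lam=lam, B=100, D=200)
        for i in learn_idx:                       # single pass over the learning portion
            model.partial_fit(X[i], y[i])
        err = np.mean([np.sign(model.decision(X[i])) != y[i] for i in valid_idx])
        if err < best_err:
            best, best_err = (lam, gamma), err

print("selected (lambda, gamma):", best, "validation error:", best_err)
```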