Learning SMaLL Predictors
Authors: Vikas Garg, Ofer Dekel, Lin Xiao
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We supplement the theoretical foundations of our work with an extensive empirical evaluation. |
| Researcher Affiliation | Collaboration | Vikas K. Garg CSAIL, MIT vgarg@csail.mit.edu Ofer Dekel Microsoft Research oferd@microsoft.com Lin Xiao Microsoft Research lin.xiao@microsoft.com |
| Pseudocode | Yes | Algorithm 1: Customized Mirror-Prox algorithm for solving the saddle-point problem (13); Algorithm 2: (Proj_E) Projection onto the set E = {μ ∈ ℝ^n : μ_i ∈ [0, 1], ‖μ‖_1 ≤ k} |
| Open Source Code | No | The paper does not provide any explicit statement or link for open-source code for the described methodology. |
| Open Datasets | Yes | We experimented with OpenML data for two main reasons: (a) it contains many preprocessed binary datasets, and (b) the datasets come from diverse domains. |
| Dataset Splits | Yes | Since the datasets do not specify separate train, validation, and test sets, we measure test accuracy by averaging over five random train-test splits. ... We determined hyperparameters by 5-fold cross-validation. |
| Hardware Specification | No | The paper does not specify any particular hardware used for running the experiments. |
| Software Dependencies | No | The paper mentions various algorithms (LSVM, RF, AB, LR, DT, kNN, RSVM, GB, GP, SMaLL, ProtoNN, Bonsai) but does not provide specific version numbers for software libraries or dependencies. |
| Experiment Setup | Yes | We determined hyperparameters by 5-fold cross-validation. The coefficient of the error term C in LSVM and ℓ2-regularized LR was selected from {0.1, 1, 10, 100}. In the case of RSVM, we also added 0.01 to the search set for C, and chose the best kernel among a radial basis function (RBF), polynomials of degree 2 and 3, and sigmoid. For the ensemble methods (RF, AB, GB), the number of base predictors was selected from the set {10, 20, 50}. The maximum number of features for RF estimators was optimized over the square-root and log selection criteria. We also found the best validation parameters for DT (gini or entropy for attribute selection), kNN (1, 3, 5, or 7 neighbors), and GP (RBF kernel scaled by a coefficient in the set {0.1, 1.0, 5}, and a dot-product kernel with inhomogeneity parameter set to 1). Finally, for our method SMaLL, we fixed one parameter at 0.1 and a per-iteration parameter at 0.01, and searched over β_t = β ∈ {0.01, 0.001}. |
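The projection set quoted in the Pseudocode row is E = {μ ∈ ℝ^n : μ_i ∈ [0, 1], ‖μ‖_1 ≤ k}, i.e. a box intersected with an ℓ1 (sum) budget. A minimal NumPy sketch of Euclidean projection onto such a set, using bisection on the Lagrange multiplier of the sum constraint, is shown below; this is an illustrative reconstruction under those assumptions, not the paper's actual Algorithm 2.

```python
import numpy as np

def proj_E(v, k, tol=1e-9):
    """Euclidean projection of v onto E = {mu : mu_i in [0,1], sum(mu) <= k}.

    Sketch only. KKT conditions give mu_i = clip(v_i - theta, 0, 1) for a
    multiplier theta >= 0 on the sum constraint; we locate theta by bisection.
    """
    x = np.clip(v, 0.0, 1.0)
    if x.sum() <= k:
        return x  # sum constraint inactive: plain box clipping suffices
    lo, hi = 0.0, float(np.max(v))  # at theta = max(v) the clipped sum is 0 <= k
    while hi - lo > tol:
        theta = 0.5 * (lo + hi)
        if np.clip(v - theta, 0.0, 1.0).sum() > k:
            lo = theta  # shrink entries further
        else:
            hi = theta
    return np.clip(v - hi, 0.0, 1.0)

# Example: the budget k = 1 forces the clipped sum 2.2 down to 1.
print(proj_E(np.array([0.9, 0.8, 0.5, -0.2]), 1.0))
```

Bisection is used here for simplicity; a sort-based search over the breakpoints of the piecewise-linear map θ ↦ Σ clip(v_i − θ, 0, 1) would give an exact solution in O(n log n).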
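The Experiment Setup row describes grid-searching baseline hyperparameters by 5-fold cross-validation, e.g. C ∈ {0.1, 1, 10, 100} for the ℓ2-regularized LR baseline. A hedged sketch of that selection procedure using scikit-learn on synthetic data (the dataset, library, and model choice here are assumptions, not the authors' pipeline):

```python
# Sketch of 5-fold CV hyperparameter selection for an l2-regularized
# logistic regression, mirroring the C grid {0.1, 1, 10, 100} from the
# reported setup. Synthetic data stands in for the OpenML datasets.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_classification(n_samples=300, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

grid = GridSearchCV(
    LogisticRegression(penalty="l2", max_iter=1000),
    param_grid={"C": [0.1, 1, 10, 100]},
    cv=5,  # 5-fold cross-validation, as in the reported setup
)
grid.fit(X_tr, y_tr)
print("best C:", grid.best_params_["C"])
print("test accuracy:", round(grid.score(X_te, y_te), 3))
```

The same pattern extends to the other baselines (e.g. `param_grid={"n_estimators": [10, 20, 50]}` for the ensemble methods), and the paper additionally averages test accuracy over five random train-test splits.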