High-Dimensional Optimization in Adaptive Random Subspaces
Authors: Jonathan Lacotte, Mert Pilanci, Marco Pavone
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results show that the proposed approach offers significant speed ups in machine learning problems including logistic regression, kernel classification with random convolution layers and shallow neural networks with rectified linear units. ... 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada. ... We evaluate Algorithm 1 on MNIST and CIFAR10. First, we aim to show that the sketching dimension can be considerably smaller than the original dimension while retaining (almost) the same test classification accuracy. Second, we aim to get significant speed-ups in achieving a high-accuracy classifier. |
| Researcher Affiliation | Academia | Jonathan Lacotte, Department of Electrical Engineering, Stanford University (lacotte@stanford.edu); Mert Pilanci, Department of Electrical Engineering, Stanford University; Marco Pavone, Department of Aeronautics & Astronautics, Stanford University |
| Pseudocode | Yes | Algorithm 1: Generic algorithm for adaptive sketching. ... Algorithm 2: Iterative adaptive sketching |
| Open Source Code | No | The paper does not provide any explicit statement or link to its open-source code for the methodology. |
| Open Datasets | Yes | We evaluate Algorithm 1 on MNIST and CIFAR10. ... For both datasets, we use 50000 training and 10000 testing images. |
| Dataset Splits | No | The paper specifies 50000 training and 10000 testing images but does not mention a separate validation split or the methodology for creating one. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'scikit-learn: Machine learning in Python.' [18] as a citation, but does not specify software dependencies with version numbers for its own implementation. |
| Experiment Setup | Yes | To solve the primal program (1), we use two standard algorithms, stochastic gradient descent (SGD) with (best) fixed step size and stochastic variance reduction gradient (SVRG) [16] with (best) fixed step size and frequency update of the gradient correction. To solve the adaptive sketched program (2), we use SGD, SVRG and the sub-sampled Newton method [4, 10], which we refer to as Sketch-SGD, Sketch-SVRG and Sketch-Newton. ... For MNIST and CIFAR10, we choose D = 10000 and γ = 0.02, and D = 60000 and γ = 0.002, respectively ... using a one-vs-all procedure. ... for values of λ ∈ {10⁻⁴, 5·10⁻⁵, 10⁻⁵, 5·10⁻⁶} and sketching dimensions m ∈ {64, 128, 256, 512, 1024}. |
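
The Pseudocode row above refers to Algorithm 1 (generic adaptive sketching) and Algorithm 2 (iterative adaptive sketching). As a rough illustration of the adaptive-sketching idea, the minimal Python sketch below restricts the solution to the low-dimensional subspace spanned by (SA)ᵀ, where S is a random m × n sketching matrix. The ridge-regularized least-squares objective, the Gaussian sketch, and all names here are assumptions for illustration, not the paper's code or its exact algorithm.

```python
import numpy as np

def adaptive_sketch_ridge(A, b, lam, m, seed=0):
    """Illustrative adaptive-sketching solver for ridge regression.

    Approximately solves min_x 0.5*||A x - b||^2 + 0.5*lam*||x||^2 by
    restricting x to the subspace spanned by (S A)^T, where S is an
    m x n Gaussian sketching matrix (an assumed choice for this sketch).
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    S = rng.standard_normal((m, n)) / np.sqrt(m)   # random sketching matrix
    B = (S @ A).T                                  # d x m basis, so x = B @ alpha
    AB = A @ B                                     # n x m projected data
    # Normal equations of the m-dimensional reduced problem (m << d).
    G = AB.T @ AB + lam * (B.T @ B)
    alpha = np.linalg.solve(G, AB.T @ b)
    return B @ alpha                               # lift back to R^d

# Usage: compare the subspace solution with the full ridge solution.
rng = np.random.default_rng(1)
A = rng.standard_normal((500, 2000))
b = rng.standard_normal(500)
x_sketch = adaptive_sketch_ridge(A, b, lam=1e-2, m=128)
x_full = np.linalg.solve(A.T @ A + 1e-2 * np.eye(2000), A.T @ b)
print(np.linalg.norm(A @ x_sketch - b), np.linalg.norm(A @ x_full - b))
```

The point of the reduced problem is that the m × m system is far cheaper to form and solve than the full d-dimensional primal program, while the data-adaptive basis (SA)ᵀ keeps the subspace aligned with the row space of the data; this matches the paper's stated goal of making the sketching dimension much smaller than the original dimension with little loss in accuracy.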
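The Experiment Setup row lists SVRG with a fixed step size and a fixed update frequency for the gradient correction among the solvers used for both the primal and the sketched programs. The sketch below is a minimal, generic SVRG loop in that style; the toy least-squares problem, step size, and function names are assumptions for illustration, not the paper's configuration.

```python
import numpy as np

def svrg(grad_i, x0, n, step, n_epochs, inner_iters, seed=0):
    """Minimal SVRG with a fixed step size.

    grad_i(x, i) returns the gradient of the i-th component function;
    the full gradient ("gradient correction") is recomputed once per
    epoch, i.e. every `inner_iters` stochastic steps.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_epochs):
        x_ref = x.copy()
        full_grad = np.mean([grad_i(x_ref, i) for i in range(n)], axis=0)
        for _ in range(inner_iters):
            i = rng.integers(n)
            # Variance-reduced stochastic gradient estimate.
            g = grad_i(x, i) - grad_i(x_ref, i) + full_grad
            x -= step * g
    return x

# Usage on a toy least-squares problem, f_i(x) = 0.5 * (a_i^T x - b_i)^2.
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 50))
b = rng.standard_normal(200)

def grad_i(x, i):
    return (A[i] @ x - b[i]) * A[i]

x = svrg(grad_i, np.zeros(50), n=200, step=2e-3, n_epochs=30, inner_iters=400)
# Gradient norm of the averaged objective; it shrinks toward zero as SVRG converges.
print(np.linalg.norm(A.T @ (A @ x - b)) / 200)
```

In the paper's experiments, the same family of solvers is applied to the primal program (1) and to the adaptive sketched program (2); in the latter case (Sketch-SGD, Sketch-SVRG, Sketch-Newton) the iterates live in the m-dimensional sketched variable rather than the original high-dimensional one.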