Efficient Adaptive Online Learning via Frequent Directions
Authors: Yuanyu Wan, Nan Wei, Lijun Zhang
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To verify the efficiency and effectiveness of our ADA-FD, we conduct several numerical experiments on online convex optimization and training CNNs. The results show that our ADA-FD performs comparably with ADA-FULL but is much more efficient. |
| Researcher Affiliation | Academia | Yuanyu Wan, Nan Wei and Lijun Zhang National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China {wanyy, zhanglj}@lamda.nju.edu.cn, nwei@smail.nju.edu.cn |
| Pseudocode | Yes | Algorithm 1 Adaptive Dual Averaging via Frequent Directions Algorithm 2 Adaptive Mirror Descent via Frequent Directions |
| Open Source Code | No | No explicit statement about open-sourcing the code or a link to a repository is provided. |
| Open Datasets | Yes | Two real-world datasets from the LIBSVM repository [Chang and Lin, 2011]: Gisette and Epsilon; plus the MNIST [LeCun et al., 1998], CIFAR10 [Krizhevsky, 2009], and SVHN [Netzer et al., 2011] datasets. |
| Dataset Splits | No | The paper states the datasets are divided into a "training part and testing part" with specific counts (e.g., Gisette 6,000/1,000 and Epsilon 400,000/100,000), but it does not mention a separate validation split, although parameter tuning is implied. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) are mentioned for running experiments. |
| Software Dependencies | No | The paper mentions "Keras examples directory" but does not provide specific version numbers for Keras or any other software dependencies. |
| Experiment Setup | Yes | Parameters η and δ are searched in {1e-4, 1e-3, ..., 100} (online regression/classification) and {1e-8, 1e-7, ..., 1} (CNN). For online regression/classification, τ = 10 is used for methods with matrix approximation, with sketching size τ = 10 for Gisette and τ = 40 for Epsilon. For CNN training, the batch size is 128 and the sketching size is τ = 20 for all datasets. |
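The algorithms named in the Pseudocode row (ADA-FD's dual averaging and mirror descent variants) approximate the second-moment matrix of observed gradients with a Frequent Directions sketch of size τ instead of the full matrix. The paper's own pseudocode is not reproduced here; the following is a minimal, generic Frequent Directions sketch (Liberty's algorithm, shrinking after every inserted row) using numpy, with `ell` playing the role of the sketching size τ:

```python
import numpy as np

def frequent_directions(A, ell):
    """Stream the rows of A into an ell x d sketch B with A^T A ~= B^T B.

    Generic Frequent Directions: insert each row into the (zeroed) last
    slot of B, then shrink all singular values by the smallest one so the
    last row becomes zero again and the next insertion has a free slot.
    """
    _, d = A.shape
    B = np.zeros((ell, d))
    for row in A:
        B[-1] = row                       # fill the free slot
        _, s, Vt = np.linalg.svd(B, full_matrices=False)
        delta = s[-1] ** 2                # smallest squared singular value
        s = np.sqrt(np.maximum(s ** 2 - delta, 0.0))
        B = s[:, None] * Vt               # last row is now exactly zero
    return B
```

With sketch size ell, this maintains the standard guarantee ‖AᵀA − BᵀB‖₂ ≤ 2‖A‖²_F / ell while storing only ell × d numbers, which is the source of ADA-FD's efficiency over maintaining the full d × d matrix as in ADA-FULL.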