Accelerating Stratified Sampling SGD by Reconstructing Strata

Authors: Weijie Liu, Hui Qian, Chao Zhang, Zebang Shen, Jiahao Xie, Nenggan Zheng

IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Numerical experiments corroborate our theory and demonstrate that SGD-RS achieves at least 3.48-times speed-ups compared to vanilla mini-batch SGD." and "In this section, we compare SGD-RS with state-of-the-art algorithms, including SGD-ss [Zhao and Zhang, 2014], PDS [Zhang et al., 2019], Upper-bound [Katharopoulos and Fleuret, 2018], RAIS [Johnson and Guestrin, 2018], VRB [Borsos et al., 2018], and vanilla mini-batch SGD."
Researcher Affiliation | Academia | Qiushi Academy for Advanced Studies, Zhejiang University, Hangzhou, Zhejiang, China; College of Computer Science and Technology, Zhejiang University, Hangzhou, Zhejiang, China; University of Pennsylvania, Philadelphia, Pennsylvania
Pseudocode | Yes | Algorithm 1 (Stochastic Stratifying) and Algorithm 2 (SGD-RS); a hedged sketch of the general stratified-sampling SGD loop appears below the table.
Open Source Code | No | The paper does not provide a statement of code release or a link to source code for the described methodology.
Open Datasets | Yes | "We conduct logistic regression experiments on three real-world benchmark datasets: rcv1, ijcnn1, and w8a. These datasets are downloaded from the LIBSVM website https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/" and "We evaluate the empirical performance of SGD-RS in image classification benchmark datasets: MNIST, CIFAR10, and CIFAR100." A loading sketch for the LIBSVM-format datasets appears below the table.
Dataset Splits | No | The paper uses well-known datasets but does not explicitly provide train/validation/test split percentages or sample counts, nor does it refer to predefined splits with citations for reproducibility.
Hardware Specification | No | The paper does not provide hardware details such as CPU/GPU models or memory specifications used for the experiments.
Software Dependencies | No | The paper does not specify software dependencies with version numbers, such as the programming language or machine learning libraries used for implementation.
Experiment Setup | Yes | "The datasets and the corresponding parameter setup are summarized in Table 1." and "On MNIST, we train a simple network that has three fully-connected layers and two ReLU layers. We train VGG-11 [Simonyan and Zisserman, 2014] on CIFAR10 and ResNet-18 [He et al., 2016] on CIFAR100 respectively, and the networks are initialized by running vanilla mini-batch SGD for 50 epochs." A sketch of the MNIST network appears below the table.
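
The paper's Algorithm 1 (Stochastic Stratifying) and Algorithm 2 (SGD-RS) are given as pseudocode in the PDF and are not reproduced verbatim here. The snippet below is a minimal sketch of the general pattern they instantiate: mini-batch SGD whose strata are periodically reconstructed and then sampled from proportionally. The stratification rule used here (k-means on raw features), the proportional allocation, the hypothetical `grad_fn` oracle, and all hyperparameters are illustrative assumptions, not the paper's exact procedure.

```python
# Sketch: mini-batch SGD with periodically reconstructed strata.
# NOT the paper's exact SGD-RS; stratification rule and hyperparameters
# below are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans

def stratified_sgd(X, y, grad_fn, w0, lr=0.1, batch_size=32,
                   n_strata=10, restratify_every=100, n_steps=1000, seed=0):
    rng = np.random.default_rng(seed)
    w = w0.copy()
    strata = None
    n = len(y)
    for t in range(n_steps):
        # Periodically rebuild strata. Here we simply cluster the raw
        # features with k-means; the paper instead reconstructs strata so
        # that per-stratum gradient variation stays small.
        if t % restratify_every == 0:
            labels = KMeans(n_clusters=n_strata, n_init=3,
                            random_state=seed).fit_predict(X)
            strata = [np.flatnonzero(labels == k) for k in range(n_strata)]
        # Stratified mini-batch: proportional allocation across strata.
        idx = []
        for s in strata:
            if len(s) == 0:
                continue
            n_s = max(1, round(batch_size * len(s) / n))
            idx.extend(rng.choice(s, size=min(n_s, len(s)), replace=False))
        idx = np.asarray(idx)
        # Plain SGD step on the stratified mini-batch.
        w = w - lr * grad_fn(w, X[idx], y[idx])
    return w
```

Any gradient oracle can be plugged in as `grad_fn`, for instance a regularized logistic-regression gradient for the convex experiments described above.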
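
The rcv1, ijcnn1, and w8a files referenced under Open Datasets are distributed in LIBSVM (svmlight) text format. A minimal loading sketch follows, assuming the files have already been downloaded from the LIBSVM datasets page; the local file names are assumptions.

```python
# Sketch: load LIBSVM-format datasets for the logistic-regression
# experiments. File names/paths are assumed; download the files first
# from https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/
from sklearn.datasets import load_svmlight_file

X_rcv1, y_rcv1 = load_svmlight_file("rcv1_train.binary")  # sparse CSR matrix
X_ijcnn1, y_ijcnn1 = load_svmlight_file("ijcnn1")
X_w8a, y_w8a = load_svmlight_file("w8a")
print(X_rcv1.shape, X_ijcnn1.shape, X_w8a.shape)
```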
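
The MNIST architecture under Experiment Setup is described only as three fully-connected layers with two ReLU layers. Below is a minimal PyTorch sketch consistent with that description; the hidden widths (256 and 128) are assumptions, since the paper's parameter table is not reproduced on this page.

```python
# Sketch of the MNIST network: three fully-connected layers with two ReLU
# activations in between. Only the layer count and activation pattern come
# from the paper; the hidden widths are illustrative assumptions.
import torch.nn as nn

mnist_net = nn.Sequential(
    nn.Flatten(),        # 28x28 images -> 784-dim vectors
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 128),
    nn.ReLU(),
    nn.Linear(128, 10),  # logits for the 10 MNIST classes
)
```

For CIFAR10 and CIFAR100 the paper uses the standard VGG-11 and ResNet-18 architectures, which are available off the shelf (e.g. torchvision.models.vgg11 and torchvision.models.resnet18), warm-started with 50 epochs of vanilla mini-batch SGD.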