Submodular Batch Selection for Training Deep Neural Networks

Authors: K J Joseph, Vamshi Teja R, Krishnakant Singh, Vineeth N Balasubramanian

Venue: IJCAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our extensive experiments on standard datasets show that the deep models trained using the proposed batch selection strategy provide better generalization than Stochastic Gradient Descent as well as a popular baseline sampling strategy across different learning rates, batch sizes, and distance metrics.
Researcher Affiliation | Academia | K J Joseph, Vamshi Teja R, Krishnakant Singh, Vineeth N Balasubramanian, Indian Institute of Technology, Hyderabad; {cs17m18p100001,ee15btech11023,cs15mtech11007,vineethnb}@iith.ac.in
Pseudocode | Yes | Algorithm 1: GETMINIBATCH; Algorithm 2: SUBMODULAR SGD
Open Source Code | Yes | Source code and supplementary material which includes additional results is available here: https://josephkj.in/projects/SMDL
Open Datasets | Yes | We study the performance on the standard image classification task (as used in related earlier efforts) with SVHN [Netzer et al., 2011], CIFAR-10 and CIFAR-100 [Krizhevsky and Hinton, 2009] datasets.
Dataset Splits | No | The paper describes using the SVHN, CIFAR-10, and CIFAR-100 datasets and evaluates performance via test accuracy and test loss, but it does not specify a separate validation split (e.g., X% training, Y% validation, Z% test) for hyperparameter tuning or early stopping.
Hardware Specification | No | The paper does not provide details of the hardware used for the experiments, such as GPU models, CPU types, or memory specifications; it only mentions software such as PyTorch.
Software Dependencies | No | The paper mentions PyTorch [Paszke et al., 2017] as a tool used, but it does not specify version numbers for PyTorch or any other software libraries or dependencies, which are required for reproducibility.
Experiment Setup | Yes | After a grid search and an empirical study, we use the following values for the coefficients of the terms in the objective function: λ1 = 0.2, λ2 = 0.1, λ3 = 0.5, λ4 = 0.2. All the experiments are run for 100 epochs with a batch size of 50, a momentum parameter of 0.9 and weight decay of 0.0001. We use a refresh rate of 5 for all the experiments. The partition size (m in Algorithm 1 and 2) is set to 10.
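
The Pseudocode row above names a GETMINIBATCH routine that builds each mini-batch by greedily maximizing a submodular score. The paper's actual objective terms are not reproduced in this excerpt, so the sketch below is only a minimal illustration of greedy submodular mini-batch selection: the `marginal_gain` callable, the `diversity_gain` example, and the random features are placeholders, not the authors' formulation.

```python
import numpy as np

def get_minibatch(candidates, batch_size, marginal_gain):
    """Greedily build a mini-batch by repeatedly adding the candidate with the
    largest marginal gain under a submodular scoring function.

    `marginal_gain(selected, candidate)` stands in for the paper's weighted
    combination of objective terms (the λ1..λ4-weighted sum).
    """
    selected = []
    remaining = set(candidates)
    while len(selected) < batch_size and remaining:
        # Pick the candidate with the highest marginal gain relative to the
        # samples already chosen for this mini-batch.
        best = max(remaining, key=lambda c: marginal_gain(selected, c))
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy usage: a diversity-style gain that rewards samples far (in feature space)
# from those already selected. The features here are random stand-ins.
features = np.random.randn(100, 16)

def diversity_gain(selected, candidate):
    if not selected:
        return 0.0
    return float(min(np.linalg.norm(features[candidate] - features[s]) for s in selected))

batch = get_minibatch(range(100), batch_size=50, marginal_gain=diversity_gain)
```

In the paper's setting this selection would be run per partition (of size m = 10, per the setup row) and the candidate scores refreshed every few epochs; those details are taken from the Experiment Setup row, not from the sketch itself.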
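
The Experiment Setup row lists concrete hyperparameter values. As a rough illustration, a PyTorch training configuration matching those values might look like the sketch below; the model is a placeholder, the learning rate is not stated in this excerpt and is left as an assumption, and the constant names are our own.

```python
import torch.nn as nn
import torch.optim as optim

# Values quoted in the Experiment Setup row.
EPOCHS = 100
BATCH_SIZE = 50
MOMENTUM = 0.9
WEIGHT_DECAY = 1e-4
REFRESH_RATE = 5      # epochs between re-scoring candidate samples
PARTITION_SIZE = 10   # "m" in Algorithms 1 and 2
LAMBDAS = (0.2, 0.1, 0.5, 0.2)  # coefficients of the submodular objective terms

# Placeholder model; the paper's architectures are not specified in this excerpt.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))

# Learning rate is assumed here; the excerpt does not report it.
optimizer = optim.SGD(model.parameters(), lr=0.1,
                      momentum=MOMENTUM, weight_decay=WEIGHT_DECAY)
criterion = nn.CrossEntropyLoss()
```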