Budgeted Training: Rethinking Deep Neural Network Training Under Resource Constraints

Authors: Mengtian Li, Ersin Yumer, Deva Ramanan

Venue: ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We support our claim through extensive experiments with state-of-the-art models on ImageNet (image classification), Kinetics (video classification), MS COCO (object detection and instance segmentation), and Cityscapes (semantic segmentation).
Researcher Affiliation | Collaboration | Mengtian Li (Carnegie Mellon University, mtli@cs.cmu.edu); Ersin Yumer (Uber ATG, meyumer@gmail.com); Deva Ramanan (CMU & Argo AI, deva@cs.cmu.edu)
Pseudocode | No | The paper does not contain any explicitly labeled 'Pseudocode' or 'Algorithm' blocks or figures.
Open Source Code | No | The paper mentions adapting existing open-source codebases for their experiments (e.g., 'We adapt both the network architecture (ResNet-18) and the data loader from the open source PyTorch ImageNet example...'), but it does not state that their own developed methodology or contributions are open-source, nor does it provide a link to their specific implementation code.
Open Datasets | Yes | CIFAR-10 (Krizhevsky & Hinton, 2009) is a dataset that contains 60,000 tiny images (32×32). ImageNet (Russakovsky et al., 2015) is a widely adopted standard for the image classification task. MS COCO (Lin et al., 2014) is a widely recognized benchmark for object detection and instance segmentation. Cityscapes (Cordts et al., 2016) is a dataset commonly used for evaluating semantic segmentation algorithms. Kinetics (Kay et al., 2017) is a large-scale dataset of YouTube videos focusing on human actions.
Dataset Splits | Yes | We follow the standard setup for dataset split (Huang et al., 2017b), which is randomly holding out 5,000 from the 50,000 training images to form the validation set. (A split sketch follows the table.)
Hardware Specification | No | The paper mentions the number of GPUs used for training (e.g., 'training using 4 GPUs', 'train with 8 GPUs') and notes the use of asynchronous or synchronous batch normalization, but it does not specify the particular GPU models (e.g., NVIDIA V100, A100) or any details about CPU or memory specifications.
Software Dependencies | Yes | We adapt both the network architecture (ResNet-18) and the data loader from the open source PyTorch ImageNet example. PyTorch version 0.4.1. We use an open source codebase that has training and data processing code publicly available. Caffe2 version 0.8.1. We use the open source implementation of Mask R-CNN, which is a PyTorch re-implementation of the official codebase Detectron in the Caffe2 framework. PyTorch version 0.4.1.
Experiment Setup | Yes | We use ResNet-18 (He et al., 2016) as the backbone architecture and utilize SGD with base learning rate 0.1, momentum 0.9, weight decay 0.0005 and a batch size of 128. For training, we adopt the 1x schedule (90k iterations)... We train with 8 GPUs (batch size 16) and keep the built-in learning rate warm up mechanism... (A training-setup sketch follows the table.)
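
The Dataset Splits row quotes a random hold-out of 5,000 of the 50,000 CIFAR-10 training images for validation. Below is a minimal sketch of such a split using the standard torchvision CIFAR-10 loader; the paper follows the setup of Huang et al. (2017b) but does not give its split code, so the seed and index logic here are assumptions for illustration only.

import torch
from torch.utils.data import Subset
from torchvision import datasets, transforms

# Load the full 50,000-image CIFAR-10 training set.
train_full = datasets.CIFAR10(root="./data", train=True, download=True,
                              transform=transforms.ToTensor())

# Randomly hold out 5,000 images for validation (the seed is illustrative).
perm = torch.randperm(len(train_full),
                      generator=torch.Generator().manual_seed(0))
val_set = Subset(train_full, perm[:5000].tolist())
train_set = Subset(train_full, perm[5000:].tolist())  # remaining 45,000 images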
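
The Experiment Setup row quotes ResNet-18 trained with SGD at base learning rate 0.1, momentum 0.9, and weight decay 0.0005. The sketch below wires up that optimizer configuration; the model constructor, the linearly decaying schedule, and the 90k-iteration budget (quoted only for the detection 1x schedule) are assumptions, and the paper's actual budget-aware learning rate schedules are not reproduced here.

import torch
from torchvision import models

model = models.resnet18(num_classes=1000)
optimizer = torch.optim.SGD(model.parameters(),
                            lr=0.1,             # base learning rate as quoted
                            momentum=0.9,
                            weight_decay=5e-4)  # 0.0005 as quoted

# Illustrative schedule decaying the learning rate linearly to zero over a
# fixed iteration budget (budget value is an assumption, see lead-in above).
total_iters = 90000
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda it: max(0.0, 1.0 - it / total_iters))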