DeepME: Deep Mixture Experts for Large-scale Image Classification

Authors: Ming He, Guangyi Lv, Weidong He, Jianping Fan, Guihua Zeng

IJCAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experimental results on ImageNet10K have demonstrated that our proposed deep mixture algorithm can achieve very competitive results (top-1 accuracy: 32.13%) on large-scale image classification tasks. We performed experiments on ImageNet10K, which is one of the most well-known image datasets for visual classification, and contains 10,184 image categories and 9M images. Furthermore, we use an 85%-5%-10% train/validation/test split.
Researcher Affiliation | Collaboration | Ming He (1,2,4), Guangyi Lv (2), Weidong He (2), Jianping Fan (3), Guihua Zeng (1); (1) Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; (2) University of Science and Technology of China, Hefei, Anhui 230000, China; (3) AI Lab at Lenovo Research, Beijing, China; (4) Didi Chuxing, Beijing, China
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the methodology is openly available.
Open Datasets | Yes | We use the ImageNet10K dataset [Deng et al., 2009] with 10,184 image categories to evaluate our deep mixture algorithm on large-scale image classification. We performed experiments on ImageNet10K, which is one of the most well-known image datasets for visual classification, and contains 10,184 image categories and 9M images.
Dataset Splits | Yes | Furthermore, we use an 85%-5%-10% train/validation/test split.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions 'Stochastic Gradient Descent (SGD) with momentum 0.9' and the 'Glorot Normal initializer,' but does not provide version numbers for any software libraries or dependencies (e.g., Python, PyTorch, TensorFlow).
Experiment Setup | Yes | In DeepME, the training process consists of two parts: 1. Training of the base CNN. We utilize Stochastic Gradient Descent (SGD) with momentum 0.9 to learn the base network. The training is divided into two stages: 1) the warm-up stage and 2) the fine-tuning stage. In the warm-up stage, the learning rate is decayed exponentially from 0.01 to 0.001, while in the fine-tuning stage it is decayed from 0.001 to 0.00001. We use batch size 256 and L2 regularization on the corresponding parameters with weight 0.0005. 2. Training of the gate network. To initialize the parameters of the gate network, the Glorot Normal initializer is adopted as suggested in [Orr and Müller, 2003]. SGD with momentum is also used to learn the network, and the batch size is 256. The initial learning rate is 0.001 and is then exponentially decayed to 0.0001.
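
The reported experiment setup can be approximated in code. The following is a minimal sketch, assuming PyTorch (the paper does not name a framework); the model definitions, epoch counts, and names such as `base_cnn`, `gate_net`, and `make_stage` are illustrative placeholders, and only the optimizer, momentum, weight decay, batch size, initializer, and learning-rate ranges reflect the settings quoted above.

```python
# Sketch of the reported DeepME training configuration (assumes PyTorch).
# Models and epoch counts are placeholders; the paper does not specify them.
import torch
from torch import nn
from torch.optim.lr_scheduler import ExponentialLR


def make_stage(model: nn.Module, lr_start: float, lr_end: float, epochs: int):
    """SGD (momentum 0.9, L2 weight 0.0005) with a learning rate decayed
    exponentially from lr_start to lr_end over `epochs` epochs."""
    opt = torch.optim.SGD(model.parameters(), lr=lr_start,
                          momentum=0.9, weight_decay=5e-4)
    gamma = (lr_end / lr_start) ** (1.0 / max(epochs - 1, 1))
    return opt, ExponentialLR(opt, gamma=gamma)


# 1. Base CNN: warm-up stage (0.01 -> 0.001), then fine-tuning (0.001 -> 0.00001).
base_cnn = nn.Sequential(nn.Conv2d(3, 64, 3), nn.ReLU(), nn.Flatten())  # placeholder model
warmup_opt, warmup_sched = make_stage(base_cnn, 0.01, 0.001, epochs=30)
finetune_opt, finetune_sched = make_stage(base_cnn, 0.001, 0.00001, epochs=30)

# 2. Gate network: Glorot (Xavier) Normal initialization, SGD with momentum,
#    learning rate 0.001 exponentially decayed to 0.0001.
gate_net = nn.Linear(2048, 64)  # placeholder dimensions
nn.init.xavier_normal_(gate_net.weight)
gate_opt, gate_sched = make_stage(gate_net, 0.001, 0.0001, epochs=30)

# Both parts use batch size 256, e.g.:
# loader = torch.utils.data.DataLoader(dataset, batch_size=256, shuffle=True)
```

In this sketch each scheduler's `step()` would be called once per epoch, so the learning rate reaches the reported end value by the final epoch of each stage.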