Accelerated Training for Massive Classification via Dynamic Class Selection
Authors: Xingcheng Zhang, Lei Yang, Junjie Yan, Dahua Lin
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On several large-scale benchmarks, our method significantly reduces the training cost and memory demand, while maintaining competitive performance. We test our method on three benchmarks on face recognition/verification, which is the application that motivates this work. We not only compare it with various methods, but also investigate how different factors influence the performance and cost, via a series of ablation studies. |
| Researcher Affiliation | Collaboration | Xingcheng Zhang¹, Lei Yang¹, Junjie Yan², Dahua Lin¹. ¹Department of Information Engineering, The Chinese University of Hong Kong; ²SenseTime Group Limited |
| Pseudocode | Yes | Algorithm 1 Build Hashing Tree |
| Open Source Code | No | The paper does not provide an explicit statement about the release of its source code or a link to a code repository for its methodology. |
| Open Datasets | Yes | MS-Celeb-1M (Guo et al. 2016). Megaface (MF2) (Kemelmacher-Shlizerman et al. 2016). LFW (Huang et al. 2007), IJB-A (Klare et al. 2015), and Megaface (Kemelmacher-Shlizerman et al. 2016) & Facescrub (Ng and Winkler 2014). |
| Dataset Splits | No | The paper mentions training epochs and monitoring CPK values but does not specify explicit training/validation/test splits with percentages or sample counts. It refers to distinct training and testing sets, but not a validation split from the training data. |
| Hardware Specification | Yes | On the other hand, current GPUs only come with limited memory capacity, e.g. the memory capacity of Tesla P100 is up to 16 GB. The training is done on a server with 8 NVIDIA Titan X GPUs. |
| Software Dependencies | No | The paper mentions network architectures such as Hynet and ResNet-101 and the use of SGD with momentum, but does not provide specific version numbers for any software dependencies (e.g., Python, PyTorch, TensorFlow, CUDA). |
| Experiment Setup | Yes | For all settings, the networks are trained using SGD with momentum. The mini-batch sizes are set to 512 and 256 respectively for Hynet and ResNet-101. We will rebuild the hashing forest every T iterations in order to stay updated. We set M to be the minimum number such that the average top-M cumulative probability is above a threshold τcp. (A hedged sketch of this dynamic class selection scheme follows the table.) |
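
The dynamic class selection described above (Algorithm 1, Build Hashing Tree, plus the top-M / τcp selection rule) can be illustrated with a short sketch. This is not the authors' implementation: `HashingTree`, `select_active_classes`, and all hyperparameter values are hypothetical names and settings chosen only to show the idea of routing features through random-projection trees, gathering candidate classes, and keeping the smallest top-M set whose average cumulative softmax probability exceeds τcp, while always retaining the ground-truth classes of the mini-batch.

```python
# Minimal sketch, assuming numpy only; HashingTree / select_active_classes are
# illustrative names, not the authors' API.
import numpy as np


class HashingTree:
    """Recursively partition class weight vectors with random hyperplanes."""

    def __init__(self, weights, class_ids, depth=0, max_depth=10, leaf_size=64):
        self.left = self.right = None
        self.class_ids = class_ids
        if depth >= max_depth or len(class_ids) <= leaf_size:
            return  # leaf node keeps its class ids
        # Random hyperplane; the sign of the projection routes classes left/right.
        self.direction = np.random.randn(weights.shape[1])
        proj = weights[class_ids] @ self.direction
        self.threshold = np.median(proj)
        mask = proj <= self.threshold
        self.left = HashingTree(weights, class_ids[mask], depth + 1, max_depth, leaf_size)
        self.right = HashingTree(weights, class_ids[~mask], depth + 1, max_depth, leaf_size)

    def query(self, feature):
        """Return the candidate class ids in the leaf this feature routes to."""
        if self.left is None:
            return self.class_ids
        go_left = feature @ self.direction <= self.threshold
        return (self.left if go_left else self.right).query(feature)


def select_active_classes(forest, weights, feats, labels, tau_cp=0.95):
    """Pick a small active class set for one mini-batch (hypothetical helper).

    Candidates come from the hashing forest; we then keep the smallest top-M set
    whose average cumulative softmax probability exceeds tau_cp, always including
    the ground-truth classes of the batch.
    """
    candidates = set(labels.tolist())
    for f in feats:
        for tree in forest:
            candidates.update(tree.query(f).tolist())
    cand = np.fromiter(candidates, dtype=np.int64)

    # Softmax restricted to the candidate classes only.
    logits = feats @ weights[cand].T
    logits -= logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)

    # Smallest M such that the average cumulative probability exceeds tau_cp.
    avg = probs.mean(axis=0)
    order = np.argsort(-avg)
    M = int(np.searchsorted(np.cumsum(avg[order]), tau_cp) + 1)
    selected = set(cand[order[:M]].tolist()) | set(labels.tolist())
    return np.fromiter(selected, dtype=np.int64)


if __name__ == "__main__":
    # Toy run: 100k classes, 256-d features, a forest of 4 trees.
    rng = np.random.default_rng(0)
    W = rng.standard_normal((100_000, 256)).astype(np.float32)
    forest = [HashingTree(W, np.arange(W.shape[0])) for _ in range(4)]
    feats = rng.standard_normal((8, 256)).astype(np.float32)
    labels = rng.integers(0, 100_000, size=8)
    print("active classes:", select_active_classes(forest, W, feats, labels).shape[0])
```

In a full training loop, only the weight rows of the selected classes would need to reside in GPU memory and be updated by SGD with momentum, and the forest would be rebuilt every T iterations to track the evolving classifier weights, which is what yields the reported savings in training cost and memory.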