Reinforced Multi-Label Image Classification by Exploring Curriculum

Authors: Shiyi He, Chang Xu, Tianyu Guo, Chao Xu, Dacheng Tao

AAAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on PASCAL VOC2007 and 2012 demonstrate the necessity of reinforcement multi-label learning and the algorithm's effectiveness in real-world multi-label image classification tasks.
Researcher Affiliation | Academia | Shiyi He (1,3), Chang Xu (2), Tianyu Guo (1,3), Chao Xu (1,3), Dacheng Tao (2). Affiliations: (1) Key Laboratory of Machine Perception (MOE), School of EECS, Peking University, China; (2) UBTECH Sydney AI Centre, SIT, FEIT, University of Sydney, Australia; (3) Cooperative Medianet Innovation Center, Peking University, China.
Pseudocode | Yes | Algorithm 1: Deep Q-learning for RMIC
Open Source Code | No | The paper does not provide concrete access to source code (a specific repository link, an explicit code release statement, or code in supplementary materials) for the methodology described in this paper.
Open Datasets | Yes | The proposed algorithm was evaluated on the PASCAL VOC2007 and PASCAL VOC2012 datasets (Everingham et al. 2010).
Dataset Splits | Yes | These two databases contain 9,963 and 22,531 images, respectively, and are divided into train, val, and test subsets. We merge the train set and the val set into a trainval set and conduct our experiments on the trainval/test splits (5,011/4,952 for VOC2007 and 11,540/10,991 for VOC2012).
Hardware Specification | Yes | The algorithm was implemented on the publicly available Keras platform on a single NVIDIA GeForce Titan X GPU with 12 GB memory.
Software Dependencies | No | The paper mentions implementation on the Keras platform but does not provide specific version numbers for Keras or any other software dependencies.
Experiment Setup | Yes | The output layer of the Q-network is a linear layer with a single output for each valid action or label. Since there are 20 categories in the VOC database, the numbers of neurons in the three fully connected layers of the deep Q-network were set to 512, 128, and 20, respectively. Each action was represented by a 19-dimensional vector and the action history h encoded 2 past actions. We trained the network for 3 epochs, and each epoch ended after the agent had interacted with all training images. During ϵ-greedy training, ϵ was annealed linearly from 1 to 0.2 over the first 2 epochs to progressively allow the agent to use its own learned model. ϵ was then fixed at 0.2 in the last epoch, during which the agent further adjusted its network parameters. The mini-batch size was set to 32.
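
The Experiment Setup row can be illustrated with a short sketch. Below is a minimal, hypothetical Keras reconstruction (not the authors' released code, which is unavailable) of the described deep Q-network head and the linear ϵ schedule: three fully connected layers of 512, 128, and 20 units, a linear output per action/label, an action history of 2 past actions encoded as 19-dimensional vectors, and ϵ annealed from 1 to 0.2 over the first 2 of 3 epochs. The image-feature dimension, the exact state composition (feature vector concatenated with action history), and the optimizer/loss are assumptions not specified in the excerpt.

```python
# Hypothetical sketch of the Q-network head and epsilon schedule described in the
# Experiment Setup row; sizes follow the paper, other choices are assumptions.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_ACTIONS = 20    # 20 VOC categories, one Q-value per valid action/label
ACTION_DIM = 19     # each past action represented by a 19-dimensional vector
HISTORY_LEN = 2     # action history h encodes 2 past actions
FEATURE_DIM = 4096  # assumed image-feature size (not stated in the excerpt)

def build_q_network():
    """Three fully connected layers (512, 128, 20); the final layer is linear,
    producing one Q-value per valid action/label."""
    state = layers.Input(shape=(FEATURE_DIM + HISTORY_LEN * ACTION_DIM,))
    x = layers.Dense(512, activation="relu")(state)
    x = layers.Dense(128, activation="relu")(x)
    q_values = layers.Dense(NUM_ACTIONS, activation=None)(x)  # linear output layer
    model = models.Model(inputs=state, outputs=q_values)
    # Optimizer and loss are assumptions; the paper excerpt does not state them.
    model.compile(optimizer="adam", loss="mse")
    return model

def epsilon_at(epochs_done, anneal_epochs=2.0, eps_start=1.0, eps_end=0.2):
    """Linear annealing of epsilon from 1.0 to 0.2 over the first 2 epochs,
    then fixed at 0.2 for the final epoch. `epochs_done` is a float number of
    epochs completed (e.g. 1.5 = halfway through the second epoch)."""
    if epochs_done >= anneal_epochs:
        return eps_end
    return eps_start + (epochs_done / anneal_epochs) * (eps_end - eps_start)

if __name__ == "__main__":
    q_net = build_q_network()
    # Epsilon-greedy action selection on dummy states, mini-batch size 32.
    states = np.random.rand(32, FEATURE_DIM + HISTORY_LEN * ACTION_DIM).astype("float32")
    q = q_net.predict(states, verbose=0)
    eps = epsilon_at(epochs_done=1.0)  # e.g. at the end of the first epoch
    greedy = q.argmax(axis=1)
    explore = np.random.rand(len(greedy)) < eps
    actions = np.where(explore, np.random.randint(NUM_ACTIONS, size=len(greedy)), greedy)
    print("epsilon:", eps, "first few actions:", actions[:5])
```

The sketch only covers action selection and the network head; the replay memory, reward definition, and target computation of Algorithm 1 (Deep Q-learning for RMIC) are not reproduced here.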