Multi-Label Classification with Label Graph Superimposing

Authors: Ya Wang, Dongliang He, Fu Li, Xiang Long, Zhichao Zhou, Jinwen Ma, Shilei Wen

AAAI 2020, pp. 12265-12272

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments are carried out on MSCOCO and Charades datasets, showing that our proposed solution can greatly improve the recognition performance and achieves new state-of-the-art recognition performance.
Researcher Affiliation | Collaboration | School of Mathematical Sciences and LMAM, Peking University, China; Department of Computer Vision Technology (VIS), Baidu Inc., Beijing, China; {wangyachn, jwma}@math.pku.edu.cn, {hedongliang01, lifu, longxiang, zhouzhichao01, wenshilei}@baidu.com
Pseudocode | No | The paper includes block diagrams and mathematical formulations but no explicit pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a link to a code repository.
Open Datasets | Yes | MS-COCO (Lin et al. 2014) is a static image dataset... Charades (Sigurdsson et al. 2016) is a multilabel video dataset...
Dataset Splits | Yes | MS-COCO... It contains about 82K images for training, 41K for validation and 41K for test... Charades... containing around 9.8K videos, among which about 8K for training and 1.8K for validation.
Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments (e.g., GPU/CPU models, memory).
Software Dependencies | No | The paper mentions optimizers, activation functions, and pre-trained models but does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | Adam is used as the optimizer with a momentum of 0.9, weight decay of 10^-4 and batch size of 80. The initial learning rate of Adam is 0.01. All models are trained for 100 epochs in total. We train all models with mini-batch size of 16 clips. Adam is used as the optimizer, starting with a momentum of 0.9 and weight decay of 10^-4. The weight decays of all bias are set to zero. Dropout (Hinton et al. 2012) with a ratio of 0.5 is added after the average pooled CNN features. The initial learning rate of GCN parameters is set to be 0.001, while others are set to be 10^-4. We use the strategy proposed in (He et al. 2015) to initialize the GCN and initial label embeddings are extracted with ConceptNet (Speer, Chin, and Havasi 2017).
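
The Experiment Setup row above is essentially a training configuration. The following is a minimal PyTorch sketch of how such a configuration could be assembled, not the authors' released code: the LabelGraphModel class, its cnn/gcn submodule split, and the reading of the reported "momentum of 0.9" as Adam's beta1 are all assumptions made for illustration.

```python
# Minimal sketch of the optimizer/regularization setup described above.
# Assumptions (not from the paper): PyTorch, a placeholder LabelGraphModel with
# `cnn` and `gcn` submodules, and "momentum of 0.9" interpreted as Adam's beta1.
import torch
import torch.nn as nn


class LabelGraphModel(nn.Module):
    """Hypothetical stand-in for the paper's CNN + label-GCN network."""

    def __init__(self, num_classes=80, feat_dim=2048):
        super().__init__()
        self.cnn = nn.Linear(feat_dim, feat_dim)     # placeholder for the CNN backbone
        self.dropout = nn.Dropout(p=0.5)             # dropout after pooled CNN features
        self.gcn = nn.Linear(feat_dim, num_classes)  # placeholder for the label GCN head

    def forward(self, x):
        return self.gcn(self.dropout(self.cnn(x)))


model = LabelGraphModel()

# Three parameter groups: GCN parameters start at 1e-3, other parameters at 1e-4,
# and all biases get zero weight decay ("weight decays of all bias are set to zero").
gcn_params, other_params, bias_params = [], [], []
for name, param in model.named_parameters():
    if name.endswith("bias"):
        bias_params.append(param)
    elif name.startswith("gcn"):
        gcn_params.append(param)
    else:
        other_params.append(param)

optimizer = torch.optim.Adam(
    [
        {"params": gcn_params, "lr": 1e-3},
        {"params": other_params, "lr": 1e-4},
        {"params": bias_params, "lr": 1e-4, "weight_decay": 0.0},
    ],
    betas=(0.9, 0.999),   # beta1 = 0.9, i.e. the reported "momentum"
    weight_decay=1e-4,    # default decay for the non-bias groups
)

criterion = nn.BCEWithLogitsLoss()  # typical multi-label loss (assumed, not quoted above)
```

Using parameter groups keeps the higher initial learning rate confined to the GCN parameters and lets bias terms opt out of weight decay, matching the per-parameter treatment described in the quoted setup.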