Cross-Modality Attention with Semantic Graph Embedding for Multi-Label Classification

Authors: Renchun You, Zhiyao Guo, Lei Cui, Xiang Long, Yingze Bao, Shilei Wen | pp. 12709-12716

AAAI 2020

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | Experiments on two multi-label image classification datasets (MS-COCO and NUS-WIDE) show our method outperforms other existing state-of-the-art methods. In addition, we validate our method on a large multi-label video classification dataset (YouTube-8M Segments) and the evaluation results demonstrate the generalization capability of our method. From Section 4, Experiments: To assess our model, we perform experiments on two benchmark multi-label image recognition datasets (MS-COCO (Lin et al. 2014) and NUS-WIDE (Chua et al. 2009)). We also validate the effectiveness of our model on one multi-label video recognition dataset (YouTube-8M Segments), and the results demonstrate the extensibility of our method.
Researcher Affiliation | Collaboration | Renchun You,1 Zhiyao Guo,2 Lei Cui,3 Xiang Long,1 Yingze Bao,1 Shilei Wen1 — 1 Baidu VIS; 2 Computer Science Department, Xiamen University, China; 3 Department of Computer Science and Technology, Tsinghua University, China. {yourenchun, longxiang, wenshilei}@baidu.com, {guozhiyao45, baoyingze}@gmail.com, cuil19@mails.tsinghua.edu.cn
Pseudocode | No | The paper describes methods using text and mathematical equations but does not include any labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | No | The paper does not contain any explicit statement about releasing source code or provide a link to a code repository for the methodology described.
Open Datasets | Yes | We perform experiments on two benchmark multi-label image recognition datasets (MS-COCO (Lin et al. 2014) and NUS-WIDE (Chua et al. 2009)). We also validate the effectiveness of our model on one multi-label video recognition dataset (YouTube-8M Segments).
Dataset Splits | No | The paper specifies training and testing splits for MS-COCO and NUS-WIDE (e.g., '82,081 images for training and 40,137 images for testing' for MS-COCO), but does not explicitly mention a separate validation set split or its size.
Hardware Specification | No | The paper does not specify any hardware details such as GPU models, CPU types, or memory used for running the experiments. It mentions using 'ResNet-101 network' and 'Inception network', but these refer to models/architectures, not hardware.
Software Dependencies | No | The paper does not explicitly list any software dependencies with specific version numbers (e.g., Python, PyTorch, TensorFlow versions, or specific libraries with their versions).
Experiment Setup | Yes | In the ASGE module, the dimensions of the three hidden layers and label embeddings are all set to 256. The optimizer is Stochastic Gradient Descent (SGD) with momentum 0.9 and the initial learning rate is 0.01. The batch size is set to 64. The optimizer is SGD with momentum 0.9. Weight decay is 10^-5. The initial learning rate is 0.01 and decays by a factor of 10 every 30 epochs. The hyperparameter β in Eq. 12 is 0. in the MS-COCO dataset and 0.4 in the NUS-WIDE dataset. For the training of classification, the initial learning rate is 0.0002 and decays every 2×10^6 samples with momentum 0.8. The hyperparameter β in Eq. 12 is 0. The optimizer is SGD with momentum 0.9. The batch size is 256.
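The quoted image-classification schedule (SGD, momentum 0.9, initial learning rate 0.01, weight decay 10^-5, decay by a factor of 10 every 30 epochs, batch size 64) can be sketched in PyTorch. This is a minimal illustration only: the placeholder linear model and the 90-epoch count are assumptions, not details taken from the paper, whose actual backbone is ResNet-101.

```python
import torch

# Placeholder model (assumption for illustration); the paper uses ResNet-101.
model = torch.nn.Linear(256, 80)

# Settings quoted in the row above: SGD, momentum 0.9,
# initial learning rate 0.01, weight decay 1e-5.
optimizer = torch.optim.SGD(
    model.parameters(), lr=0.01, momentum=0.9, weight_decay=1e-5
)

# Learning rate decays by a factor of 10 every 30 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

for epoch in range(90):  # epoch count is an assumed placeholder
    # ... one pass over the training set in batches of 64 would go here ...
    scheduler.step()
```

After 90 epochs the schedule has decayed three times, so the learning rate ends at 0.01 × 0.1³ = 10^-5.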