Ladder Loss for Coherent Visual-Semantic Embedding

Authors: Mo Zhou, Zhenxing Niu, Le Wang, Zhanning Gao, Qilin Zhang, Gang Hua

AAAI 2020, pp. 13050-13057

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on multiple datasets validate the efficacy of our proposed method, which achieves significant improvement over existing state-of-the-art methods.
Researcher Affiliation | Collaboration | Xidian University, Alibaba Group, Xi'an Jiaotong University, HERE Technologies, Wormpex AI Research
Pseudocode | No | The paper describes its approach and loss functions using mathematical formulations and textual explanations, but does not include a dedicated pseudocode or algorithm block.
Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a link to a code repository for the described methodology.
Open Datasets | Yes | Following related works, Flickr30K (Plummer et al. 2015) and MS-COCO (Lin et al. 2014; Chen et al. 2015) datasets are used in our experiments.
Dataset Splits | Yes | For Flickr30K, we use 1,000 images for validation, 1,000 for testing and the rest for training, which is consistent with (Faghri et al. 2018). For MS-COCO, we also follow (Faghri et al. 2018) and use 5,000 images for both validation and testing. Meanwhile, the remaining 30,504 images in the original validation set are used for training (113,287 training images in total) in our experiments, following (Faghri et al. 2018).
Hardware Specification | Yes | The BERT inference is highly computationally expensive (e.g., a single NVIDIA Titan Xp GPU could compute similarity scores for only approximately 65 sentence pairs per second).
Software Dependencies | No | The paper mentions software like PyTorch, BERT, CBoW, GloVe, and the Adam solver, but does not provide specific version numbers for any of these components.
Experiment Setup | Yes | The dimension of the GRU and the joint embedding space is set at D = 1024. The dimension of the word embeddings used as input to the GRU is set to 300. Additionally, the Adam solver is used for optimization, with the learning rate set at 2e-4 for 15 epochs, and then decayed to 2e-5 for another 15 epochs. We use a mini-batch of size 128 in all experiments in this paper. ...the threshold θ1 for splitting N^q_1 and N^q_2 is fixed at 0.63, and the margins α1 = 0.2, α2 = 0.01, the loss weights β1 = 1, β2 = 0.25.
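
The hyperparameters quoted in the Experiment Setup row (relevance threshold θ1 = 0.63, margins α1 = 0.2 and α2 = 0.01, loss weights β1 = 1 and β2 = 0.25) suggest the shape of a two-rung ladder loss. Since no code is released with the paper, the following is only a minimal PyTorch-style sketch under stated assumptions, not the authors' implementation: the function name two_rung_ladder_loss, the hardest/easiest-negative mining, and the use of a batch-wise similarity matrix with ground-truth pairs on the diagonal are all illustrative choices.

```python
# Hedged sketch of a two-rung ladder loss; hyperparameter defaults follow the
# values quoted in the Experiment Setup row. Not the authors' released code.
import torch


def two_rung_ladder_loss(sim, relevance, theta1=0.63,
                         alpha1=0.2, alpha2=0.01, beta1=1.0, beta2=0.25):
    """Compute a two-rung ladder loss for one retrieval direction.

    sim:       (B, B) similarity matrix; sim[i, i] is the ground-truth pair.
    relevance: (B, B) query-candidate relevance scores (e.g. from CBoW/BERT
               caption similarity) used to split negatives at theta1.
    """
    B = sim.size(0)
    eye = torch.eye(B, dtype=torch.bool, device=sim.device)
    pos = sim.diag().unsqueeze(1)                        # (B, 1) positive scores

    # Rung 1: hardest-negative triplet loss with margin alpha1
    # (the positive must beat every negative by at least alpha1).
    hinge1 = (alpha1 - pos + sim).clamp(min=0).masked_fill(eye, 0)
    rung1 = hinge1.max(dim=1).values

    # Rung 2: margin alpha2 between the more-relevant negatives N^q_1
    # (relevance >= theta1) and the less-relevant negatives N^q_2.
    n1_mask = (~eye) & (relevance >= theta1)
    n2_mask = (~eye) & (relevance < theta1)
    easiest_n1 = sim.masked_fill(~n1_mask, float('inf')).min(dim=1).values
    hardest_n2 = sim.masked_fill(~n2_mask, float('-inf')).max(dim=1).values
    rung2 = (alpha2 - easiest_n1 + hardest_n2).clamp(min=0)

    return (beta1 * rung1 + beta2 * rung2).mean()
```

In the paper the embedding is trained for both retrieval directions, so a full objective would presumably sum this term over image-to-text and text-to-image (i.e. the term above plus the same term evaluated on the transposed similarity and relevance matrices).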