Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Multi-View Visual Semantic Embedding

Authors: Zheng Li, Caili Guo, Zerun Feng, Jenq-Neng Hwang, Xijun Xue

IJCAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on the Flickr30K and MS-COCO datasets demonstrate the superior performance of our framework.
Researcher Affiliation | Collaboration | (1) Beijing Key Laboratory of Network System Architecture and Convergence, Beijing University of Posts and Telecommunications; (2) Beijing Laboratory of Advanced Information Networks, Beijing University of Posts and Telecommunications; (3) University of Washington; (4) China Telecom System Integration Co., Ltd
Pseudocode | No | The paper describes its framework and methods but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described.
Open Datasets | Yes | We evaluate our method on two standard benchmarks: Flickr30K [Young et al., 2014] and MS-COCO [Lin et al., 2014].
Dataset Splits | Yes | Flickr30K dataset contains 31,000 images, each image is annotated with 5 sentences. Following the data split of [Faghri et al., 2018], we use 1,000 images for validation, 1,000 images for testing, and the remaining for training. MS-COCO dataset contains 123,287 images, and each image comes with 5 sentences. We mirror the data split setting of [Faghri et al., 2018]. More specifically, we use 113,287 images for training, 5,000 images for validation, and 5,000 images for testing. We report results on both 1,000 test images (averaged over 5 folds) and the full 5,000 test images.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies | No | The paper mentions software components like ResNet, Faster R-CNN, BiGRU, BERT-base, and Sentence-BERT, but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | Parameters are set as K = 3, λ = 0.7, for both Flickr30K and MS-COCO.
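
The split sizes quoted under Dataset Splits can be sanity-checked with a little arithmetic. The sketch below is illustrative only and is not taken from the paper: the names `Split`, `one_k_folds`, and `mean_over_folds` are hypothetical, the Flickr30K total of 31,000 is taken at face value from the quoted text, and the assumption that the five 1,000-image folds are contiguous, non-overlapping slices of the 5,000 MS-COCO test images is ours.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Split:
    """Image counts for one dataset, as quoted in the summary (hypothetical helper)."""
    total: int
    val: int
    test: int

    @property
    def train(self) -> int:
        # "the remaining for training": whatever is left after val and test.
        return self.total - self.val - self.test


# Figures copied from the Dataset Splits row; nothing here is measured.
FLICKR30K = Split(total=31_000, val=1_000, test=1_000)
MSCOCO = Split(total=123_287, val=5_000, test=5_000)


def one_k_folds(n_test: int = 5_000, fold_size: int = 1_000) -> List[range]:
    """Partition test indices into contiguous folds (assumed layout, not confirmed)."""
    return [range(start, start + fold_size) for start in range(0, n_test, fold_size)]


def mean_over_folds(per_fold_scores: Sequence[float]) -> float:
    """Average a metric (e.g. a recall value) computed separately on each fold."""
    return sum(per_fold_scores) / len(per_fold_scores)


if __name__ == "__main__":
    print("Flickr30K train images:", FLICKR30K.train)
    print("MS-COCO train images:", MSCOCO.train)
    print("MS-COCO 1K folds:", [(f.start, f.stop) for f in one_k_folds()])
```

Under these figures the MS-COCO training count works out to the 113,287 images stated in the row, and the "1,000 test images (averaged over 5 folds)" setting corresponds to scoring each fold separately and passing the five results to `mean_over_folds`.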