Unsupervised Vision-Language Grammar Induction with Shared Structure Modeling
Authors: Bo Wan, Wenjuan Han, Zilong Zheng, Tinne Tuytelaars
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We introduce a new evaluation metric, Critical Concept Recall Rate (CCRR), to explicitly evaluate VL grammar induction, and show a 2.6% improvement over a strong baseline on Flickr30k Entities. We also evaluate our model via two derived tasks, i.e., language grammar induction and phrase grounding, and improve over the state of the art for both. (A generic recall sketch follows the table.) |
| Researcher Affiliation | Collaboration | Bo Wan¹, Wenjuan Han², Zilong Zheng², Tinne Tuytelaars¹. 1: Department of Electrical Engineering, KU Leuven; 2: Beijing Institute for General Artificial Intelligence, Beijing, China |
| Pseudocode | No | The paper describes its model and algorithms using text, equations, and diagrams (Figures 2, 6, 7, 8, and 9), but it does not include a distinct pseudocode block or a clearly labeled algorithm. |
| Open Source Code | Yes | All the code, processed data, and the trained model in this paper are publicly released at https://github.com/bobwan1995/cliora.git. |
| Open Datasets | Yes | We evaluate our method on the Flickr30k Entities (Plummer et al., 2017) and MSCOCO (Lin et al., 2014) datasets. |
| Dataset Splits | Yes | Flickr30k Entities contains 29,783 training images, 1,000 validation images, and 1,000 test images. We use the same MSCOCO split as Zhao & Titov (2020b): 82,783 training images, 1,000 validation images, and 1,000 test images. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions using several tools and libraries like 'Faster R-CNN', 'RoI-Align', 'ELMo', 'GloVe embedding', and 'Benepar', but it does not specify version numbers for any of these software components, nor for broader frameworks like PyTorch or TensorFlow. |
| Experiment Setup | Yes | We load DIORA as an initialization for CLIORA. Other detailed hyper-parameters are provided in Appendix F (Table 5): λ = 0.5, γ = 0.5, epochs = 10, learning rate = 1e-5, batch size = 64. (A hypothetical configuration sketch follows the table.) |
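
To make the Experiment Setup row concrete, the sketch below wires the reported hyper-parameters (λ = 0.5, γ = 0.5, 10 epochs, learning rate 1e-5, batch size 64) into a PyTorch-style setup. The model class, checkpoint path, optimizer choice, and the role of λ and γ are illustrative assumptions, not the authors' released code; the official implementation is at the repository linked above.

```python
# Hypothetical sketch only: the hyper-parameter values match Table 5 /
# Appendix F, but the checkpoint path, optimizer choice, and the role
# of the λ- and γ-weighted terms are assumptions, not the authors'
# code (see https://github.com/bobwan1995/cliora.git).
import torch

CONFIG = {
    "lambda_": 0.5,   # λ from Table 5 (assumed: an auxiliary-loss weight)
    "gamma": 0.5,     # γ from Table 5 (assumed: an auxiliary-loss weight)
    "epochs": 10,
    "lr": 1e-5,
    "batch_size": 64,
}

def init_from_diora(model: torch.nn.Module, ckpt_path: str) -> None:
    """Load DIORA weights as initialization for CLIORA, as the paper
    describes; non-matching keys are skipped (strict=False)."""
    state = torch.load(ckpt_path, map_location="cpu")
    model.load_state_dict(state, strict=False)

def build_optimizer(model: torch.nn.Module) -> torch.optim.Optimizer:
    """Adam with the learning rate reported in the paper (the optimizer
    type itself is an assumption here)."""
    return torch.optim.Adam(model.parameters(), lr=CONFIG["lr"])
```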
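
Similarly, for the CCRR metric cited in the Research Type row, the exact protocol is defined in the paper; the sketch below shows only a generic span-recall computation, under the purely illustrative assumption that CCRR is the fraction of annotated critical-concept spans recovered, by exact match, among a model's predicted constituents.

```python
# Illustrative only: a generic span-recall computation. What counts as
# a "critical concept" and as a match is defined in the paper; this
# sketch simply assumes exact (start, end) span matching.
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token indices of a phrase

def recall_rate(gold_concepts: List[List[Span]],
                predicted_spans: List[List[Span]]) -> float:
    """Fraction of gold concept spans found among the predicted spans,
    micro-averaged over all sentences."""
    hit = total = 0
    for gold, pred in zip(gold_concepts, predicted_spans):
        pred_set = set(pred)
        hit += sum(span in pred_set for span in gold)
        total += len(gold)
    return hit / total if total else 0.0

# Toy usage: two gold concept spans in one sentence, one recovered.
print(recall_rate([[(0, 2), (3, 5)]], [[(0, 2), (2, 5)]]))  # -> 0.5
```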