Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
TiMix: Text-Aware Image Mixing for Effective Vision-Language Pre-training
Authors: Chaoya Jiang, Wei Ye, Haiyang Xu, Qinghao Ye, Ming Yan, Ji Zhang, Shikun Zhang
AAAI 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results demonstrate that TiMix exhibits a comparable performance on downstream tasks, even with a reduced amount of training data and shorter training time, when benchmarked against existing methods. |
| Researcher Affiliation | Collaboration | (1) National Engineering Research Center for Software Engineering, Peking University, Beijing, China; (2) Alibaba Group, Hangzhou, China |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available on https://github.com/chaoyajiang/TiMiX/tree/main. |
| Open Datasets | Yes | Following the previous works (Li et al. 2021) and (Li et al. 2022a), we use the same pre-training dataset with 14M images with texts, which includes two in-domain datasets (MS COCO (Lin et al. 2014) and Visual Genome (Krishna et al. 2016)), and three web out-domain datasets (Conceptual Captions (Sharma et al. 2018a), Conceptual 12M (Changpinyo et al. 2021a), SBU Captions (Ordonez, Kulkarni, and Berg 2011)). |
| Dataset Splits | Yes | We evaluated our models by submitting the results to the evaluation server and report the test-dev and test-std scores in Table 1. The fine-tuning hyper-parameters and the details of downstream tasks are described in Appendix D. Tables 1, 2, and 3 use standard splits like 'dev', 'test-dev', 'test-std', and 'COCO Karpathy test split'. |
| Hardware Specification | Yes | on 8 80G A100 |
| Software Dependencies | No | The paper mentions specific models and loss functions, but does not provide version numbers for any software dependencies like programming languages, frameworks, or libraries. |
| Experiment Setup | No | The paper states that 'The fine-tuning hyper-parameters and the details of downstream tasks are described in Appendix D' and 'Please refer to Appendix C to see more detail about the pre-training dataset and pre-training setting.' However, these specific details are not present in the main body of the text provided. |