TIME: Text and Image Mutual-Translation Adversarial Networks

Authors: Bingchen Liu, Kunpeng Song, Yizhe Zhu, Gerard de Melo, Ahmed Elgammal

AAAI 2021, pp. 2082-2090 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In our experiments, TIME achieves state-of-the-art (SOTA) performance on the CUB dataset (Inception Score of 4.91 and Fréchet Inception Distance of 14.3) and shows promising performance on the MS-COCO dataset for image captioning and downstream vision-language tasks. Experiments are conducted on two datasets: CUB (Welinder et al. 2010) and MS-COCO (Lin et al. 2014). We follow the same convention as in previous T2I works to split the training/testing set. We benchmark image quality by the Inception Score (IS) (Salimans et al. 2016) and Fréchet Inception Distance (FID) (Heusel et al. 2017), and measure image-text consistency by R-precision (Xu et al. 2018) and SOA-C (Hinz, Heinrich, and Wermter 2019).
Researcher Affiliation | Academia | Bingchen Liu, Kunpeng Song, Yizhe Zhu, Gerard de Melo, Ahmed Elgammal; Department of Computer Science, Rutgers University; {bingchen.liu, kunpeng.song, yizhe.zhu}@rutgers.edu, gdm@demelo.org, elgammal@cs.rutgers.edu
Pseudocode | No | The paper contains no structured pseudocode or algorithm blocks; it provides mathematical equations but no pseudocode.
Open Source Code | No | The paper mentions an 'online appendix' for further details but provides neither a concrete access link nor an explicit statement that the source code for the described methodology is released.
Open Datasets | Yes | Experiments are conducted on two datasets: CUB (Welinder et al. 2010) and MS-COCO (Lin et al. 2014).
Dataset Splits | No | The paper states 'We follow the same convention as in previous T2I works to split the training/testing set,' and its references to 'early-stop' and 'late-begin' imply a validation set, but it does not specify the splits (e.g., percentages or counts) needed for reproducibility.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for its experiments; it mentions training for 600 epochs but names no hardware.
Software Dependencies | No | The paper does not name ancillary software dependencies, such as libraries or solvers with version numbers.
Experiment Setup | No | The paper mentions training for '600 epochs' and describes an 'annealing conditional hinge loss,' but it does not report hyperparameters such as learning rate, batch size, or optimizer settings for the overall experimental setup.
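For reference, the FID metric cited above compares Gaussian fits (mean and covariance) of Inception-network activations for real versus generated images, via the Fréchet distance ||mu1 - mu2||^2 + Tr(S1 + S2 - 2(S1 S2)^{1/2}). A minimal numpy-only sketch is below; it assumes the activation matrices have already been extracted (the standard metric uses Inception-v3 pooling features), and the function name `fid_from_activations` is illustrative, not from the paper.

```python
import numpy as np

def fid_from_activations(act1: np.ndarray, act2: np.ndarray) -> float:
    """Fréchet distance between Gaussians fitted to two activation sets.

    act1, act2: arrays of shape (n_samples, n_features), e.g. Inception
    pool features of real and generated images respectively.
    """
    mu1, mu2 = act1.mean(axis=0), act2.mean(axis=0)
    s1 = np.cov(act1, rowvar=False)
    s2 = np.cov(act2, rowvar=False)
    diff = mu1 - mu2
    # Tr((S1 S2)^(1/2)) via eigenvalues of the product; clip tiny
    # negative real parts caused by numerical noise.
    eigvals = np.linalg.eigvals(s1 @ s2)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(s1 + s2) - 2.0 * tr_sqrt)
```

Identical activation sets give a distance of zero; shifting one set's mean inflates the squared-difference term, which is why FID is sensitive to both fidelity and mode coverage.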