Dependent Multi-Task Learning with Causal Intervention for Image Captioning
Authors: Wenqing Chen, Jidong Tian, Caoyun Fan, Hao He, Yaohui Jin
IJCAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The extensive experiments show that our model outperforms the baseline models and achieves competitive performance with state-of-the-art models. |
| Researcher Affiliation | Academia | Wenqing Chen^{1,2}, Jidong Tian^{1,2}, Caoyun Fan^{1,2}, Hao He^{1,2} and Yaohui Jin^{1,2}; 1: MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University; 2: State Key Lab of Advanced Optical Communication System and Network, Shanghai Jiao Tong University |
| Pseudocode | No | The paper describes its model and approach in detail but does not include a dedicated pseudocode or algorithm block. |
| Open Source Code | No | The paper does not explicitly state that its source code for the described methodology is available. It mentions 'publicly released code' for metrics and 'officially released codes' for reproducing other models, but not its own. |
| Open Datasets | Yes | We experiment on the MSCOCO dataset, which is the most popular dataset for image captioning. The original dataset contains about 82,783 training images and 40,504 validation images. |
| Dataset Splits | Yes | Following most of the previous work, we first evaluate our model on the Karpathy data split [Karpathy and Li, 2015] with 5,000 images for validation, 5,000 images for testing, and the rest for training. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU types) used to run its experiments. |
| Software Dependencies | No | The paper mentions using 'publicly released code' for metrics but does not provide specific version numbers for any software dependencies or libraries used in its own implementation. |
| Experiment Setup | Yes | The learning rate is initialized to 0.0001 and decreased by half when the CIDEr-D score does not increase in 2 epochs, with the minimum learning rate set to 5e-6. The batch size is set to 50. The model is firstly optimized with MLE for 30 epochs (Gumbel sampling f_m after 15 epochs), and then optimized with MARL for another 35 epochs. |
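
The Experiment Setup row maps onto a standard plateau-based learning-rate policy. Below is a minimal, hypothetical PyTorch sketch of that schedule, not the authors' implementation (the paper releases no code); the model, `train_mle_epoch`, `train_marl_epoch`, and `evaluate_cider` are placeholders. It assumes an optimizer starting at 1e-4, the rate halved when validation CIDEr-D fails to improve for 2 epochs with a floor of 5e-6, 30 MLE epochs followed by 35 MARL epochs, and Gumbel sampling enabled after epoch 15.

```python
# Hypothetical sketch of the reported training schedule; not the authors' code.
import torch
import torch.nn as nn

model = nn.Linear(10, 10)                                   # placeholder captioning model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)   # initial LR 0.0001
# Halve the LR when validation CIDEr-D has not improved for 2 epochs,
# never going below 5e-6 (mode="max" because higher CIDEr-D is better).
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="max", factor=0.5, patience=2, min_lr=5e-6)

def evaluate_cider(model):
    """Placeholder: would compute CIDEr-D on the validation split."""
    return 1.0

def train_mle_epoch(model, optimizer, use_gumbel):
    """Placeholder: one epoch of MLE training (batch size 50 in the paper)."""

def train_marl_epoch(model, optimizer):
    """Placeholder: one epoch of MARL optimization."""

# Phase 1: 30 epochs of MLE; Gumbel sampling is switched on after epoch 15.
for epoch in range(30):
    train_mle_epoch(model, optimizer, use_gumbel=(epoch >= 15))
    scheduler.step(evaluate_cider(model))

# Phase 2: 35 further epochs optimized with MARL, same plateau schedule.
for epoch in range(35):
    train_marl_epoch(model, optimizer)
    scheduler.step(evaluate_cider(model))
```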