Attentive Tensor Product Learning

Authors: Qiuyuan Huang, Li Deng, Dapeng Wu, Chang Liu, Xiaodong He

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The experimental results demonstrate the effectiveness of the proposed approach in all these three natural language processing tasks. Our evaluation shows that on both image captioning and POS tagging, our approach can outperform previous state-of-the-art approaches. We evaluate our approach with several baselines on the COCO dataset (COCO 2017).
Researcher Affiliation | Collaboration | Qiuyuan Huang, Microsoft Research, Redmond, WA, USA, qihua@microsoft.com; Li Deng, Citadel, USA, deng629@gmail.com; Dapeng Wu, University of Florida, Gainesville, FL, USA, dpwu@ieee.org; Chang Liu, Citadel Securities, Chicago, IL, USA, liuchang2005acm@gmail.com; Xiaodong He, JD AI Research, Beijing, China, xiaohe.ai@outlook.com
Pseudocode | No | The paper describes the architecture and mathematical formulas, but it does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statement about releasing open-source code or a link to a code repository for the methodology described.
Open Datasets | Yes | We evaluate our approach with several baselines on the COCO dataset (COCO 2017). The COCO dataset contains 123,287 images, each of which is annotated with at least 5 captions. We use the same pre-defined splits as (Karpathy and Fei-Fei 2015; Gan et al. 2017): 113,287 images for training, 5,000 images for validation, and 5,000 images for testing. We test it using the Penn TreeBank dataset (Marcus et al. 2017).
Dataset Splits | Yes | We use the same pre-defined splits as (Karpathy and Fei-Fei 2015; Gan et al. 2017): 113,287 images for training, 5,000 images for validation, and 5,000 images for testing.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies | No | The paper mentions that the model is implemented in 'TensorFlow (Abadi and others 2015)' and uses the 'Stanford GloVe algorithm', but does not specify version numbers for these or other software dependencies.
Experiment Setup | Yes | In our ATPL architecture, we choose d = 32, and the size of the LSTM hidden state to be 512. The vocabulary size V = 8,791. For the CNN of Fig. 2, we used ResNet-152 (He et al. 2016), pretrained on the ImageNet dataset. The image feature vector v has 2048 dimensions. The model is implemented in TensorFlow (Abadi and others 2015) with the default settings for random initialization and optimization by backpropagation.
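
The experiment-setup quote above fixes the key dimensions but no code was released. The sketch below only wires up the reported sizes (d = 32, LSTM hidden state 512, V = 8,791, ResNet-152 features of 2048 dimensions, Karpathy splits) in TensorFlow 2.x; it is an assumption-laden illustration, not the authors' implementation, and it omits ATPL's tensor-product binding entirely.

# Hypothetical sketch of the reported setup (not the authors' code).
# Assumes TensorFlow 2.x; the paper used the TF release current at the time
# with default random initialization and backpropagation settings.
import tensorflow as tf

D_TPR = 32            # role/filler dimension d reported in the paper
LSTM_UNITS = 512      # size of the LSTM hidden state
VOCAB_SIZE = 8791     # caption vocabulary size V
IMG_FEAT_DIM = 2048   # ResNet-152 image feature dimension
SPLITS = {"train": 113287, "val": 5000, "test": 5000}  # Karpathy splits

# Image encoder: ResNet-152 pretrained on ImageNet, global-average pooled
# to a 2048-dimensional feature vector v (include_top=False drops the classifier).
cnn = tf.keras.applications.ResNet152(
    include_top=False, weights="imagenet", pooling="avg")

def encode_image(images):
    """Map a batch of preprocessed images to 2048-d feature vectors."""
    return cnn(images, training=False)

# Caption decoder skeleton with the reported sizes. The tensor-product
# binding of roles and fillers that defines ATPL is omitted; this only
# instantiates the dimensions quoted in the experiment setup.
decoder = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB_SIZE, D_TPR),
    tf.keras.layers.LSTM(LSTM_UNITS, return_sequences=True),
    tf.keras.layers.Dense(VOCAB_SIZE),
])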