Aspect-Aware Multimodal Summarization for Chinese E-Commerce Products
Authors: Haoran Li, Peng Yuan, Song Xu, Youzheng Wu, Xiaodong He, Bowen Zhou
AAAI 2020, pp. 8188–8195 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results on this dataset demonstrate that our models significantly outperform the comparative methods in terms of both the ROUGE score and manual evaluations. |
| Researcher Affiliation | Industry | JD AI Research {lihaoran24, yuanpeng29, xusong28, wuyouzheng1, xiaodong.he, bowen.zhou}@jd.com |
| Pseudocode | No | The paper describes methods using mathematical equations and textual descriptions, but does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our dataset and code are available at https://github.com/hrlinlp/cepsum |
| Open Datasets | Yes | We construct CEPSUM, a Chinese E-commerce Product SUMmarization dataset that contains approximately 1.4 million manually created product summaries that are paired with detailed product information, including an image, a title, and other textual descriptions for each product. Our dataset and code are available at https://github.com/hrlinlp/cepsum |
| Dataset Splits | Yes | Table 2 (corpus statistics) reports Home Appliances with 437,646 / 10,000 / 10,000 train/valid/test examples, Clothing with 790,297 / 10,000 / 10,000, and Cases&Bags with 97,510 / 5,000 / 5,000. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, memory, or processing speeds used for running experiments. |
| Software Dependencies | No | The paper mentions 'ROUGE-1.5.5 toolkit' but does not provide other specific software dependencies or library versions (e.g., Python, PyTorch, TensorFlow, CUDA versions) needed to replicate the experiment. |
| Experiment Setup | Yes | We set the sizes of the word embedding and the LSTM hidden state to 300 and 512, respectively. We set the initial learning rate for Adam to 5 × 10⁻⁴. The mini-batch size is set to 16. During training, we test ROUGE-2 (Lin 2004) F1 score and perplexity using the development set every 5,000 batches, and we halve the learning rate if the model's ROUGE-2 score drops for 7 consecutive tests. We first train our models without coverage until they converge using an early stopping strategy, and then we add the coverage mechanism to further train the models. During testing, we use beam search with a beam size of 10 to generate the summaries, and character-based trigram repetition avoidance (Paulus, Xiong, and Socher 2018) is applied. |
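
The Experiment Setup row above spells out a concrete training schedule: Adam at 5 × 10⁻⁴, batch size 16, dev-set ROUGE-2 checks every 5,000 batches, and learning-rate halving after 7 consecutive drops. Below is a minimal sketch of that schedule, assuming a PyTorch-style training loop; `model`, `train_batches`, `dev_set`, and `compute_rouge2` are hypothetical placeholders, not interfaces from the authors' released code.

```python
import torch

# Hyperparameters reported in the paper's experiment setup.
EMBED_SIZE = 300    # word embedding size
HIDDEN_SIZE = 512   # LSTM hidden state size
BATCH_SIZE = 16
INIT_LR = 5e-4
EVAL_EVERY = 5000   # evaluate on the dev set every 5,000 batches
PATIENCE = 7        # halve LR after 7 consecutive ROUGE-2 drops
BEAM_SIZE = 10      # beam size at test time (not used in this loop)


def train(model, train_batches, dev_set, compute_rouge2):
    """Sketch of the ROUGE-driven LR-halving schedule described above.

    `model`, `train_batches`, `dev_set`, and `compute_rouge2` are
    hypothetical placeholders, not the authors' actual code.
    """
    optimizer = torch.optim.Adam(model.parameters(), lr=INIT_LR)
    best_rouge2, drops = float("-inf"), 0

    for step, batch in enumerate(train_batches, start=1):
        optimizer.zero_grad()
        loss = model(batch)  # assumed to return the training loss
        loss.backward()
        optimizer.step()

        if step % EVAL_EVERY == 0:
            rouge2 = compute_rouge2(model, dev_set)
            if rouge2 < best_rouge2:
                drops += 1
            else:
                best_rouge2, drops = rouge2, 0
            if drops >= PATIENCE:
                # Halve the learning rate after 7 consecutive drops.
                for group in optimizer.param_groups:
                    group["lr"] /= 2
                drops = 0
```

Halving the learning rate on a validation-metric plateau (here driven by ROUGE-2 rather than loss) is the schedule the setup describes; the early stopping criterion and the subsequent coverage fine-tuning stage would sit on top of a loop like this.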