LiveBot: Generating Live Video Comments Based on Visual and Textual Contexts

Authors: Shuming Ma, Lei Cui, Damai Dai, Furu Wei, Xu Sun

AAAI 2019, pp. 6810-6817 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this work, we construct a large-scale live comment dataset with 2,361 videos and 895,929 live comments. Then, we introduce two neural models to generate live comments based on the visual and textual contexts, which achieve better performance than previous neural baselines such as the sequence-to-sequence model. Finally, we provide a retrieval-based evaluation protocol for automatic live commenting where the model is asked to sort a set of candidate comments based on the log-likelihood score, and evaluated on metrics such as mean-reciprocal-rank. (A sketch of this ranking protocol appears after the table.)
Researcher Affiliation | Collaboration | Shuming Ma (1), Lei Cui (2), Damai Dai (1), Furu Wei (2), Xu Sun (1); (1) MOE Key Lab of Computational Linguistics, School of EECS, Peking University; (2) Microsoft Research Asia; {shumingma,daidamai,xusun}@pku.edu.cn; {lecu,fuwei}@microsoft.com
Pseudocode | No | The paper includes architectural diagrams (Figures 5 and 6) but no pseudocode or algorithm blocks.
Open Source Code | Yes | The datasets and the codes can be found at https://github.com/lancopku/livebot.
Open Datasets | Yes | We construct a large-scale live comment dataset with 2,361 videos and 895,929 comments from a popular Chinese video streaming website called Bilibili. ... The datasets and the codes can be found at https://github.com/lancopku/livebot.
Dataset Splits | Yes | To split the dataset into training, development and testing sets, we separate the live comments according to the corresponding videos. ... We split the data into 2,161, 100 and 100 videos in the training, testing and development sets, respectively. Finally, the numbers of live comments are 818,905, 42,405, and 34,609 in the training, testing, and development sets. (A video-level split sketch appears after the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU or GPU models) used for the experiments.
Software Dependencies | No | For the encoding CNN, we use a pretrained resnet with 18 layers provided by the Pytorch package. While PyTorch is mentioned, no specific version number is given, nor are any other software dependencies listed with versions. (A loading sketch for this encoder appears after the table.)
Experiment Setup | Yes | For both models, the batch size is 64, and the hidden dimension is 512. We use the Adam (Kingma and Ba 2014) optimization method to train our models. For the hyper-parameters of the Adam optimizer, we set the learning rate α = 0.0003, two momentum parameters β1 = 0.9 and β2 = 0.999 respectively, and ϵ = 1 × 10⁻⁸. (The corresponding optimizer configuration is sketched after the table.)
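
The Research Type row quotes the paper's retrieval-based evaluation protocol: the model sorts a set of candidate comments by log-likelihood and is scored with mean reciprocal rank. Below is a minimal sketch of that metric; the `score_fn` scoring interface and the toy data are assumptions for illustration, not the paper's implementation.

```python
# Sketch of the retrieval-based evaluation: rank candidates by model
# log-likelihood, then compute mean reciprocal rank (MRR) over examples.
# `score_fn` is a hypothetical stand-in for the model's log-likelihood scorer.

def rank_candidates(candidates, score_fn):
    """Sort candidate comments by descending log-likelihood."""
    return sorted(candidates, key=score_fn, reverse=True)

def mean_reciprocal_rank(ranked_lists, ground_truths):
    """MRR: average of 1/rank of the first correct candidate per example."""
    total = 0.0
    for ranked, truth in zip(ranked_lists, ground_truths):
        for rank, candidate in enumerate(ranked, start=1):
            if candidate in truth:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)

# Toy example: the correct comment is ranked 1st in one example, 2nd in the other.
ranked = [["good", "bad1", "bad2"], ["bad1", "good", "bad2"]]
truth = [{"good"}, {"good"}]
print(mean_reciprocal_rank(ranked, truth))  # (1/1 + 1/2) / 2 = 0.75
```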
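
The Dataset Splits row notes that comments are separated according to their source videos, so no video contributes comments to more than one split. A minimal sketch of such a video-level split follows; the 100/100 test and development video counts match the quoted split, while the shuffling strategy and the (video_id, comment) data layout are assumptions.

```python
# Sketch of a video-level split: shuffle video IDs, carve off test and dev
# videos, then assign each comment to the split of its source video.
import random

def split_by_video(video_ids, n_test=100, n_dev=100, seed=0):
    """Partition video IDs so each video falls in exactly one split."""
    ids = sorted(video_ids)
    random.Random(seed).shuffle(ids)
    test = set(ids[:n_test])
    dev = set(ids[n_test:n_test + n_dev])
    train = set(ids[n_test + n_dev:])
    return train, dev, test

def split_comments(comments, train, dev, test):
    """comments: iterable of (video_id, comment) pairs (assumed layout)."""
    buckets = {"train": [], "dev": [], "test": []}
    for vid, text in comments:
        if vid in train:
            buckets["train"].append(text)
        elif vid in dev:
            buckets["dev"].append(text)
        elif vid in test:
            buckets["test"].append(text)
    return buckets
```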
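
The Software Dependencies row quotes the use of a pretrained 18-layer ResNet from the PyTorch ecosystem as the encoding CNN. A minimal loading sketch with torchvision is below; replacing the classification head with an identity to expose 512-dimensional frame features is an assumption about how the encoder is used, not the paper's confirmed code.

```python
# Sketch: load a pretrained ResNet-18 from torchvision and use it as a
# frame feature extractor. Dropping the final fully connected layer to get
# 512-d pooled features is an assumed usage, not taken from the paper.
import torch
import torchvision.models as models

resnet18 = models.resnet18(pretrained=True)
resnet18.fc = torch.nn.Identity()  # keep the 512-d pooled features
resnet18.eval()

with torch.no_grad():
    frame = torch.randn(1, 3, 224, 224)  # one preprocessed video frame
    features = resnet18(frame)           # shape: (1, 512)
```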
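
Finally, the hyper-parameters quoted in the Experiment Setup row map directly onto PyTorch's Adam API. The sketch below wires in the quoted values; the placeholder model is hypothetical and stands in for either of the paper's two generation models.

```python
# Sketch: the quoted optimizer settings expressed with torch.optim.Adam.
# `model` is a placeholder, not the paper's actual architecture.
import torch

model = torch.nn.Linear(512, 512)  # hidden dimension 512 per the paper
optimizer = torch.optim.Adam(
    model.parameters(),
    lr=3e-4,             # alpha = 0.0003
    betas=(0.9, 0.999),  # beta1 = 0.9, beta2 = 0.999
    eps=1e-8,            # epsilon = 1 x 10^-8
)
# Training then proceeds with batches of size 64, per the quoted setup.
```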