LiveBot: Generating Live Video Comments Based on Visual and Textual Contexts

Authors: Shuming Ma, Lei Cui, Damai Dai, Furu Wei, Xu Sun

AAAI 2019, pp. 6810-6817 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this work, we construct a large-scale live comment dataset with 2,361 videos and 895,929 live comments. Then, we introduce two neural models to generate live comments based on the visual and textual contexts, which achieve better performance than previous neural baselines such as the sequence-to-sequence model. Finally, we provide a retrieval-based evaluation protocol for automatic live commenting where the model is asked to sort a set of candidate comments based on the log-likelihood score, and evaluated on metrics such as mean-reciprocal-rank. (A sketch of this ranking protocol appears after the table.)
Researcher Affiliation | Collaboration | Shuming Ma (1), Lei Cui (2), Damai Dai (1), Furu Wei (2), Xu Sun (1); (1) MOE Key Lab of Computational Linguistics, School of EECS, Peking University; (2) Microsoft Research Asia; {shumingma,daidamai,xusun}@pku.edu.cn; {lecu,fuwei}@microsoft.com
Pseudocode | No | The paper includes architectural diagrams (Figures 5 and 6) but no pseudocode or algorithm blocks.
Open Source Code | Yes | The datasets and the codes can be found at https://github.com/lancopku/livebot.
Open Datasets | Yes | We construct a large-scale live comment dataset with 2,361 videos and 895,929 comments from a popular Chinese video streaming website called Bilibili. ... The datasets and the codes can be found at https://github.com/lancopku/livebot.
Dataset Splits | Yes | To split the dataset into training, development and testing sets, we separate the live comments according to the corresponding videos. ... We split the data into 2,161, 100 and 100 videos in the training, testing and development sets, respectively. Finally, the numbers of live comments are 818,905, 42,405, and 34,609 in the training, testing, and development sets. (A video-level split sketch appears after the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU or GPU models) used for the experiments.
Software Dependencies | No | For the encoding CNN, we use a pretrained resnet with 18 layers provided by the Pytorch package. While PyTorch is mentioned, no specific version number is given, nor are any other software dependencies listed with versions. (A loading sketch for this encoder appears after the table.)
Experiment Setup | Yes | For both models, the batch size is 64, and the hidden dimension is 512. We use the Adam (Kingma and Ba 2014) optimization method to train our models. For the hyper-parameters of the Adam optimizer, we set the learning rate α = 0.0003, two momentum parameters β1 = 0.9 and β2 = 0.999 respectively, and ϵ = 1 × 10⁻⁸. (The corresponding optimizer configuration is sketched after the table.)
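
The Research Type row quotes the paper's retrieval-based evaluation protocol: the model sorts a set of candidate comments by log-likelihood and is scored with mean reciprocal rank. Below is a minimal sketch of that metric; the `score_fn` scoring interface and the toy data are assumptions for illustration, not the paper's implementation.

```python
# Sketch of the retrieval-based evaluation: rank candidates by model
# log-likelihood, then compute mean reciprocal rank (MRR) over examples.
# `score_fn` is a hypothetical stand-in for the model's log-likelihood scorer.

def rank_candidates(candidates, score_fn):
    """Sort candidate comments by descending log-likelihood."""
    return sorted(candidates, key=score_fn, reverse=True)

def mean_reciprocal_rank(ranked_lists, ground_truths):
    """MRR: average of 1/rank of the first correct candidate per example."""
    total = 0.0
    for ranked, truth in zip(ranked_lists, ground_truths):
        for rank, candidate in enumerate(ranked, start=1):
            if candidate in truth:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)

# Toy example: the correct comment is ranked 1st in one example, 2nd in the other.
ranked = [["good", "bad1", "bad2"], ["bad1", "good", "bad2"]]
truth = [{"good"}, {"good"}]
print(mean_reciprocal_rank(ranked, truth))  # (1/1 + 1/2) / 2 = 0.75
```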
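
The Dataset Splits row notes that comments are separated according to their source videos, so no video contributes comments to more than one split. A minimal sketch of such a video-level split follows; the 100/100 test and development video counts match the quoted split, while the shuffling strategy and the (video_id, comment) data layout are assumptions.

```python
# Sketch of a video-level split: shuffle video IDs, carve off test and dev
# videos, then assign each comment to the split of its source video.
import random

def split_by_video(video_ids, n_test=100, n_dev=100, seed=0):
    """Partition video IDs so each video falls in exactly one split."""
    ids = sorted(video_ids)
    random.Random(seed).shuffle(ids)
    test = set(ids[:n_test])
    dev = set(ids[n_test:n_test + n_dev])
    train = set(ids[n_test + n_dev:])
    return train, dev, test

def split_comments(comments, train, dev, test):
    """comments: iterable of (video_id, comment) pairs (assumed layout)."""
    buckets = {"train": [], "dev": [], "test": []}
    for vid, text in comments:
        if vid in train:
            buckets["train"].append(text)
        elif vid in dev:
            buckets["dev"].append(text)
        elif vid in test:
            buckets["test"].append(text)
    return buckets
```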
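
The Software Dependencies row quotes the use of a pretrained 18-layer ResNet from the PyTorch ecosystem as the encoding CNN. A minimal loading sketch with torchvision is below; replacing the classification head with an identity to expose 512-dimensional frame features is an assumption about how the encoder is used, not the paper's confirmed code.

```python
# Sketch: load a pretrained ResNet-18 from torchvision and use it as a
# frame feature extractor. Dropping the final fully connected layer to get
# 512-d pooled features is an assumed usage, not taken from the paper.
import torch
import torchvision.models as models

resnet18 = models.resnet18(pretrained=True)
resnet18.fc = torch.nn.Identity()  # keep the 512-d pooled features
resnet18.eval()

with torch.no_grad():
    frame = torch.randn(1, 3, 224, 224)  # one preprocessed video frame
    features = resnet18(frame)           # shape: (1, 512)
```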
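
Finally, the hyper-parameters quoted in the Experiment Setup row map directly onto PyTorch's Adam API. The sketch below wires in the quoted values; the placeholder model is hypothetical and stands in for either of the paper's two generation models.

```python
# Sketch: the quoted optimizer settings expressed with torch.optim.Adam.
# `model` is a placeholder, not the paper's actual architecture.
import torch

model = torch.nn.Linear(512, 512)  # hidden dimension 512 per the paper
optimizer = torch.optim.Adam(
    model.parameters(),
    lr=3e-4,             # alpha = 0.0003
    betas=(0.9, 0.999),  # beta1 = 0.9, beta2 = 0.999
    eps=1e-8,            # epsilon = 1 x 10^-8
)
# Training then proceeds with batches of size 64, per the quoted setup.
```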