Learning from My Friends: Few-Shot Personalized Conversation Systems via Social Networks
Authors: Zhiliang Tian, Wei Bi, Zihan Zhang, Dongkyu Lee, Yiping Song, Nevin L. Zhang
AAAI 2021, pages 13907–13915 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show our methods outperform all baselines in appropriateness, diversity, and consistency with speakers. The results of all competing methods on automatic metrics are shown in Table 1. |
| Researcher Affiliation | Collaboration | 1Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR, China 2Tencent AI Lab, Shenzhen, China 3National University of Defense Technology, Changsha, China 4HKUST Xiao-i Robot Joint Lab, Hong Kong SAR, China |
| Pseudocode | Yes | Algorithm 1: Training Algorithm |
| Open Source Code | No | We release the code of dataset construction. (The provided link, github.com/tianzhiliang/FewShotPersonaConvData, is explicitly for dataset construction, not the model's methodology code.) |
| Open Datasets | Yes | We collect the dataset from Weibo, an online chatting forum with social networks. ... We release the code of dataset construction: github.com/tianzhiliang/FewShotPersonaConvData |
| Dataset Splits | Yes | We use 28.9K speakers with 2.02M samples for training, 1K speakers with 20K samples for testing, and 0.5K speakers with 10K samples for validation. |
| Hardware Specification | No | No specific hardware details (like GPU/CPU models or specific machine configurations) were mentioned for running experiments. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., library names with explicit versions) were mentioned. |
| Experiment Setup | Yes | Seq2Seq follows Song et al. 2018, where the embedding and hidden dimensions are 620 and 1000. For the transformer-based model, we implement it as the original one (Vaswani et al. 2017), where the model dimension is 512, the stacked layer number is 6, and the head number is 8. ... we used SGD for the inner loop and Adam for the outer loop with learning rates α = 0.01 and β = 0.0003, respectively. For all methods, the batch size in training is 128. The vocabulary contains the top 50k frequent tokens, and the maximum length of input queries and responses is 80. |
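
To make the reported setup concrete, below is a minimal PyTorch-style sketch of an inner/outer (MAML-style) training step using the hyperparameters quoted above: SGD with α = 0.01 for the inner loop, Adam with β = 0.0003 for the outer loop, batch size 128, a 50k vocabulary, and sequences up to length 80. The model, loss function, task batching, and the first-order gradient approximation are assumptions for illustration; this is not a reproduction of the paper's Algorithm 1 or its architecture.

```python
import copy
import torch

# Hyperparameters reported in the experiment setup (see table above).
INNER_LR = 0.01      # alpha: SGD learning rate for the inner loop
OUTER_LR = 0.0003    # beta: Adam learning rate for the outer loop
BATCH_SIZE = 128
VOCAB_SIZE = 50_000  # top 50k frequent tokens
MAX_LEN = 80         # maximum length of input queries and responses

# Architecture sizes quoted in the table (names here are illustrative).
transformer_cfg = dict(d_model=512, num_layers=6, num_heads=8)   # Vaswani et al. 2017
seq2seq_cfg = dict(embedding_dim=620, hidden_dim=1000)           # Song et al. 2018


def outer_step(model, meta_optimizer, tasks, loss_fn, inner_steps=1):
    """One first-order MAML-style outer update over per-speaker tasks.

    `tasks` is assumed to be a list of (support_batch, query_batch) pairs,
    one per speaker, where each batch is an (inputs, targets) tuple.
    This is a generic meta-learning sketch, not the paper's Algorithm 1.
    """
    meta_optimizer.zero_grad()
    total_query_loss = 0.0

    for support_batch, query_batch in tasks:
        # Inner loop: adapt a copy of the model to this speaker with SGD.
        adapted = copy.deepcopy(model)
        inner_opt = torch.optim.SGD(adapted.parameters(), lr=INNER_LR)
        for _ in range(inner_steps):
            inputs, targets = support_batch
            inner_opt.zero_grad()
            loss_fn(adapted(inputs), targets).backward()
            inner_opt.step()

        # Outer loop: evaluate the adapted model on the query set and
        # accumulate (first-order) gradients into the meta parameters.
        inputs, targets = query_batch
        query_loss = loss_fn(adapted(inputs), targets)
        grads = torch.autograd.grad(query_loss, list(adapted.parameters()))
        for p, g in zip(model.parameters(), grads):
            p.grad = g.detach() if p.grad is None else p.grad + g.detach()
        total_query_loss += query_loss.item()

    meta_optimizer.step()
    return total_query_loss / max(len(tasks), 1)


# Usage sketch (the generator model itself is omitted):
# model = build_generator(**transformer_cfg)              # hypothetical builder
# meta_opt = torch.optim.Adam(model.parameters(), lr=OUTER_LR)
# avg_loss = outer_step(model, meta_opt, speaker_tasks, torch.nn.CrossEntropyLoss())
```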