Low-Resource Personal Attribute Prediction from Conversations

Authors: Yinan Liu, Hu Chen, Wei Shen, Jiaoyan Chen

AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The extensive experimental results show that PEARL outperforms all the baseline methods, not only on the task of personal attribute prediction from conversations over two data sets, but also on the more general weakly supervised text classification task over one data set.
Researcher Affiliation | Academia | Yinan Liu (1), Hu Chen (1), Wei Shen (1), Jiaoyan Chen (2). (1) TKLNDST, College of Computer Science, Nankai University, Tianjin 300350, China; (2) Department of Computer Science, The University of Manchester. {liuyn,2120210473}@mail.nankai.edu.cn, shenwei@nankai.edu.cn, jiaoyan.chen@manchester.ac.uk
Pseudocode | Yes | Algorithm 1: Iterative BS-based Gibbs sampling process
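The paper's Algorithm 1 is an iterative bootstrapping-based Gibbs sampling process. As a reminder of the general shape of a Gibbs sampling loop (alternately resampling each variable from its conditional distribution), here is a minimal sketch for a toy bivariate normal target; it is purely illustrative and is not the paper's algorithm.

```python
import random

def gibbs_bivariate_normal(rho, n_iters, seed=0):
    """Generic Gibbs sampler for a standard bivariate normal with
    correlation rho: each conditional is N(rho * other, 1 - rho**2).
    Illustrative sketch only -- not the paper's iterative BS-based sampler."""
    rng = random.Random(seed)
    sd = (1.0 - rho ** 2) ** 0.5
    x, y = 0.0, 0.0
    samples = []
    for _ in range(n_iters):
        x = rng.gauss(rho * y, sd)   # resample x given current y
        y = rng.gauss(rho * x, sd)   # resample y given updated x
        samples.append((x, y))
    return samples

samples = gibbs_bivariate_normal(rho=0.8, n_iters=5000)
xs = [x for x, _ in samples]
print(len(samples))
```

In the paper, the resampled variables are instead the latent assignments used in the bootstrapping process, but the alternating-conditional structure of the loop is the same.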
Open Source Code | Yes | The source code and data sets used in this paper are publicly available at https://github.com/CodingPerson/PEARL.
Open Datasets | Yes | For the task of personal attribute prediction from conversations, we perform experiments over two public data sets: (1) the profession data set; (2) the hobby data set. These two data sets are extracted from publicly available Reddit submissions and comments (2006–2018), and are annotated and provided by the authors of (Tigunova et al. 2020). The source code and data sets used in this paper are publicly available at https://github.com/CodingPerson/PEARL.
Dataset Splits | No | The paper states that PEARL operates in a "low-resource setting in which no labeled utterances or external data are utilized" and learns directly from unlabeled utterances. While some baselines mention "ten-fold cross-validation," explicit training/validation splits for PEARL's setup are not provided.
Hardware Specification | No | The paper mentions that "The experiments are implemented by MindSpore framework" but does not provide any specific hardware details, such as GPU models, CPU types, or memory, used for running the experiments.
Software Dependencies | No | The paper states that the "BERT base-uncased model is adopted as the PLM" and that "The experiments are implemented by MindSpore framework," but it does not provide specific version numbers for these software components.
Experiment Setup | Yes | The threshold η, the Dirichlet prior β, the number of keywords K for each utterance, the degree of freedom λ, and the numbers of iterations E and T are set to 75%, 0.01, 60, 1, 20, and 50, respectively. The Dirichlet prior α is set to 50/g, where g is the number of attribute values. The BERT base-uncased model is adopted as the PLM. The number of attribute-related words for each attribute value is set to a minimum of 10 and a maximum of 40.
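The reported hyperparameters can be collected into a single configuration sketch. All variable names below are illustrative (they do not come from the authors' code), and the number of attribute values g is a placeholder that depends on which data set is used.

```python
# Hyperparameters as reported in the paper's experiment setup.
# Names are illustrative, not taken from the authors' released code.
G = 10  # g: number of attribute values -- placeholder, data-set dependent

config = {
    "eta": 0.75,                   # threshold η (75%)
    "beta": 0.01,                  # Dirichlet prior β
    "K": 60,                       # keywords extracted per utterance
    "lambda": 1,                   # degree of freedom λ
    "E": 20,                       # number of iterations E
    "T": 50,                       # number of iterations T
    "plm": "bert-base-uncased",    # pre-trained language model
    "min_attr_words": 10,          # attribute-related words per value (min)
    "max_attr_words": 40,          # attribute-related words per value (max)
}
config["alpha"] = 50 / G           # Dirichlet prior α = 50/g

print(config["alpha"])
```

With the placeholder g = 10, α evaluates to 5.0; for the real data sets, g would be the number of professions or hobbies, respectively.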