Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Data Augmented Graph Neural Networks for Personality Detection
Authors: Yangfu Zhu, Yue Xia, Meiling Li, Tingting Zhang, Bin Wu
AAAI 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on three real-world datasets, Youtube, PAN2015, and My Personality, demonstrate the effectiveness of our Semi-Per GCN in personality detection, especially in scenarios with limited labeled users. |
| Researcher Affiliation | Academia | Beijing University of Posts and Telecommunications, Beijing, China (author email addresses redacted) |
| Pseudocode | No | The paper describes the model architecture and equations (e.g., X^{(k+1)} = σ(A X^{(k)} W^{(k)}), L = L_d + λL_c) but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any statement about releasing source code or providing a link to a code repository for the described methodology. |
| Open Datasets | Yes | We conduct experiments on the Youtube Personality (Biel et al. 2013), PAN2015 (Rangel Pardo et al. 2015), and Mypersonality datasets (Celli et al. 2013; Xue et al. 2018) with Big Five taxonomy. |
| Dataset Splits | Yes | All the hyperparameters are tuned over the validation set to obtain the optimized results. |
| Hardware Specification | Yes | We use Pytorch to implement all the deep learning models on our three 2080Ti GPU cards. |
| Software Dependencies | No | The paper mentions using 'Pytorch' and 'bert-base-cased' but does not specify version numbers for these software dependencies or any other libraries. |
| Experiment Setup | Yes | Empirically, we use a batch size of 16, 16, and 64 for the labeled data and a batch size of 32, 32, and 112 for the unlabeled data in the Youtube, PAN2015, and My Personality datasets respectively. Adam is utilized as the optimizer and the learning rate of our model is set to 0.0001, 0.0003, and 0.0003 in the PAN2015, Youtube, and My Personality datasets respectively. The pre-trained language model BERT is employed to initialize the word node embeddings via bert-base-cased (Devlin et al. 2018), and the dimensions of word nodes, LIWC nodes, and user nodes are set to 200. |
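The propagation rule quoted under "Pseudocode" above, X^{(k+1)} = σ(A X^{(k)} W^{(k)}) with combined loss L = L_d + λL_c, can be sketched in a few lines. This is a minimal NumPy illustration of a generic GCN layer, not the authors' implementation; the function names, ReLU choice for σ, and toy shapes are assumptions for demonstration only.

```python
import numpy as np

def gcn_layer(A, X, W):
    """One GCN propagation step: X_{k+1} = sigma(A X_k W_k), using ReLU as sigma.
    A: normalized adjacency (n, n); X: node features (n, d_in); W: weights (d_in, d_out)."""
    return np.maximum(A @ X @ W, 0.0)

def total_loss(loss_d, loss_c, lam):
    """Combined objective from the paper's notation: L = L_d + lambda * L_c."""
    return loss_d + lam * loss_c

# Toy example: 4 nodes, 8 input features, 5 output features.
rng = np.random.default_rng(0)
n, d_in, d_out = 4, 8, 5
A = np.eye(n)  # identity adjacency as a stand-in for the normalized graph
X = rng.standard_normal((n, d_in))
W = rng.standard_normal((d_in, d_out))
H = gcn_layer(A, X, W)  # shape (4, 5), all entries >= 0 after ReLU
```

The stacked layers propagate information over the word/LIWC/user graph; the actual normalization of A and the definitions of L_d and L_c are in the paper.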
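The hyperparameters quoted in the "Experiment Setup" row can be collected into a single configuration table for reference. The dictionary structure and key names below are my own illustration; the numeric values are taken directly from the quoted setup.

```python
# Per-dataset hyperparameters as reported in the paper's experiment setup.
# Structure and key names are illustrative; values come from the quoted text.
EXPERIMENT_SETUP = {
    "optimizer": "Adam",
    "embedding_init": "bert-base-cased",   # BERT used to initialize word node embeddings
    "node_dim": 200,                       # word, LIWC, and user node dimensions
    "Youtube":       {"lr": 3e-4, "batch_labeled": 16, "batch_unlabeled": 32},
    "PAN2015":       {"lr": 1e-4, "batch_labeled": 16, "batch_unlabeled": 32},
    "MyPersonality": {"lr": 3e-4, "batch_labeled": 64, "batch_unlabeled": 112},
}
```

Note that the paper lists batch sizes in Youtube/PAN2015/MyPersonality order but learning rates in PAN2015/Youtube/MyPersonality order; the table above aligns both per dataset.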