Joint Learning on Relevant User Attributes in Micro-blog
Authors: Jingjing Wang, Shoushan Li, Guodong Zhou
IJCAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical studies demonstrate the effectiveness of our proposed approach to joint learning on relevant user attributes. |
| Researcher Affiliation | Academia | Natural Language Processing Lab, School of Computer Science and Technology, Soochow University, Suzhou, 215006, China. djingwang@gmail.com, {lishoushan, gdzhou}@suda.edu.cn |
| Pseudocode | No | No explicit pseudocode or algorithm blocks were found. The methodology is described using mathematical equations and textual explanations. |
| Open Source Code | No | No explicit statement about the release of their own source code (Aux-LSTM) was found. The provided links refer to third-party tools used in the research, not the authors' implementation. |
| Open Datasets | No | We collect our data set from Tencent Micro-blog, which is one of the most popular SNS websites in China. From this website, we crawl each user's homepage containing user information (e.g., name, profession, age, gender) and the posted messages. The paper describes building its own dataset from a public website but does not provide public access to the collected data. |
| Dataset Splits | Yes | For each kind of user attribute (i.e., profession, gender and age) classification task, we randomly split the users into a training set (80% of users) and a test set (20% of users). We also set aside 10% of the users from the training set as validation data, which is used to tune learning-algorithm parameters. A minimal split sketch is given after this table. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running experiments were mentioned in the paper. |
| Software Dependencies | No | No software dependencies with specific version numbers were mentioned; the paper names only general tools such as Adagrad and word2vec. |
| Experiment Setup | Yes | The dimensionality of the word vectors is set to 200 and the window size to 5. In the Aux-LSTM model, λ is set to 0.75 in order to reduce the influence of noisy information from the auxiliary task. All matrix and vector parameters are initialized from the uniform distribution on [-√(6/(r+c)), +√(6/(r+c))], where r and c are the numbers of rows and columns of the parameter matrix. To avoid over-fitting, the dropout strategy is used in both the LSTM layer and the auxiliary LSTM layer. Sketches of the embedding configuration and the initialization appear after this table. |
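For concreteness, here is a minimal sketch of the 80/20 train/test split with the 10% validation carve-out described in the Dataset Splits row. The function name, the random seed, and the rounding behavior are assumptions for illustration, not taken from the paper.

```python
import random

def split_users(users, seed=42):
    """Hypothetical helper mirroring the paper's split: 80% train / 20% test,
    then 10% of the training users set aside as validation.
    The seed and rounding are illustrative assumptions."""
    users = list(users)
    random.Random(seed).shuffle(users)
    n_test = int(0.2 * len(users))
    test, train = users[:n_test], users[n_test:]
    n_val = int(0.1 * len(train))
    val, train = train[:n_val], train[n_val:]
    return train, val, test
```

The embedding settings in the Experiment Setup row (200-dimensional vectors, window size 5) map directly onto word2vec's standard parameters. A sketch using the gensim library follows; gensim itself is an assumption, since the paper names the word2vec tool but not a specific implementation.

```python
from gensim.models import Word2Vec

# tokenized_messages: lists of tokens from the crawled micro-blog posts
# (placeholder data here; the real corpus is the authors' Tencent Micro-blog crawl)
tokenized_messages = [["hello", "world"], ["joint", "learning", "demo"]]

model = Word2Vec(
    sentences=tokenized_messages,
    vector_size=200,  # dimensionality of word vectors, per the paper
    window=5,         # context window size, per the paper
    min_count=1,      # assumption: keep all tokens for this toy corpus
)
vector = model.wv["hello"]  # a 200-dimensional embedding
```

The initialization interval is the familiar Glorot/Xavier uniform range. The sketch below reproduces it in NumPy, together with one possible reading of the λ-weighted joint objective; the paper's excerpt does not show how λ combines the main and auxiliary losses, so the convex combination here is purely an assumption.

```python
import numpy as np

def glorot_uniform(r, c, rng=None):
    """Draw an r x c matrix from U[-sqrt(6/(r+c)), sqrt(6/(r+c))],
    the initialization range stated in the experiment setup."""
    rng = rng or np.random.default_rng(0)
    bound = np.sqrt(6.0 / (r + c))
    return rng.uniform(-bound, bound, size=(r, c))

def joint_loss(main_loss, aux_loss, lam=0.75):
    """Assumed form of the joint objective: weighting the main task by
    lambda = 0.75 down-weights the (noisier) auxiliary task. The actual
    combination used in Aux-LSTM may differ."""
    return lam * main_loss + (1.0 - lam) * aux_loss
```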
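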
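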