Community-Based Question Answering via Heterogeneous Social Network Learning
Authors: Hanyin Fang, Fei Wu, Zhou Zhao, Xinyu Duan, Yueting Zhuang, Martin Ester
AAAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on a large-scale dataset from a real-world cQA site show that leveraging the heterogeneous social information indeed achieves better performance than other state-of-the-art cQA methods. |
| Researcher Affiliation | Academia | Hanyin Fang, Fei Wu, Zhou Zhao, Xinyu Duan, and Yueting Zhuang College of Computer Science, Zhejiang University and the Key Lab of Big Data Intelligent Computing of Zhejiang Province, China {fhy881229, wufei, zhaozhou, duanxinyu, yzhuang}@zju.edu.cn Martin Ester School of Computing Science, Simon Fraser University, Canada ester@cs.sfu.ca |
| Pseudocode | Yes | Algorithm 1 "Heterogeneous Social Network Learning for cQA". Input: heterogeneous social network G(V, E), walks per node n, max walk length t, number of iterations m. 1: Pre-train the user embedding matrix by DeepWalk; 2: for i = 1 to m do; 3: for j = 1 to n do; 4: O = shuffle(V); 5: for each v ∈ O do; 6: p = RandomWalk(G, v, t); 7: Calculate the loss for each node in p; 8: end for; 9: Accumulate the training loss in Equation (5); 10: Update parameters by SGD; 11: end for; 12: end for |
| Open Source Code | No | The paper states 'This cQA dataset used in our experiments will be released later.' referring to the dataset, but does not provide any explicit statement or link for the release of the source code for the described methodology. |
| Open Datasets | No | To empirically evaluate and validate our proposed framework HSNL, a dataset is built up by collecting the crowdsourced data from a popular high-quality community-based question answering system, Quora, and the social relation information from the famous social network site, Twitter... This cQA dataset used in our experiments will be released later. |
| Dataset Splits | Yes | The dataset is split into training set, validation set and testing set without overlapping in our experiments. The size of validation set is fixed as 10% to tune the hyperparameters and the size of training set varies from 20% to 80%. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running its experiments, such as exact GPU/CPU models, processor types, or memory specifications. |
| Software Dependencies | No | The paper mentions various models and algorithms like LSTM, DeepWalk, and AdaGrad, but does not specify any software libraries or frameworks with version numbers (e.g., 'PyTorch 1.9', 'TensorFlow 2.x') that would be necessary to replicate the experiment. |
| Experiment Setup | Yes | The paper describes specific hyper-parameters such as 'the hyper-parameter 0 < m < 1 controls the margin in training', 'the parameter α to scale the margin in the ranking based loss in Equation (4). The α is set to 1 when a is a mismatching answer or 0.5 r 2 (r++0.001) for a low-quality answer a', and 'λ > 0 is a hyper-parameter to trade-off the training loss and regularization'. It also mentions 'stochastic gradient descent (SGD) with the diagonal variant of AdaGrad' and 'ρ is the initial learning rate'. |
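The training loop in Algorithm 1 can be sketched in plain Python. This is a minimal illustration, not the paper's implementation: the graph is a hypothetical adjacency dict, and the surrogate loss (pulling each walk node toward its successor) stands in for the paper's Equation (5), which combines a ranking-based question-answer loss with the network-embedding loss; the DeepWalk pre-training step is assumed to have produced the initial `embeddings`.

```python
import random

def random_walk(graph, start, max_len):
    """Sample a truncated random walk of at most max_len nodes."""
    walk = [start]
    while len(walk) < max_len:
        neighbors = graph.get(walk[-1], [])
        if not neighbors:
            break
        walk.append(random.choice(neighbors))
    return walk

def train_hsnl(graph, embeddings, walks_per_node=2, max_walk_len=4,
               iterations=3, lr=0.1):
    """Toy version of Algorithm 1's outer loops with SGD updates.

    graph:      adjacency dict {node: [neighbors]} (hypothetical input)
    embeddings: dict {node: list[float]}, assumed DeepWalk pre-trained
    """
    for _ in range(iterations):              # m outer iterations
        for _ in range(walks_per_node):      # n walks per node
            order = list(graph)
            random.shuffle(order)            # O = shuffle(V)
            for v in order:
                walk = random_walk(graph, v, max_walk_len)
                # SGD step on a surrogate loss: move each walk node
                # toward its successor (stand-in for Equation (5))
                for a, b in zip(walk, walk[1:]):
                    grad = [ea - eb
                            for ea, eb in zip(embeddings[a], embeddings[b])]
                    embeddings[a] = [ea - lr * g
                                     for ea, g in zip(embeddings[a], grad)]
    return embeddings
```

In the actual method the inner loss would involve the LSTM-based question-answer matching described in the paper; the sketch only shows the walk-sampling and update schedule that the pseudocode fixes.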