Community-Based Question Answering via Heterogeneous Social Network Learning
Authors: Hanyin Fang, Fei Wu, Zhou Zhao, Xinyu Duan, Yueting Zhuang, Martin Ester
AAAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on a large-scale dataset from a real-world cQA site show that leveraging the heterogeneous social information indeed achieves better performance than other state-of-the-art cQA methods. |
| Researcher Affiliation | Academia | Hanyin Fang, Fei Wu, Zhou Zhao, Xinyu Duan, and Yueting Zhuang College of Computer Science, Zhejiang University and the Key Lab of Big Data Intelligent Computing of Zhejiang Province, China {fhy881229, wufei, zhaozhou, duanxinyu, yzhuang}@zju.edu.cn Martin Ester School of Computing Science, Simon Fraser University, Canada ester@cs.sfu.ca |
| Pseudocode | Yes | Algorithm 1 "Heterogeneous Social Network Learning for cQA". Input: heterogeneous social network G(V, E), walks per node n, max walk length t, number of iterations m. 1: Pre-train the user embedding matrix by DeepWalk; 2: for i = 1 to m do; 3: for j = 1 to n do; 4: O = shuffle(V); 5: for each v ∈ O do; 6: p = RandomWalk(G, v, t); 7: Calculate the loss for each node in p; 8: end for; 9: Accumulate the training loss in Equation (5); 10: Update parameters by SGD; 11: end for; 12: end for |
| Open Source Code | No | The paper states 'This cQA dataset used in our experiments will be released later.' referring to the dataset, but does not provide any explicit statement or link for the release of the source code for the described methodology. |
| Open Datasets | No | To empirically evaluate and validate our proposed framework HSNL, a dataset is built up by collecting the crowdsourced data from a popular high-quality community-based question answering system, Quora, and the social relation information from the famous social network site, Twitter... This cQA dataset used in our experiments will be released later. |
| Dataset Splits | Yes | The dataset is split into training set, validation set and testing set without overlapping in our experiments. The size of validation set is fixed as 10% to tune the hyperparameters and the size of training set varies from 20% to 80%. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running its experiments, such as exact GPU/CPU models, processor types, or memory specifications. |
| Software Dependencies | No | The paper mentions various models and algorithms like LSTM, DeepWalk, and AdaGrad, but does not specify any software libraries or frameworks with version numbers (e.g., 'PyTorch 1.9', 'TensorFlow 2.x') that would be necessary to replicate the experiment. |
| Experiment Setup | Yes | The paper describes specific hyper-parameters such as 'the hyper-parameter 0 < m < 1 controls the margin in training', 'the parameter α to scale the margin in the ranking based loss in Equation (4). The α is set to 1 when a is a mismatching answer or 0.5 r 2 (r++0.001) for a low-quality answer a', and 'λ > 0 is a hyper-parameter to trade-off the training loss and regularization'. It also mentions 'stochastic gradient descent (SGD) with the diagonal variant of AdaGrad' and 'ρ is the initial learning rate'. |
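The training loop in Algorithm 1 can be sketched in plain Python. This is a minimal illustration, not the paper's implementation: the graph is a hypothetical adjacency dict, and the surrogate loss (pulling each walk node toward its successor) stands in for the paper's Equation (5), which combines a ranking-based question-answer loss with the network-embedding loss; the DeepWalk pre-training step is assumed to have produced the initial `embeddings`.

```python
import random

def random_walk(graph, start, max_len):
    """Sample a truncated random walk of at most max_len nodes."""
    walk = [start]
    while len(walk) < max_len:
        neighbors = graph.get(walk[-1], [])
        if not neighbors:
            break
        walk.append(random.choice(neighbors))
    return walk

def train_hsnl(graph, embeddings, walks_per_node=2, max_walk_len=4,
               iterations=3, lr=0.1):
    """Toy version of Algorithm 1's outer loops with SGD updates.

    graph:      adjacency dict {node: [neighbors]} (hypothetical input)
    embeddings: dict {node: list[float]}, assumed DeepWalk pre-trained
    """
    for _ in range(iterations):              # m outer iterations
        for _ in range(walks_per_node):      # n walks per node
            order = list(graph)
            random.shuffle(order)            # O = shuffle(V)
            for v in order:
                walk = random_walk(graph, v, max_walk_len)
                # SGD step on a surrogate loss: move each walk node
                # toward its successor (stand-in for Equation (5))
                for a, b in zip(walk, walk[1:]):
                    grad = [ea - eb
                            for ea, eb in zip(embeddings[a], embeddings[b])]
                    embeddings[a] = [ea - lr * g
                                     for ea, g in zip(embeddings[a], grad)]
    return embeddings
```

In the actual method the inner loss would involve the LSTM-based question-answer matching described in the paper; the sketch only shows the walk-sampling and update schedule that the pseudocode fixes.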