TransNet: Translation-Based Network Representation Learning for Social Relation Extraction
Authors: Cunchao Tu, Zhengyan Zhang, Zhiyuan Liu, Maosong Sun
IJCAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on SRE demonstrate that TransNet significantly outperforms other baseline methods by 10% to 20% on hits@1 (see the hits@1 sketch after the table). The source code and datasets can be obtained from https://github.com/thunlp/TransNet. |
| Researcher Affiliation | Academia | Cunchao Tu1,2, Zhengyan Zhang1, Zhiyuan Liu1,2, Maosong Sun1,2. 1Department of Computer Science and Technology, State Key Lab on Intelligent Technology and Systems, National Lab for Information Science and Technology, Tsinghua University, China; 2Jiangsu Collaborative Innovation Center for Language Ability, Jiangsu Normal University, China. {tcc13, zhangzhengyan14}@mails.tsinghua.edu.cn, liuzy@tsinghua.edu.cn, sms@mail.tsinghua.edu.cn |
| Pseudocode | No | No pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | The source code and datasets can be obtained from https://github.com/thunlp/TransNet. |
| Open Datasets | Yes | The source code and datasets can be obtained from https://github.com/thunlp/TransNet. Firstly, we collect all the research interest phrases from the author profiles and build the label vocabulary with these phrases. These phrases are mainly crawled from the authors' personal home pages and annotated by themselves. Hence, these phrases are rather credible, which is also confirmed by our manual check. Secondly, for each co-author relationship, we filter out the in-vocabulary labels in the abstracts of co-authored papers and regard them as the ground-truth labels of this edge. Note that, as the edges in co-author networks are undirected, we replace each edge with two directed edges with opposite directions. Specifically, to better investigate the characteristics of different models, we construct three datasets with different scales, denoted as Arnet-S (small), Arnet-M (medium) and Arnet-L (large). The details are shown in Table 1. Table 1 (Datasets; ML indicates multi-label edges): Arnet-S has 187,939 vertices and 1,619,278 edges (1,579,278 train / 20,000 test / 20,000 valid), 100 labels, 42.46% ML; Arnet-M has 268,037 vertices and 2,747,386 edges (2,147,386 train / 300,000 test / 300,000 valid), 500 labels, 63.74% ML; Arnet-L has 945,589 vertices and 5,056,050 edges (3,856,050 train / 600,000 test / 600,000 valid), 500 labels, 61.68% ML. |
| Dataset Splits | Yes | Table 1 (Datasets; ML indicates multi-label edges): Arnet-S has 187,939 vertices and 1,619,278 edges (1,579,278 train / 20,000 test / 20,000 valid), 100 labels, 42.46% ML; Arnet-M has 268,037 vertices and 2,747,386 edges (2,147,386 train / 300,000 test / 300,000 valid), 500 labels, 63.74% ML; Arnet-L has 945,589 vertices and 5,056,050 edges (3,856,050 train / 600,000 test / 600,000 valid), 500 labels, 61.68% ML. (A split-construction sketch follows the table.) |
| Hardware Specification | No | No specific hardware details (e.g., CPU, GPU models, memory, cloud instances) used for experiments were mentioned in the paper. |
| Software Dependencies | No | The paper mentions using the "Adam algorithm" and "dropout" but does not provide specific version numbers for any software libraries or dependencies (e.g., Python, TensorFlow, PyTorch, scikit-learn versions). |
| Experiment Setup | Yes | We set the representation dimension to 100 for all models. In TransNet, we set the regularizer weight η to 0.001, the learning rate to 0.001 and the margin γ to 1. Besides, we employ a 2-layer autoencoder for all datasets and select the best-performing hyper-parameters α and β on the validation sets. At last, we adopt the Adam algorithm [Kingma and Ba, 2015] to minimize the objective in Eq. (7). In order to prevent overfitting, we also employ dropout [Srivastava et al., 2014] to generate the edge representations. (A training-setup sketch follows the table.) |
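
The hits@1 figures quoted in the Research Type row can be made concrete with a small evaluation helper. The sketch below is not from the paper: it assumes TransNet's translation mechanism, in which a candidate label embedding l is scored for an edge (u, v) by the distance between u + l and v, and the function names (`rank_labels`, `hits_at_k`) are invented for illustration.

```python
import numpy as np

def rank_labels(u, v, label_emb):
    """Rank all candidate labels for a directed edge (u, v) by the
    translation distance ||u + l - v||, smallest first."""
    dists = np.linalg.norm((u + label_emb) - v, axis=1)
    return np.argsort(dists)

def hits_at_k(rankings, true_label_sets, k=1):
    """Fraction of test edges whose top-k ranked labels contain at
    least one ground-truth label of that edge."""
    hits = sum(1 for ranks, truth in zip(rankings, true_label_sets)
               if set(ranks[:k]) & set(truth))
    return hits / len(rankings)
```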
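
The Dataset Splits row reports fixed train/test/valid edge counts (e.g. 1,579,278 / 20,000 / 20,000 for Arnet-S), and the Open Datasets row notes that each undirected co-author edge is replaced by two directed edges. A minimal sketch of such a split follows, under the assumption that the split is a random partition of the directed, labeled edges; the released datasets ship pre-split, so this is illustrative only.

```python
import random

def build_splits(undirected_edges, n_test, n_valid, seed=0):
    """undirected_edges: list of (u, v, labels) tuples.
    Replaces each undirected edge with two directed edges, then
    randomly partitions them into train/valid/test."""
    directed = []
    for u, v, labels in undirected_edges:
        directed.append((u, v, labels))
        directed.append((v, u, labels))
    random.Random(seed).shuffle(directed)
    test = directed[:n_test]
    valid = directed[n_test:n_test + n_valid]
    train = directed[n_test + n_valid:]
    return train, valid, test

# Arnet-S counts from Table 1: 20,000 test and 20,000 valid edges.
# train, valid, test = build_splits(edges, n_test=20_000, n_valid=20_000)
```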
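
The Experiment Setup row fixes the representation dimension (100), regularizer weight η = 0.001, learning rate 0.001, margin γ = 1, a 2-layer autoencoder with dropout on the edge representations, and Adam. Below is a minimal PyTorch sketch of such a setup; PyTorch itself, the dropout rate, the α/β values (the paper tunes these on the validation sets), and all class and function names are assumptions, so treat the released code at https://github.com/thunlp/TransNet as authoritative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

DIM, ETA, LR, GAMMA = 100, 1e-3, 1e-3, 1.0  # values quoted in the row above
ALPHA, BETA = 0.5, 0.5                      # placeholders; tuned on validation sets

class EdgeAutoencoder(nn.Module):
    """2-layer autoencoder mapping a binary edge-label vector to a
    DIM-dimensional edge representation and back."""
    def __init__(self, n_labels, dim=DIM, p_drop=0.5):
        super().__init__()
        self.encode = nn.Sequential(nn.Linear(n_labels, dim), nn.Tanh())
        self.decode = nn.Sequential(nn.Linear(dim, n_labels), nn.Sigmoid())
        self.drop = nn.Dropout(p_drop)  # dropout on the edge representation

    def forward(self, x):
        z = self.drop(self.encode(x))
        return z, self.decode(z)

def translation_loss(u, l, v, u_n, l_n, v_n, gamma=GAMMA):
    """Margin-based ranking loss on the translation u + l ≈ v,
    contrasting each true triple with a corrupted one."""
    pos = (u + l - v).pow(2).sum(dim=1)
    neg = (u_n + l_n - v_n).pow(2).sum(dim=1)
    return F.relu(gamma + pos - neg).mean()

# Adam with the quoted learning rate; the full objective (Eq. (7))
# would combine translation_loss, the autoencoder reconstruction
# terms weighted by ALPHA/BETA, and an L2 term weighted by ETA.
# optimizer = torch.optim.Adam(model.parameters(), lr=LR)
```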