Unsupervised Attributed Multiplex Network Embedding
Authors: Chanyoung Park, Donghyun Kim, Jiawei Han, Hwanjo Yu (pp. 5371–5378)
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive experiments demonstrate that our proposed method, Deep Multiplex Graph Infomax (DMGI), outperforms the state-of-the-art attributed multiplex network embedding methods in terms of node clustering, similarity search, and especially node classification, even though DMGI is fully unsupervised. Dataset. To make fair comparisons with HAN (Wang et al. 2019), which is the most relevant baseline method, we evaluate our proposed method on the datasets used in their original paper (Wang et al. 2019), i.e., ACM, DBLP, and IMDB. We used the publicly available ACM dataset (Wang et al. 2019), and preprocessed the DBLP and IMDB datasets. For the ACM and DBLP datasets, the task is to classify the papers into three classes (Database, Wireless Communication, Data Mining) and four classes (DM, AI, CV, NLP)¹, respectively, according to the research topic. For the IMDB dataset, the task is to classify the movies into three classes (Action, Comedy, Drama). We note that the above datasets used by previous work are not truly multiplex in nature because the multiplexity between nodes is inferred via intermediate nodes (e.g., ACM: Paper-Paper relationships are inferred via Authors and Subjects that connect two Papers, i.e., PAP and PSP). Thus, to make our evaluation more practical, we used the Amazon dataset (He and McAuley 2016) that genuinely contains a multiplex network of items, i.e., also-viewed, also-bought, and bought-together relations between items. We used datasets from four categories², i.e., Beauty, Automotive, Patio Lawn and Garden, and Baby, and the task is to classify items into the four classes. For the ACM and IMDB datasets, we used the same number of labeled data as in (Wang et al. 2019) for fair comparisons, and for the remaining datasets, we used 20 labeled data for each class. Table 1 summarizes the data statistics. Methods Compared. 1) Embedding methods for a single network [...] Table 3 and Table 4 show the evaluation results on the unsupervised and supervised tasks, respectively. |
| Researcher Affiliation | Collaboration | Chanyoung Park,¹ Donghyun Kim,² Jiawei Han,¹ Hwanjo Yu³ — ¹Department of Computer Science, University of Illinois at Urbana-Champaign, IL, USA; ²Yahoo! Research, CA, USA; ³Department of Computer Science and Engineering, Pohang University of Science and Technology, Korea. {pcy1302, hanj}@illinois.edu, donghyun.kim@verizonmedia.com, hwanjoyu@postech.ac.kr |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | We implement DMGI in PyTorch³, and for all other methods, we used the source codes published by the authors, and tried to tune them to their best performance. More precisely, apart from the guidelines provided by the original papers, we tuned the learning rate and the coefficients for regularization from {0.0001, 0.0005, 0.001, 0.005} on the validation dataset. After learning the node embeddings, for fair comparisons, we conducted the evaluations within the same platform. ³https://github.com/pcy1302/DMGI |
| Open Datasets | Yes | To make fair comparisons with HAN (Wang et al. 2019), which is the most relevant baseline method, we evaluate our proposed method on the datasets used in their original paper (Wang et al. 2019), i.e., ACM, DBLP, and IMDB. We used the publicly available ACM dataset (Wang et al. 2019), and preprocessed the DBLP and IMDB datasets. [...] Thus, to make our evaluation more practical, we used the Amazon dataset (He and McAuley 2016) that genuinely contains a multiplex network of items... |
| Dataset Splits | Yes | We randomly split our dataset into train/validation/test, and we have an equal number of labeled data for the training and validation datasets. We report the test performance when the performance on validation data gives the best result. |
| Hardware Specification | No | The paper does not provide any specific hardware details used for running its experiments. |
| Software Dependencies | No | The paper states 'We implement DMGI in PyTorch' but does not specify a version number for PyTorch or any other software dependency. |
| Experiment Setup | Yes | For DMGI, we set the node embedding dimension d = 64, the self-connection weight w = 3, and tune α, β, γ from {0.0001, 0.001, 0.01, 0.1}. We implement DMGI in PyTorch, and for all other methods, we used the source codes published by the authors, and tried to tune them to their best performance. More precisely, apart from the guidelines provided by the original papers, we tuned the learning rate and the coefficients for regularization from {0.0001, 0.0005, 0.001, 0.005} on the validation dataset. After learning the node embeddings, for fair comparisons, we conducted the evaluations within the same platform. |
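The tuning protocol quoted in the Experiment Setup row amounts to a grid search on the validation split: try each learning rate and regularization coefficient from {0.0001, 0.0005, 0.001, 0.005}, and keep the configuration with the best validation score. A minimal sketch, assuming a hypothetical `validate` callback that stands in for training DMGI and scoring it (the toy scorer below is an illustration, not the authors' code):

```python
from itertools import product

# Grids quoted in the paper's experiment setup: learning rate and the
# regularization coefficient are both tuned over {0.0001, 0.0005, 0.001, 0.005}.
LR_GRID = [0.0001, 0.0005, 0.001, 0.005]
REG_GRID = [0.0001, 0.0005, 0.001, 0.005]

def select_hyperparameters(validate):
    """Return the (lr, reg) pair maximizing the validation score.

    `validate(lr, reg)` is a hypothetical callback: it would train the model
    with these settings and return a validation metric (e.g., accuracy).
    """
    best_score, best_cfg = float("-inf"), None
    for lr, reg in product(LR_GRID, REG_GRID):
        score = validate(lr, reg)
        if score > best_score:
            best_score, best_cfg = score, (lr, reg)
    return best_cfg, best_score

# Toy stand-in validator whose optimum is lr=0.001, reg=0.0005.
def toy_validate(lr, reg):
    return -abs(lr - 0.001) - abs(reg - 0.0005)

cfg, score = select_hyperparameters(toy_validate)
print(cfg)  # (0.001, 0.0005)
```

In a real run, `validate` would report the metric the review mentions (test performance is read off at the best validation checkpoint); the same loop extends to DMGI's α, β, γ grid by adding factors to `product`.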