Unsupervised Attributed Multiplex Network Embedding
Authors: Chanyoung Park, Donghyun Kim, Jiawei Han, Hwanjo Yu (pp. 5371–5378)
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive experiments demonstrate that our proposed method, Deep Multiplex Graph Infomax (DMGI), outperforms the state-of-the-art attributed multiplex network embedding methods in terms of node clustering, similarity search, and especially node classification, even though DMGI is fully unsupervised. Dataset. To make fair comparisons with HAN (Wang et al. 2019), which is the most relevant baseline method, we evaluate our proposed method on the datasets used in their original paper (Wang et al. 2019), i.e., ACM, DBLP, and IMDB. We used the publicly available ACM dataset (Wang et al. 2019), and preprocessed the DBLP and IMDB datasets. For the ACM and DBLP datasets, the task is to classify the papers into three classes (Database, Wireless Communication, Data Mining) and four classes (DM, AI, CV, NLP)¹, respectively, according to the research topic. For the IMDB dataset, the task is to classify the movies into three classes (Action, Comedy, Drama). We note that the above datasets used by previous work are not truly multiplex in nature because the multiplexity between nodes is inferred via intermediate nodes (e.g., ACM: Paper-Paper relationships are inferred via Authors and Subjects that connect two Papers, i.e., PAP and PSP). Thus, to make our evaluation more practical, we used the Amazon dataset (He and McAuley 2016) that genuinely contains a multiplex network of items, i.e., also-viewed, also-bought, and bought-together relations between items. We used datasets from four categories², i.e., Beauty, Automotive, Patio Lawn and Garden, and Baby, and the task is to classify items into the four classes. For the ACM and IMDB datasets, we used the same number of labeled data as in (Wang et al. 2019) for fair comparisons, and for the remaining datasets, we used 20 labeled data for each class. Table 1 summarizes the data statistics. Methods Compared. 1) Embedding methods for a single network [...] Table 3 and Table 4 show the evaluation results on the unsupervised and supervised tasks, respectively. |
| Researcher Affiliation | Collaboration | Chanyoung Park,¹ Donghyun Kim,² Jiawei Han,¹ Hwanjo Yu³ — ¹Department of Computer Science, University of Illinois at Urbana-Champaign, IL, USA; ²Yahoo! Research, CA, USA; ³Department of Computer Science and Engineering, Pohang University of Science and Technology, Korea. {pcy1302, hanj}@illinois.edu, donghyun.kim@verizonmedia.com, hwanjoyu@postech.ac.kr |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | We implement DMGI in PyTorch³, and for all other methods, we used the source codes published by the authors, and tried to tune them to their best performance. More precisely, apart from the guidelines provided by the original papers, we tuned the learning rate and the coefficients for regularization from {0.0001, 0.0005, 0.001, 0.005} on the validation dataset. After learning the node embeddings, for fair comparisons, we conducted the evaluations within the same platform. ³https://github.com/pcy1302/DMGI |
| Open Datasets | Yes | To make fair comparisons with HAN (Wang et al. 2019), which is the most relevant baseline method, we evaluate our proposed method on the datasets used in their original paper (Wang et al. 2019), i.e., ACM, DBLP, and IMDB. We used the publicly available ACM dataset (Wang et al. 2019), and preprocessed the DBLP and IMDB datasets. [...] Thus, to make our evaluation more practical, we used the Amazon dataset (He and McAuley 2016) that genuinely contains a multiplex network of items... |
| Dataset Splits | Yes | We randomly split our dataset into train/validation/test, and we have an equal number of labeled data for the training and validation datasets. We report the test performance when the performance on validation data gives the best result. |
| Hardware Specification | No | The paper does not provide any specific hardware details used for running its experiments. |
| Software Dependencies | No | The paper states 'We implement DMGI in PyTorch' but does not specify a version number for PyTorch or any other software dependency. |
| Experiment Setup | Yes | For DMGI, we set the node embedding dimension d = 64, the self-connection weight w = 3, and tune α, β, γ from {0.0001, 0.001, 0.01, 0.1}. We implement DMGI in PyTorch, and for all other methods, we used the source codes published by the authors, and tried to tune them to their best performance. More precisely, apart from the guidelines provided by the original papers, we tuned the learning rate and the coefficients for regularization from {0.0001, 0.0005, 0.001, 0.005} on the validation dataset. After learning the node embeddings, for fair comparisons, we conducted the evaluations within the same platform. |
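The tuning protocol quoted in the Experiment Setup row amounts to a grid search on the validation split: try each learning rate and regularization coefficient from {0.0001, 0.0005, 0.001, 0.005}, and keep the configuration with the best validation score. A minimal sketch, assuming a hypothetical `validate` callback that stands in for training DMGI and scoring it (the toy scorer below is an illustration, not the authors' code):

```python
from itertools import product

# Grids quoted in the paper's experiment setup: learning rate and the
# regularization coefficient are both tuned over {0.0001, 0.0005, 0.001, 0.005}.
LR_GRID = [0.0001, 0.0005, 0.001, 0.005]
REG_GRID = [0.0001, 0.0005, 0.001, 0.005]

def select_hyperparameters(validate):
    """Return the (lr, reg) pair maximizing the validation score.

    `validate(lr, reg)` is a hypothetical callback: it would train the model
    with these settings and return a validation metric (e.g., accuracy).
    """
    best_score, best_cfg = float("-inf"), None
    for lr, reg in product(LR_GRID, REG_GRID):
        score = validate(lr, reg)
        if score > best_score:
            best_score, best_cfg = score, (lr, reg)
    return best_cfg, best_score

# Toy stand-in validator whose optimum is lr=0.001, reg=0.0005.
def toy_validate(lr, reg):
    return -abs(lr - 0.001) - abs(reg - 0.0005)

cfg, score = select_hyperparameters(toy_validate)
print(cfg)  # (0.001, 0.0005)
```

In a real run, `validate` would report the metric the review mentions (test performance is read off at the best validation checkpoint); the same loop extends to DMGI's α, β, γ grid by adding factors to `product`.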