Robust Graph Structure Learning via Multiple Statistical Tests

Authors: Yaohua Wang, Fangyi Zhang, Ming Lin, Senzhang Wang, Xiuyu Sun, Rong Jin

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The effectiveness of multiple tests for graph structure learning is verified both theoretically and empirically on multiple clustering and ReID benchmark datasets. Experiments are designed at three levels. First, experiments are conducted to analyse the advantage of Sim-M over Sim-S and its robustness to noise. Second, B-Attention, which arms GCNs with multiple tests, is further investigated in comparison with self-attention and a commonly used form of GCNs in terms of robustness and superiority.
Researcher Affiliation | Collaboration | Yaohua Wang, Alibaba Group, xiachen.wyh@alibaba-inc.com; Fangyi Zhang, Queensland University of Technology Centre for Robotics (QCR), fangyi.zhang@qut.edu.au; Ming Lin, Amazon, minglamz@amazon.com; Senzhang Wang, Central South University, szwang@csu.edu.cn; Xiuyu Sun, Alibaba Group, xiuyu.sxy@alibaba-inc.com; Rong Jin, Twitter, rongjinemail@gmail.com
Pseudocode | Yes | The pseudocode for self-attention and Q-Attention is detailed in Section D of the Appendix.
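The appendix pseudocode is not reproduced in this report. For orientation only, a generic scaled dot-product self-attention over node features might look like the sketch below (plain PyTorch; this is not the paper's Q-Attention or B-Attention, and all names are illustrative):

```python
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    """Generic scaled dot-product self-attention over node features.

    x: (n, d) node feature matrix; w_q/w_k/w_v: (d, d) projection weights.
    A textbook formulation for orientation only -- the paper's Q-Attention
    and B-Attention pseudocode is in Section D of the Appendix.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v        # project nodes to queries/keys/values
    scores = (q @ k.T) / (k.shape[-1] ** 0.5)  # pairwise scores, scaled by sqrt(d)
    attn = F.softmax(scores, dim=-1)           # row-normalised attention weights
    return attn @ v                            # attention-weighted aggregation
```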
Open Source Code | Yes | Source code is available at https://github.com/Thomas-wyh/B-Attention.
Open Datasets | Yes | Experiments are conducted on commonly used visual datasets with three different types of objects, i.e., MS-Celeb [21] (human faces), MSMT17 [59] (human bodies), and VeRi-776 [37] (vehicles).
Dataset Splits | No | It is worth noting that MS-Celeb is divided into 10 parts by identity (Part0-9): Part0 for training and Part1-9 for testing, to maintain comparability and consistency with previous works [63, 20, 57, 42, 50]. Details of the dataset are in Section A.1.
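For illustration, an identity-disjoint split in the spirit of the Part0-9 protocol could be sketched as follows (hypothetical helper; MS-Celeb ships with the official Part0-9 division, so this is orientation only):

```python
import numpy as np

def split_by_identity(labels, num_parts=10, seed=0):
    """Partition samples into identity-disjoint parts (Part0..Part9).

    labels: (n,) array of identity ids. Returns one index array per part;
    identities are shuffled so no identity spans two parts. Illustrative
    only -- the official MS-Celeb Part0-9 splits should be used in practice.
    """
    rng = np.random.default_rng(seed)
    ids = rng.permutation(np.unique(labels))
    id_parts = np.array_split(ids, num_parts)
    return [np.flatnonzero(np.isin(labels, part)) for part in id_parts]

# Part0 for training, Part1-9 for testing, matching the paper's protocol:
# parts = split_by_identity(labels)
# train_idx, test_idx = parts[0], np.concatenate(parts[1:])
```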
Hardware Specification | Yes | Experiments were conducted on the Alibaba Cloud platform equipped with GPU instances. Each GPU instance contains 8 NVIDIA A100-SXM4-40GB GPUs.
Software Dependencies | No | The paper mentions software components like 'GCN layers' and 'PReLU activation' and strategies like 'cosine annealing strategy', but does not specify exact version numbers for programming languages, libraries, or frameworks (e.g., Python, PyTorch, TensorFlow, CUDA versions).
Experiment Setup | Yes | The output features have a dimension of 2048 for all datasets. The number of GCN layers is tuned per dataset: MS-Celeb uses three GCN layers; MSMT17 and VeRi-776 each use one GCN layer. Each node acts as the probe to search its kNN to construct the graph. The k value is also tuned per dataset: 120 for MS-Celeb, 30 for MSMT17, and 80 for VeRi-776. The Hinge Loss [45, 57] is used to classify node pairs. The initial learning rate is 0.008 with the cosine annealing strategy.
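Putting the reported settings together, a minimal configuration sketch might look like the following (MS-Celeb values; the GCN layer definition, the optimiser choice, and the schedule length T_max are assumptions, as the row above does not specify them):

```python
import torch
import torch.nn as nn
from sklearn.neighbors import kneighbors_graph

class SimpleGCNLayer(nn.Module):
    """One plain GCN layer: aggregate kNN neighbours, project, PReLU.
    An assumed stand-in; the paper's exact layer is not specified here."""
    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(dim, dim)
        self.act = nn.PReLU()

    def forward(self, x, adj):
        return self.act(self.linear(adj @ x))

dim, num_layers, k = 2048, 3, 120      # MS-Celeb settings from the row above
features = torch.randn(1000, dim)      # placeholder node features

# Each node acts as the probe for its kNN to construct the graph.
adj = torch.tensor(kneighbors_graph(features.numpy(), n_neighbors=k).toarray(),
                   dtype=torch.float32)
adj = adj / adj.sum(dim=1, keepdim=True)  # row-normalise the adjacency

layers = nn.ModuleList(SimpleGCNLayer(dim) for _ in range(num_layers))
params = [p for layer in layers for p in layer.parameters()]
optimizer = torch.optim.SGD(params, lr=0.008)  # reported initial lr; optimiser type assumed
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)  # T_max assumed
criterion = nn.HingeEmbeddingLoss()    # stand-in for the hinge loss of [45, 57]
```

The per-dataset variants swap in (k=30, one layer) for MSMT17 and (k=80, one layer) for VeRi-776, keeping the 2048-d output features throughout.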