Ada-NETS: Face Clustering via Adaptive Neighbour Discovery in the Structure Space
Authors: Yaohua Wang, Yaobin Zhang, Fangyi Zhang, Senzhang Wang, Ming Lin, YuQi Zhang, Xiuyu Sun
ICLR 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on multiple public clustering datasets show that Ada-NETS significantly outperforms current state-of-the-art methods, proving its superiority and generalization. |
| Researcher Affiliation | Collaboration | Yaohua Wang, Yaobin Zhang, Fangyi Zhang, Ming Lin, Yuqi Zhang (Alibaba Group) {xiachen.wyh, zhangyaobin.zyb, zhiyuan.zfy, ming.l, gongyou.zyq}@alibaba-inc.com; Senzhang Wang (Central South University) szwang@csu.edu.cn; Xiuyu Sun (Alibaba Group) xiuyu.sxy@alibaba-inc.com |
| Pseudocode | No | The paper describes the algorithm using prose and mathematical equations but does not include a clearly labeled pseudocode or algorithm block. |
| Open Source Code | Yes | Code is available at https://github.com/damo-cv/Ada-NETS. |
| Open Datasets | Yes | Three datasets are used in the experiments: MS-Celeb-1M (Guo et al., 2016; Deng et al., 2019)... The clothes dataset DeepFashion (Liu et al., 2016)... MSMT17 (Wei et al., 2018)... For a fair comparison, we follow the same protocol and features as VE-GCN (Yang et al., 2019) to divide the dataset evenly into ten parts by identities, and use part 0 as the training set and part 1 to part 9 as the testing set. (A sketch of this identity-based split follows the table.) |
| Dataset Splits | No | The paper specifies a training set ('part 0 as the training set') and a testing set ('part 1 to part 9 as the testing set') but does not mention a separate validation split. |
| Hardware Specification | Yes | Empirically, Ada-NETS takes about 18.9 minutes to cluster part 1 test set (about 584k samples) on 54 E5-2682 v4 CPUs and 8 NVIDIA P100 cards. |
| Software Dependencies | No | The experiments are conducted with PyTorch (Paszke et al., 2019) and DGL (Wang et al., 2019a). The paper names the software used but does not specify version numbers. |
| Experiment Setup | Yes | The learning rate is initially 0.01 for training the adaptive filter and 0.1 for training the GCN, both with cosine annealing. δ = 1 for the Huber loss, β1 = 0.9, β2 = 1.0, λ = 1 for the hinge loss, and β = 0.5 for the Q-value. The SGD optimizer with momentum 0.9 and weight decay 1e-5 is used. k is set to 80, 5, and 40 on MS-Celeb-1M, DeepFashion, and MSMT17, respectively. (A sketch of this training setup follows the table.) |
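
The identity-based split quoted in the Open Datasets row (ten equal parts by identity, part 0 for training, parts 1 to 9 for testing, following the VE-GCN protocol) can be approximated as below. This is a minimal sketch, not the authors' released code; the label-file name, the shuffling, and the random seed are assumptions made for illustration.

```python
import numpy as np

def split_by_identity(labels, num_parts=10, seed=0):
    """Split sample indices into `num_parts` disjoint subsets by identity (class label),
    so that every identity appears in exactly one part."""
    labels = np.asarray(labels)
    identities = np.unique(labels)
    rng = np.random.default_rng(seed)      # seed is an assumption, not from the paper
    rng.shuffle(identities)
    # Assign roughly the same number of identities to each part.
    identity_parts = np.array_split(identities, num_parts)
    return [np.where(np.isin(labels, ids))[0] for ids in identity_parts]

# Usage, following the VE-GCN protocol: part 0 for training, parts 1-9 for testing.
# labels = np.load("ms1m_labels.npy")      # hypothetical label file
# parts = split_by_identity(labels, num_parts=10)
# train_idx, test_parts = parts[0], parts[1:]
```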
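The optimizer and schedule reported in the Experiment Setup row map onto standard PyTorch components. The sketch below is only an illustration under stated assumptions: the placeholder modules standing in for the adaptive filter and the GCN, and the annealing horizon `T_max`, are not given in the paper; the reported values (SGD, momentum 0.9, weight decay 1e-5, initial learning rates 0.01 and 0.1, cosine annealing, Huber loss with δ = 1) are.

```python
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import CosineAnnealingLR

# Hypothetical modules standing in for the paper's adaptive filter and GCN.
adaptive_filter = nn.Linear(256, 1)
gcn = nn.Linear(256, 256)

# SGD with momentum 0.9 and weight decay 1e-5, as reported; initial learning
# rates 0.01 (adaptive filter) and 0.1 (GCN), decayed with cosine annealing.
# T_max is an assumed training length, not a value from the paper.
opt_filter = optim.SGD(adaptive_filter.parameters(), lr=0.01,
                       momentum=0.9, weight_decay=1e-5)
opt_gcn = optim.SGD(gcn.parameters(), lr=0.1,
                    momentum=0.9, weight_decay=1e-5)
sched_filter = CosineAnnealingLR(opt_filter, T_max=100)
sched_gcn = CosineAnnealingLR(opt_gcn, T_max=100)

# Huber loss with delta = 1 for training the adaptive filter, as reported.
# (The hinge-style loss with lambda = 1 used for the GCN branch is not shown here.)
huber = nn.HuberLoss(delta=1.0)
```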