ProxyFusion: Face Feature Aggregation Through Sparse Experts
Authors: Bhavin Jawade, Alexander Stone, Deen Dayal Mohan, Xiao Wang, Srirangaraj Setlur, Venu Govindaraju
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through qualitative experiments, we demonstrate that ProxyFusion learns discriminative information for importance weighting of face features without relying on intermediate features. Quantitative evaluations on challenging low-resolution face verification datasets such as IARPA BTS3.1 and DroneSURF show the superiority of ProxyFusion in unconstrained long-range face recognition settings. |
| Researcher Affiliation | Academia | Bhavin Jawade, University at Buffalo (bhavinja@buffalo.edu); Alexander Stone, University at Buffalo (awstone@buffalo.edu); Deen Dayal Mohan, University at Buffalo (dmohan@buffalo.edu); Xiao Wang, University at Buffalo (xwang277@buffalo.edu); Srirangaraj Setlur, University at Buffalo (setlur@buffalo.edu); Venu Govindaraju, University at Buffalo (govind@buffalo.edu) |
| Pseudocode | No | The paper describes the method's steps in text and equations but does not include a formal 'Pseudocode' or 'Algorithm' block or figure. |
| Open Source Code | Yes | Our code and pretrained models are available at: https://github.com/bhavinjawade/ProxyFusion |
| Open Datasets | Yes | Our experiments utilize the following datasets for training: (i) BRIAR Research Set 3 (BRS3) [1]: This dataset is from IARPA's BRIAR program Phase 1... (ii) WebFace4M [22]: Apart from BRIAR, we also present results by training our method on the WebFace4M dataset... (iii) BTS3.1: This is the test set for IARPA BRIAR Phase 1 evaluation... (iv) DroneSURF [6]: This dataset includes Active and Passive Surveillance settings... |
| Dataset Splits | Yes | Following CAFace [9], we use their randomly sampled subset, consisting of 813,482 images from 10,000 identities to train our aggregation function. ...Following [12], we split subjects randomly: 60% (34 identities) for training/validation, 40% (24 identities) for testing. ...We choose the bounds for probe and gallery subset sizes to be L = 100 and U = 1200. |
| Hardware Specification | Yes | All experiments are performed on 1x NVIDIA A6000 48GB GPU. |
| Software Dependencies | No | The paper mentions using Adafactor as the optimizer, MTCNN and RetinaFace for detection, and AdaFace/ArcFace for feature extraction, but does not specify version numbers for these software components or other key libraries (e.g., Python, PyTorch/TensorFlow). |
| Experiment Setup | Yes | For the expert networks we utilize a three-layer MLP with LeakyReLU activation and dropout with probability 0.5. More specifically, the first layer in the MLP projects from 1536 to 1024, the second layer from 1024 to 1024, and the last layer from 1024 to 512. Overall the MLP has 3.14M learnable parameters. For supervised contrastive loss, we use a temperature of 0.1. The proxy loss weight γ = 0.01 and threshold λ = 0.1. We utilize Adafactor with adaptive learning rates as the optimizer. For all SoTA experiments we utilize the number of total proxies K as 11, and the number of selected experts K̂ as 4. We choose the number of identities in a batch M = 170 based on available GPU memory. We choose the bounds for probe and gallery subset sizes to be L = 100 and U = 1200. |
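As a quick sanity check, the reported layer sizes (1536 → 1024 → 1024 → 512) are consistent with the ~3.14M parameter figure quoted above. A minimal sketch in plain Python (the helper function is illustrative, not from the paper's codebase):

```python
def linear_params(in_dim: int, out_dim: int, bias: bool = True) -> int:
    """Parameters in one fully connected layer: weight matrix plus optional bias."""
    return in_dim * out_dim + (out_dim if bias else 0)

# Layer widths of the expert MLP as quoted in the Experiment Setup row.
dims = [1536, 1024, 1024, 512]
total = sum(linear_params(i, o) for i, o in zip(dims, dims[1:]))
print(total)  # 3148288, i.e. ~3.1M, matching the paper's ~3.14M figure
```

The small gap between 3,148,288 and the quoted 3.14M is plausibly rounding, or biases being excluded from the paper's count (3,145,728 without biases).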