Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Merge or Not? Learning to Group Faces via Imitation Learning
Authors: Yue He, Kaidi Cao, Cheng Li, Chen Loy
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on three benchmark datasets show that our framework outperforms unsupervised and supervised baselines. |
| Researcher Affiliation | Collaboration | 1Sense Time Group Limited, EMAIL 2The Chinese University of Hong Kong, EMAIL |
| Pseudocode | Yes | Algorithm 1: Reward function learning via IRL. |
| Open Source Code | Yes | Our codes and data are released to facilitate future studies. (Footnote 1: https://github.com/bj80heyue/Learning-to-Group) |
| Open Datasets | Yes | We employ 2,000 albums simulated from MS-Celeb1M (Guo et al. 2016) of 80k identities as our training source and generalize it to various test data below. ... LFW (Huang et al. 2007), MS-Celeb-1M (Guo et al. 2016), and PFW (Sengupta et al. 2016) ... ACCIO dataset (Ghaleb et al. 2015) ... Our codes and data are released to facilitate future studies. (Footnote 1: https://github.com/bj80heyue/Learning-to-Group) |
| Dataset Splits | No | The paper states using "2,000 albums simulated from MS-Celeb1M" as a training source and evaluating on "LFW-Album", "ACCIO Dataset", and "Grouping Face in the Wild (GFW)". While it specifies quantities for some datasets (e.g., "20 albums" for LFW-Album, "3243 tracklets" for ACCIO), it does not provide explicit numerical train/validation/test splits (e.g., percentages or sample counts for each split) within these datasets needed for reproducibility of data partitioning. |
| Hardware Specification | No | No specific hardware details (such as GPU/CPU models, memory, or processor types) used for running the experiments were mentioned in the paper. |
| Software Dependencies | No | The paper mentions models and algorithms such as Inception-v3, SVM, random forest regressor, and Faster-RCNN, but it does not provide specific version numbers for these or any other software dependencies (e.g., Python version, library versions) that would be needed for replication. |
| Experiment Setup | Yes | We set β = 0.8 in Eqn. (3) to balance the scales of short- and long-term rewards. We fixed the number of faces η = 5 to form the similarity and quality features. ... Specifically, we set γ = 0 in Eqn. (2) and β = 0 in Eqn. (3). ... Specifically, we set γ = 0.9 in Eqn. (2) and β = 0.8 in Eqn. (3). ... The three layers of the Siamese network have 256, 64, 64 hidden neurons, respectively. A contrastive loss is used for training. |
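
The Experiment Setup row says that a contrastive loss is used to train the Siamese network, but the quoted text does not give its exact form or margin. As a rough illustration only, the sketch below uses the common textbook formulation of a pairwise contrastive loss: matching pairs are penalized by their squared distance, and non-matching pairs are penalized only when they fall inside a margin. The function names, the `margin` default, and the toy embeddings are assumptions made for this example and are not taken from the paper or its released code.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, same_identity, margin=1.0):
    """Textbook pairwise contrastive loss (assumed form, not the paper's).

    same_identity=True  -> pull the pair together: 0.5 * d^2
    same_identity=False -> push the pair apart until d >= margin:
                           0.5 * max(0, margin - d)^2
    The margin value is a hypothetical default; the source does not report one.
    """
    d = euclidean_distance(emb_a, emb_b)
    if same_identity:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2


if __name__ == "__main__":
    # Toy 2-D embeddings, chosen only to make the arithmetic easy to follow.
    anchor = [0.0, 0.0]
    far = [3.0, 4.0]  # distance 5 from the anchor
    print(contrastive_loss(anchor, far, same_identity=True))
    print(contrastive_loss(anchor, far, same_identity=False))
```

In this formulation a non-matching pair that is already farther apart than the margin contributes zero loss, so training effort concentrates on pairs that are still confusable. Whether the authors use this exact variant would need to be checked against the paper or the linked repository.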