A View From Somewhere: Human-Centric Face Representations

Authors: Jerone Theodore Alexander Andrews, Przemyslaw Joniak, Alice Xiang

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To address these issues, we present A View From Somewhere (AVFS), a dataset of 638,180 human judgments of face similarity. We demonstrate the utility of AVFS for learning a continuous, low-dimensional embedding space aligned with human perception. Our embedding space, induced under a novel conditional framework, not only enables the accurate prediction of face similarity, but also provides a human-interpretable decomposition of the dimensions used in the human decision-making process, and the importance distinct annotators place on each dimension.
Researcher Affiliation | Collaboration | Jerone T. A. Andrews (Sony AI, Tokyo); Przemysław Joniak (University of Tokyo, Tokyo); Alice Xiang (Sony AI, New York)
Pseudocode | No | The paper describes the model mathematically and provides an objective function, but it does not include any pseudocode or algorithm blocks.
Open Source Code | Yes | Code and data may be found at https://github.com/SonyAI/a_view_from_somewhere.
Open Datasets | Yes | AVFS contains 638,180 quality-controlled triplets over 4,921 faces... The dataset does not include any images; it instead references image identifiers. The image identifiers can be used to obtain the relevant FFHQ images, which are hosted on NVIDIA Corporation's Google Drive: https://github.com/NVlabs/ffhq-dataset.
Dataset Splits | Yes | reserving 10% of AVFS for validation.
Hardware Specification | Yes | Training was performed using a batch size of 128 on a single Tesla T4 GPU.
Software Dependencies | No | The paper mentions software components like 'ResNet18' (architecture), 'Adam' (optimizer), and 'dlib face detector', but it does not provide specific version numbers for these software dependencies (e.g., 'PyTorch 1.9', 'dlib 19.18').
Experiment Setup | Yes | AVFS models have ResNet18 (He et al., 2016) architectures and output 128-dimensional embeddings. We use the Adam (Kingma & Ba, 2014) optimizer with default parameters, reserving 10% of AVFS for validation. Based on grid search, we empirically set α1 = 0.00005 and α2 = 0.01. For AVFS-C and AVFS-CPH, we additionally set α3 = 0.00001.
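
The reported setup (embedding size, batch size, validation fraction, and the α weights) can be gathered into a small configuration sketch, which may help when trying to reproduce the experiments. The following is a minimal illustration using only the Python standard library, not the authors' code. The names AVFSConfig, split_triplets, and num_batches are hypothetical, as is the representation of a triplet as three face indices. The source only says that 10% of AVFS is reserved for validation, so the random triplet-level split and the seed are assumptions.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

# Hypothetical layout: a triplet is three face indices.
Triplet = Tuple[int, int, int]


@dataclass(frozen=True)
class AVFSConfig:
    """Values reported in the table; field names are illustrative only."""
    backbone: str = "resnet18"
    embedding_dim: int = 128
    batch_size: int = 128
    val_fraction: float = 0.10
    alpha1: float = 0.00005
    alpha2: float = 0.01
    # Reported only for AVFS-C and AVFS-CPH (0.00001); None otherwise.
    alpha3: Optional[float] = None


def split_triplets(
    triplets: Sequence[Triplet], val_fraction: float, seed: int = 0
) -> Tuple[List[Triplet], List[Triplet]]:
    """Shuffle and hold out a fraction for validation.

    Assumption: the split is random and made at the triplet level.
    """
    shuffled = list(triplets)
    random.Random(seed).shuffle(shuffled)
    n_val = round(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


def num_batches(n_items: int, batch_size: int) -> int:
    """Number of batches per epoch, counting a final partial batch."""
    return -(-n_items // batch_size)  # ceiling division


if __name__ == "__main__":
    cfg = AVFSConfig()
    toy = [(i, i + 1, i + 2) for i in range(1000)]
    train, val = split_triplets(toy, cfg.val_fraction)
    print(len(train), len(val), num_batches(len(train), cfg.batch_size))
```

Under the same triplet-level assumption, holding out 10% of the 638,180 triplets gives 63,818 validation triplets and 574,362 training triplets, which at a batch size of 128 is 4,488 batches per epoch (the last one partial).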