Learning Label Encodings for Deep Regression

Authors: Deval Shah, Tor M. Aamodt

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our evaluation demonstrates that RLEL can be combined with off-the-shelf feature extractors and is suitable across different architectures, datasets, and tasks. Code is available at https://github.com/ubc-aamodt-group/RLEL_regression. We evaluate the proposed approach on 11 benchmarks, covering diverse datasets, network architectures, and regression tasks, such as head pose estimation, facial landmark detection, age estimation, and autonomous driving. Label encodings found by RLEL result in lower or comparable errors to manually designed label encodings.
Researcher Affiliation | Academia | Deval Shah & Tor M. Aamodt, Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, BC, Canada; {devalshah,aamodt}@ece.ubc.ca
Pseudocode | Yes | Algorithm 1: Simulated annealing for encodings design. Input: K_max, T, M, N; Output: C ∈ {0, 1}^{M×N}.
Open Source Code | Yes | Code is available at https://github.com/ubc-aamodt-group/RLEL_regression.
Open Datasets | Yes | Landmark-free 2D head pose estimation (LFH) takes a 2D image as input and directly finds the pose of a human head with three angles: yaw, pitch, and roll. We use the ResNet50 network as the feature extractor. This network is initialized using pre-trained parameters for the ImageNet (Russakovsky et al., 2015) dataset. During training for RLEL, the entire network, including the feature extractor, is trained.
Dataset Splits | Yes | The training dataset is divided into 70% training and 30% validation sets for tuning hyperparameters. The network is trained using the full dataset after hyperparameter tuning.
Hardware Specification | Yes | We report the training time using an NVIDIA RTX 2080 Ti GPU with 11 GB of memory for each benchmark.
Software Dependencies | No | The paper does not list specific software dependencies with version numbers.
Experiment Setup | Yes | Table 12: Training parameters for LFH1. Table 15: Training parameters for facial landmark detection for the HRNetV2-W18 feature extractor. Table 19: Training parameters for age estimation using the MORPH-II and AFAD datasets. Table 22: Training parameters for end-to-end autonomous driving using PilotNet.
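
The Pseudocode row quotes only the signature of Algorithm 1 (inputs Kmax, T, M, N and a binary M-by-N output C). As a rough illustration of what a simulated-annealing search over such a matrix can look like, here is a minimal Python sketch. Everything beyond the signature is an assumption made for this example: Kmax is treated as the iteration budget and T as the initial temperature, the cooling schedule is linear, and `collision_cost` is a hypothetical stand-in objective that only counts labels sharing the same code. It is not the objective or the procedure from the paper or its repository.

```python
import math
import random


def collision_cost(C):
    """Hypothetical objective: number of label pairs whose codes (columns) are identical."""
    M, N = len(C), len(C[0])
    cols = [tuple(C[m][n] for m in range(M)) for n in range(N)]
    return sum(cols[i] == cols[j] for i in range(N) for j in range(i + 1, N))


def anneal_encoding(k_max, t0, M, N, cost=collision_cost, seed=0):
    """Search for a binary M x N encoding matrix with low cost.

    k_max: iteration budget (assumed meaning of Kmax)
    t0:    initial temperature (assumed meaning of T), must be > 0
    """
    rng = random.Random(seed)
    C = [[rng.randint(0, 1) for _ in range(N)] for _ in range(M)]
    current = cost(C)
    best, best_cost = [row[:] for row in C], current
    for k in range(k_max):
        temp = t0 * (1 - k / k_max)  # assumed linear cooling, stays positive for k < k_max
        m, n = rng.randrange(M), rng.randrange(N)
        C[m][n] ^= 1  # propose a single bit flip
        candidate = cost(C)
        delta = candidate - current
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current = candidate
            if current < best_cost:
                best, best_cost = [row[:] for row in C], current
        else:
            C[m][n] ^= 1  # reject: undo the flip
    return best, best_cost


C, c = anneal_encoding(k_max=500, t0=1.0, M=4, N=8)
print("cost of best encoding found:", c)
```

Swapping in a different `cost` function is the only change needed to explore other encoding criteria, which is why the objective is passed as a parameter here.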
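
The Dataset Splits row describes a 70% training / 30% validation division used for hyperparameter tuning, followed by training on the full dataset. A minimal sketch of such an index split is shown below. The helper name `split_train_val`, the random shuffling, and the fixed seed are assumptions for illustration; the quoted text does not say how samples are assigned to each subset.

```python
import random


def split_train_val(num_samples, val_fraction=0.3, seed=0):
    """Return (train_indices, val_indices) for a shuffled split.

    Shuffling with a fixed seed is an assumption; the source only states the 70/30 ratio.
    """
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)
    n_val = round(num_samples * val_fraction)
    return indices[n_val:], indices[:n_val]


# Tune hyperparameters on the 70/30 split, then retrain on all indices.
train_idx, val_idx = split_train_val(1000)
full_idx = train_idx + val_idx
print(len(train_idx), len(val_idx), len(full_idx))
```

Keeping the split as index lists makes the final step, training on the full dataset, a matter of concatenating the two lists.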