Addressing the Loss-Metric Mismatch with Adaptive Loss Alignment
Authors: Chen Huang, Shuangfei Zhai, Walter Talbott, Miguel Bautista Martin, Shih-Yu Sun, Carlos Guestrin, Josh Susskind
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically show how this formulation improves performance by simultaneously optimizing the evaluation metric and smoothing the loss landscape. We verify our method in metric learning and classification scenarios, showing considerable improvements over the state-of-the-art on a diverse set of tasks. |
| Researcher Affiliation | Industry | Apple Inc., Cupertino, United States. Correspondence to: Chen Huang <chen-huang@apple.com>. |
| Pseudocode | Yes | Algorithm 1 Reinforcement Learning for ALA |
| Open Source Code | No | The paper does not provide any specific statements about releasing source code, nor does it include links to a code repository. |
| Open Datasets | Yes | We experiment on CIFAR-10 (Krizhevsky, 2009) with 50k images for training and 10k images for testing. ... The SOP dataset (Song et al., 2016), and face recognition (FR) experiments on the LFW dataset (Huang et al., 2007). ... ImageNet (Deng et al., 2009) classifier |
| Dataset Splits | Yes | For training a loss controller, we divide the training set randomly into a new training set of 40k images and a validation set of 10k images. ... The SOP dataset contains 120,053 images of 22,634 categories. The first 10,000 and 1,318 categories are used for training and validation, and the remaining are used for testing. (A sketch of the CIFAR-10 split appears below the table.) |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments, only general optimization and architecture details. |
| Software Dependencies | No | The paper mentions optimizers like 'Momentum-SGD' and 'RMSProp optimizer' but does not specify any software dependencies with version numbers (e.g., programming languages, libraries, or frameworks with their specific versions) required for replication. |
| Experiment Setup | Yes | The ALA controller is instantiated as an MLP consisting of 2 hidden layers each with 32 ReLU units. Our state s_t includes a sequence of validation statistics observed from the past 10 time-steps. We use a learning rate of 0.001 for policy learning. Training episodes are collected from all child networks every K = 200 gradient descent iterations. We set the discount factor γ = 0.9 (Equation 4), loss parameter updating step β = 0.1 and distance offset α = 1 (Equation 10). (A sketch of this controller configuration appears below the table.) |
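
To make the split quoted in the Dataset Splits row concrete, here is a minimal sketch of the CIFAR-10 40k/10k train/validation split. The paper only states that the split is random; the data-loading framework, transform pipeline, and random seed below are assumptions for illustration.

```python
# Minimal sketch of the CIFAR-10 split described in the paper:
# 40k training / 10k validation images drawn randomly from the
# 50k-image training set, with the 10k test set left untouched.
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

transform = transforms.ToTensor()  # augmentation pipeline unspecified in the paper

full_train = datasets.CIFAR10(root="./data", train=True,
                              download=True, transform=transform)
test_set = datasets.CIFAR10(root="./data", train=False,
                            download=True, transform=transform)

# Random 40k/10k split: the child network trains on train_set, while
# the loss controller observes validation statistics on val_set.
generator = torch.Generator().manual_seed(0)  # seed is an assumption
train_set, val_set = random_split(full_train, [40_000, 10_000],
                                  generator=generator)
```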
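
The Experiment Setup row also pins down the controller architecture and the RL hyperparameters. Below is a hedged sketch of that configuration: an MLP with 2 hidden layers of 32 ReLU units, a state built from validation statistics over the past 10 time-steps, and policy learning at a 0.001 learning rate. The number of statistics per step (`stats_per_step`), the action-space size (`num_actions`), and the choice of RMSProp for the controller are assumptions; the paper defines the state and action spaces per task and does not publish reference code.

```python
# Hedged sketch of the ALA loss controller and hyperparameters quoted above.
import torch
import torch.nn as nn

GAMMA = 0.9   # discount factor (Equation 4 in the paper)
BETA = 0.1    # loss-parameter updating step
ALPHA = 1.0   # distance offset (Equation 10)
K = 200       # gradient-descent iterations between controller episodes

class ALAController(nn.Module):
    """MLP policy: 2 hidden layers of 32 ReLU units, as stated in the paper.
    Input is a flattened window of validation statistics from the past
    `history` time-steps; output is a distribution over loss-parameter
    updates. stats_per_step and num_actions are illustrative assumptions."""

    def __init__(self, stats_per_step: int = 4, history: int = 10,
                 num_actions: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(stats_per_step * history, 32),
            nn.ReLU(),
            nn.Linear(32, 32),
            nn.ReLU(),
            nn.Linear(32, num_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        # Probability distribution over candidate loss-parameter updates.
        return torch.softmax(self.net(state), dim=-1)

def discounted_returns(rewards, gamma=GAMMA):
    # Standard discounted return for the controller's episodes; the paper's
    # exact reward (validation-metric improvement) is summarized, not shown.
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    return list(reversed(returns))

controller = ALAController()
# Learning rate of 0.001 for policy learning, per the Experiment Setup row;
# the optimizer choice for the controller itself is an assumption.
optimizer = torch.optim.RMSprop(controller.parameters(), lr=1e-3)
```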