Agnostic Learning with Multiple Objectives

Authors: Corinna Cortes, Mehryar Mohri, Javier Gonzalvo, Dmitry Storcheus

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We further implement the algorithm in a popular symbolic gradient computation framework and empirically demonstrate on a number of datasets the benefits of ALMO framework versus learning with a fixed mixture weights distribution.
Researcher Affiliation | Collaboration | Corinna Cortes, Google Research, New York, NY 10011, corinna@google.com; Javier Gonzalvo, Google Research, New York, NY 10011, xavigonzalvo@google.com; Mehryar Mohri, Google & Courant Institute, New York, NY 10012, mohri@google.com; Dmitry Storcheus, Courant Institute & Google, New York, NY 10012, dstorcheus@google.com
Pseudocode | Yes | The pseudocode for the ALMO optimization algorithm is given in Algorithm 1. (A hedged sketch of this kind of min-max step follows the table.)
Open Source Code | No | The paper states that the algorithm is implemented in TensorFlow, Keras, and PyTorch, but does not provide a specific link or explicit statement of release for its own source code.
Open Datasets | Yes | We conducted a series of experiments on several datasets that demonstrate the benefits of the ALMO framework versus learning with a uniform mixture weights distribution. The models are compared on MNIST [Le Cun and Cortes, 2010], Fashion-MNIST [Xiao et al., 2017] and ADULT [Dua and Graff, 2017] datasets with standard feature preprocessing techniques applied.
Dataset Splits | Yes | For both models, we run hyper-parameter tuning with a parameter grid size 50 on a validation set, which is 20% of the training data.
Hardware Specification | No | The paper does not specify the hardware used for experiments, such as specific GPU or CPU models.
Software Dependencies | No | The paper mentions popular symbolic gradient computation platforms like TENSORFLOW [Abadi et al., 2016], KERAS [Chollet et al., 2015], or PYTORCH [Paszke et al., 2019], but does not provide specific version numbers for these or any other software dependencies.
Experiment Setup | Yes | For both models, we run hyper-parameter tuning with a parameter grid size 50 on a validation set, which is 20% of the training data. [...] We report results for two model architectures: a logistic regression and a neural network with dimensions 1024-512-128. (A sketch of this setup follows the table.)
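
The Pseudocode entry above points to Algorithm 1 of the paper, which is not reproduced in this record. For orientation only, below is a minimal PyTorch sketch of the kind of gradient descent-ascent step that the ALMO min-max objective (minimize over the model, maximize over the mixture weights on the simplex) calls for. The toy objectives, the step size eta_lam, and the exponentiated-gradient update for the mixture weights are assumptions of this sketch, not the paper's Algorithm 1, which should be consulted for the actual procedure and its treatment of the mixture weights.

    import torch
    import torch.nn as nn

    def almo_style_step(model, optimizer, losses_fn, lam, eta_lam=0.1):
        # Per-objective losses L_1(h), ..., L_K(h) on the current batch.
        per_obj = losses_fn(model)                 # tensor of shape [K]
        weighted = torch.dot(lam, per_obj)         # lambda-weighted loss

        # Descent step in the model parameters on the weighted loss.
        optimizer.zero_grad()
        weighted.backward()
        optimizer.step()

        # Ascent step in the mixture weights, kept on the probability simplex
        # via an exponentiated-gradient update (an assumption of this sketch).
        with torch.no_grad():
            lam = lam * torch.exp(eta_lam * per_obj.detach())
            lam = lam / lam.sum()
        return lam

    # Toy usage with three hypothetical objectives on random data.
    model = nn.Linear(10, 1)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    x, y = torch.randn(32, 10), torch.randn(32, 1)
    criteria = [nn.MSELoss(), nn.L1Loss(), nn.SmoothL1Loss()]
    losses_fn = lambda m: torch.stack([c(m(x), y) for c in criteria])
    lam = torch.full((3,), 1.0 / 3)                # uniform initial mixture weights
    for _ in range(200):
        lam = almo_style_step(model, optimizer, losses_fn, lam)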
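
The Dataset Splits and Experiment Setup entries report a validation set of 20% of the training data and a fully connected network with dimensions 1024-512-128. A minimal Keras sketch of that setup on MNIST is given below; the ReLU activations, optimizer, batch size, and number of epochs are assumptions, and the paper's hyper-parameter grid of size 50 would be evaluated on the validation split produced here.

    import tensorflow as tf

    # MNIST with standard preprocessing: flatten to 784 features, scale to [0, 1].
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
    x_test = x_test.reshape(-1, 784).astype("float32") / 255.0

    # 1024-512-128 fully connected network (activations are an assumption).
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(1024, activation="relu", input_shape=(784,)),
        tf.keras.layers.Dense(512, activation="relu"),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    # validation_split=0.2 holds out 20% of the training data, matching the
    # reported validation set; epochs and batch size are assumptions.
    model.fit(x_train, y_train, validation_split=0.2, epochs=10, batch_size=128)
    model.evaluate(x_test, y_test)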