Implicit rate-constrained optimization of non-decomposable objectives

Authors: Abhishek Kumar, Harikrishna Narasimhan, Andrew Cotter

ICML 2021

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on benchmark datasets demonstrate the effectiveness of our proposed method over existing state-of-the-art approaches for these problems. |
| Researcher Affiliation | Industry | Abhishek Kumar, Harikrishna Narasimhan, Andrew Cotter; Google Research. Correspondence to: Abhishek Kumar <abhishk@google.com>. |
| Pseudocode | Yes | Algorithm 1: Implicit Constrained Optimization (ICO) |
| Open Source Code | No | The paper refers to an open-sourced library (TFCO) used as a baseline (https://github.com/google-research/tensorflow_constrained_optimization), but does not state that the code for its proposed method (ICO) is open-sourced or provide a link to it. |
| Open Datasets | Yes | We experiment with the publicly available CelebA (Liu et al., 2015) and BigEarthNet (Sumbul et al., 2019) image datasets. For the Letter dataset... obtained from the UCI repository (Frank & Asuncion, 2010). |
| Dataset Splits | Yes | We use the standard train, validation, and test splits for CelebA... We split the dataset randomly into 70% for training, 15% for validation and 15% for testing... We split the datasets into train, validation and test datasets in the ratios 50%:25%:25%. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models or memory) used to run its experiments. |
| Software Dependencies | No | The paper mentions using the 'Adam optimizer' and the 'TensorFlow Constrained Optimization Library (TFCO)' but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | For our experiments with image datasets, we use a 6-layer neural network with 5 convolutional layers with 128, 256, 256, 512 and 512 filters respectively. We use ReLU activation functions and batch normalization layers in the network. For the cross-entropy baseline, we use the Adam optimizer (Kingma & Ba, 2014) with a learning rate of 0.001... All optimizers use a batch size of 512. |
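
To make the quoted Dataset Splits row concrete, here is a minimal sketch of a random 70%/15%/15% split (with the 50%:25%:25% variant used for the UCI datasets shown as a usage example). The paper excerpt does not name a splitting tool, random seed, or stratification scheme, so the NumPy-based helper `random_split` below is purely illustrative and not the authors' code.

```python
# Hypothetical illustration of the random train/validation/test splits
# described in the paper excerpt; seed and shuffling method are assumptions.
import numpy as np

def random_split(num_examples, fractions=(0.70, 0.15, 0.15), seed=0):
    """Return index arrays for a random train/validation/test split."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(num_examples)
    n_train = int(fractions[0] * num_examples)
    n_val = int(fractions[1] * num_examples)
    train_idx = indices[:n_train]
    val_idx = indices[n_train:n_train + n_val]
    test_idx = indices[n_train + n_val:]
    return train_idx, val_idx, test_idx

# Example: the UCI datasets are split 50%:25%:25% instead of 70%:15%:15%.
train_idx, val_idx, test_idx = random_split(10000, fractions=(0.50, 0.25, 0.25))
```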
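
Likewise, the quoted Experiment Setup row can be read as the following hedged sketch of the image-dataset model: five convolutional layers with 128, 256, 256, 512 and 512 filters, each with batch normalization and ReLU, trained with Adam at a learning rate of 0.001 and a batch size of 512. Kernel sizes, pooling, the input resolution, and the output head are not specified in the excerpt, so those pieces (and the `build_backbone` name) are placeholder assumptions rather than the authors' implementation.

```python
# A minimal sketch of the quoted setup: 5 conv layers (128/256/256/512/512
# filters) with batch norm and ReLU, plus a dense output layer, Adam at
# lr 0.001, batch size 512. Unspecified details below are assumptions.
import tensorflow as tf

def build_backbone(input_shape=(64, 64, 3), num_outputs=1):
    """6-layer network: 5 conv blocks followed by a dense output layer."""
    layers = [tf.keras.Input(shape=input_shape)]  # input resolution assumed
    for filters in [128, 256, 256, 512, 512]:
        layers += [
            tf.keras.layers.Conv2D(filters, kernel_size=3, padding="same"),  # kernel size assumed
            tf.keras.layers.BatchNormalization(),
            tf.keras.layers.ReLU(),
            tf.keras.layers.MaxPooling2D(),  # downsampling choice assumed
        ]
    layers += [
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(num_outputs),  # logits for the classification task
    ]
    return tf.keras.Sequential(layers)

model = build_backbone()
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),  # lr from the quoted setup
    loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),  # cross-entropy baseline
)
# model.fit(train_dataset.batch(512), ...)  # batch size 512 as stated
```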