Implicit rate-constrained optimization of non-decomposable objectives
Authors: Abhishek Kumar, Harikrishna Narasimhan, Andrew Cotter
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on benchmark datasets demonstrate the effectiveness of our proposed method over existing state-of-the-art approaches for these problems. |
| Researcher Affiliation | Industry | Abhishek Kumar (1), Harikrishna Narasimhan (1), Andrew Cotter (1); (1) Google Research. Correspondence to: Abhishek Kumar <abhishk@google.com>. |
| Pseudocode | Yes | Algorithm 1 Implicit Constrained Optimization (ICO) |
| Open Source Code | No | The paper refers to an open-sourced library (TFCO) used as a baseline (https://github.com/google-research/tensorflow_constrained_optimization), but does not state that the code for its proposed method (ICO) is open-sourced or provide a link to it. |
| Open Datasets | Yes | We experiment with the publicly available CelebA (Liu et al., 2015) and BigEarthNet (Sumbul et al., 2019) image datasets. For the Letter dataset... obtained from the UCI repository (Frank & Asuncion, 2010). |
| Dataset Splits | Yes | We use the standard train, validation, and test splits for CelebA... We split the dataset randomly into 70% for training, 15% for validation and 15% for testing... We split the datasets into train, validation and test datasets in the ratios 50%:25%:25%. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'Adam optimizer' and 'TensorFlow Constrained Optimization Library (TFCO)' but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | For our experiments with image datasets, we use a 6-layer neural network with 5 convolutional layers with 128, 256, 256, 512 and 512 filters respectively. We use ReLU activation functions and batch normalization layers in the network. For the cross-entropy baseline, we use the Adam optimizer (Kingma & Ba, 2014) with a learning rate of 0.001... All optimizers use a batch size of 512. (A hedged sketch of this configuration follows the table.) |
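
For orientation, the following is a minimal TensorFlow/Keras sketch of the training configuration quoted in the Experiment Setup row, not the authors' released code. Only the filter counts, ReLU activations, batch normalization, the Adam learning rate of 0.001, and the batch size of 512 come from the paper; the input resolution, kernel sizes, pooling, output head, loss, and names such as `build_image_model`, `train_ds`, and `val_ds` are illustrative assumptions.

```python
import tensorflow as tf

def build_image_model(input_shape=(64, 64, 3), num_outputs=1):
    """6-layer network: 5 conv blocks (128, 256, 256, 512, 512 filters) plus a dense head."""
    inputs = tf.keras.Input(shape=input_shape)          # input resolution is an assumption
    x = inputs
    for filters in (128, 256, 256, 512, 512):
        x = tf.keras.layers.Conv2D(filters, kernel_size=3, padding="same")(x)
        x = tf.keras.layers.BatchNormalization()(x)      # batch normalization, as stated in the paper
        x = tf.keras.layers.ReLU()(x)                    # ReLU activations, as stated in the paper
        x = tf.keras.layers.MaxPooling2D()(x)            # pooling is an assumption
    x = tf.keras.layers.GlobalAveragePooling2D()(x)      # assumed; the paper does not specify the head
    outputs = tf.keras.layers.Dense(num_outputs)(x)      # 6th layer: dense logits head
    model = tf.keras.Model(inputs, outputs)

    # Cross-entropy baseline: Adam with learning rate 0.001, per the quoted setup.
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
        loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
    )
    return model

model = build_image_model()
# model.fit(train_ds.batch(512), validation_data=val_ds.batch(512))  # batch size 512, as stated
```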