Input Convex Neural Networks

Authors: Brandon Amos, Lei Xu, J. Zico Kolter

ICML 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, we highlight the performance of the methods on multi-label prediction, image completion, and reinforcement learning problems, where we show improvement over the existing state of the art in many cases.
Researcher Affiliation | Academia | (1) School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA; (2) Department of Computer Science and Technology, Tsinghua University, Beijing, China.
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks.
Open Source Code | Yes | The full source code for all experiments is available in the icml2017 branch at https://github.com/locuslab/icnn and our implementation is built using Python (Van Rossum & Drake Jr, 1995) with the numpy (Oliphant, 2006) and TensorFlow (Abadi et al., 2016) packages.
Open Datasets | Yes | multi-label classification on the BibTeX dataset (Katakis et al., 2008), image completion using the Olivetti face dataset (Samaria & Harter, 1994), and continuous action reinforcement learning in the OpenAI Gym (Brockman et al., 2016) that use the MuJoCo physics simulator (Todorov et al., 2012).
Dataset Splits | No | The paper specifies training and test splits, for example, "We use the train/test split of 4880/2515 from (Katakis et al., 2008)", but does not explicitly mention a separate validation split or dataset.
Hardware Specification | No | The paper mentions using the MuJoCo physics simulator but does not specify any hardware details (e.g., CPU, GPU models, memory) used for running the experiments.
Software Dependencies | No | The paper states that the implementation uses "Python... with the numpy... and TensorFlow... packages." However, it does not provide specific version numbers for these software dependencies, only citations to their original papers.
Experiment Setup | Yes | We optimize our PICNN with 30 iterations of gradient descent with a learning rate of 0.1 and a momentum of 0.3. (...) We use a learning rate of 0.01 and momentum of 0.9 with gradient descent for the inner optimization in the ICNN. (...) All of our experiments use a PICNN with two fully-connected layers that each have 200 hidden units.
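To make the quoted experiment setup concrete, the sketch below builds a small PICNN-style network with two fully-connected hidden layers of 200 units and runs the inner optimization over the input y by gradient descent with momentum (30 steps, learning rate 0.1, momentum 0.3). This is a minimal illustrative sketch, not the authors' implementation: the paper's code uses TensorFlow, whereas this example uses JAX for automatic differentiation; the architecture is a simplified input-convex form in which x enters only at the first layer and convexity in y is enforced by clamping the z-path weights to be nonnegative; and all parameter names, dimensions, and the random initialization are assumptions made for the example.

```python
import jax
import jax.numpy as jnp

HIDDEN = 200  # hidden units per layer, as quoted in the experiment setup


def init_params(key, x_dim, y_dim):
    """Illustrative random parameters for a small PICNN-style f(x, y)."""
    ks = jax.random.split(key, 6)
    rnd = lambda k, shape: 0.1 * jax.random.normal(k, shape)
    return {
        # x-path (unconstrained weights)
        "Wx1": rnd(ks[0], (x_dim, HIDDEN)),
        # y enters each layer through unconstrained affine maps
        "Wy1": rnd(ks[1], (y_dim, HIDDEN)),
        "Wy2": rnd(ks[2], (y_dim, HIDDEN)),
        "Wyo": rnd(ks[3], (y_dim, 1)),
        # z-path weights: clamped nonnegative in the forward pass so f is convex in y
        "Wz2": rnd(ks[4], (HIDDEN, HIDDEN)),
        "Wzo": rnd(ks[5], (HIDDEN, 1)),
        "b1": jnp.zeros(HIDDEN),
        "b2": jnp.zeros(HIDDEN),
        "bo": jnp.zeros(1),
    }


def picnn(params, x, y):
    """Scalar energy f(x, y); convex in y because the z-path weights are
    nonnegative and ReLU is convex and nondecreasing."""
    u1 = jax.nn.relu(x @ params["Wx1"])                      # context from x
    z1 = jax.nn.relu(y @ params["Wy1"] + u1 + params["b1"])
    z2 = jax.nn.relu(z1 @ jax.nn.relu(params["Wz2"])         # nonnegative weights
                     + y @ params["Wy2"] + params["b2"])
    out = z2 @ jax.nn.relu(params["Wzo"]) + y @ params["Wyo"] + params["bo"]
    return out.squeeze()


def argmin_y(params, x, y0, steps=30, lr=0.1, momentum=0.3):
    """Inner optimization over y: gradient descent with momentum,
    using the hyperparameters quoted above."""
    grad_y = jax.grad(picnn, argnums=2)
    y, v = y0, jnp.zeros_like(y0)
    for _ in range(steps):
        v = momentum * v - lr * grad_y(params, x, y)
        y = y + v
    return y


if __name__ == "__main__":
    params = init_params(jax.random.PRNGKey(0), x_dim=5, y_dim=3)
    x = jnp.ones(5)
    y_star = argmin_y(params, x, y0=jnp.zeros(3))
    print("approximate argmin over y:", y_star)
```

Because the z-path weights are kept nonnegative and the activation is convex and nondecreasing, the objective being minimized in the inner loop is convex in y, which is what makes a plain momentum gradient descent a reasonable inner solver in this setup.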