CRONOS: Enhancing Deep Learning with Scalable GPU Accelerated Convex Neural Networks
Authors: Miria Feng, Zachary Frangella, Mert Pilanci
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate the efficacy of CRONOS and CRONOS-AM through extensive large-scale numerical experiments with GPU acceleration in JAX. Our results show that CRONOS-AM can obtain comparable or better validation accuracy than predominant tuned deep learning optimizers on vision and language tasks with benchmark datasets such as ImageNet and IMDb. |
| Researcher Affiliation | Academia | Miria Feng, Electrical Engineering, Stanford University, miria0@stanford.edu; Zachary Frangella, Management Science & Engineering, Stanford University, zfran@stanford.edu; Mert Pilanci, Electrical Engineering, Stanford University, pilanci@stanford.edu |
| Pseudocode | Yes | Algorithm 1: ADMM for Convex ReLU Networks |
| Open Source Code | Yes | Our codebase is available at https://github.com/pilancilab/CRONOS |
| Open Datasets | Yes | Our results show that CRONOS-AM can obtain comparable or better validation accuracy than predominant tuned deep learning optimizers on vision and language tasks with benchmark datasets such as ImageNet and IMDb. |
| Dataset Splits | No | The paper mentions 'validation accuracy' and plots results on validation sets (e.g., 'Validation Accuracy' in Figure 1). However, it specifies only training and testing splits (e.g., for CIFAR-10: 'The dataset is divided into 50,000 training images and 10,000 test images' and for IMDb: 'It is evenly split into 25,000 reviews for training and 25,000 reviews for testing'), but does not explicitly detail the size or methodology for a separate validation split. |
| Hardware Specification | Yes | All experiments were performed on an RTX-4090 GPU with 24 GB of memory and 100 TFLOPS in JAX functional code. We utilize x86-64 CPU architecture with Ubuntu 22.04 OS. |
| Software Dependencies | Yes | All experiments were run in JAX v0.4.28 and FLAX v0.8.2. |
| Experiment Setup | Yes | CRONOS (including when used as a subproblem solver in CRONOS-AM) is run for 5 ADMM iterations. The number of PCG iterations varies from 5 to 50 depending upon the task. The rank of the Nyström preconditioner varies from r = 10 to r = 20. The value of ρ is varied from 0.001 to 1 depending upon the task. For CRONOS-AM, DAdapted-AdamW is always run for 1 epoch to get the non-convex weights. |
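
The reported experiment setup can be written down as a small, checkable configuration object. The following is a minimal sketch in plain Python. The class name `CronosSetup` and all of its field names are hypothetical, chosen only for illustration; they are not taken from the CRONOS codebase. The bounds come from the Experiment Setup row above. Because the paper gives only ranges for the task-dependent values, the sketch requires the caller to supply them rather than guessing defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CronosSetup:
    """Hypothetical record of the reported settings (not the authors' API)."""

    pcg_iters: int                # reported range: 5 to 50, task dependent
    nystrom_rank: int             # reported range: r = 10 to r = 20
    rho: float                    # reported range: 0.001 to 1, task dependent
    admm_iters: int = 5           # reported: CRONOS is run for 5 ADMM iterations
    dadapt_adamw_epochs: int = 1  # CRONOS-AM only: 1 epoch for non-convex weights

    def __post_init__(self):
        # Reject values that fall outside what the setup description states.
        if self.admm_iters != 5:
            raise ValueError("reported setup uses exactly 5 ADMM iterations")
        if not 5 <= self.pcg_iters <= 50:
            raise ValueError("pcg_iters must lie in [5, 50]")
        if not 10 <= self.nystrom_rank <= 20:
            raise ValueError("nystrom_rank must lie in [10, 20]")
        if not 0.001 <= self.rho <= 1.0:
            raise ValueError("rho must lie in [0.001, 1]")
        if self.dadapt_adamw_epochs != 1:
            raise ValueError("reported setup uses exactly 1 DAdapted-AdamW epoch")


# Example values picked from inside the reported ranges (an assumption;
# the per-task values are not listed in the summary above).
example = CronosSetup(pcg_iters=20, nystrom_rank=15, rho=0.1)
```

A check like this only guards against straying outside the reported ranges when attempting a reproduction; the per-task choices within those ranges would still need to be taken from the paper or the linked repository.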