Efficient Activation Function Optimization through Surrogate Modeling

Authors: Garrett Bingham, Risto Miikkulainen

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "First, the benchmark datasets Act-Bench-CNN, Act-Bench-ResNet, and Act-Bench-ViT were created by training convolutional, residual, and vision transformer architectures from scratch with 2,913 systematically generated activation functions." and "In the third step, this surrogate was evaluated experimentally, first by verifying that it can discover known good functions in the benchmark datasets efficiently and reliably, and second by demonstrating that it can discover improved activation functions in new tasks involving different datasets, search spaces, and architectures."
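
For context, a minimal sketch of the benchmark-creation loop described in the excerpt above: train the same architecture once per candidate activation function and record its accuracy. The helpers and the three-function list below are illustrative stubs, not the authors' code; the real pipeline covers 2,913 generated functions.

```python
# Illustrative stubs only; the authors' actual training code lives in their repositories.
def build_model(architecture, activation):
    """Stub: would construct the given architecture with the given activation."""
    return (architecture, activation)

def train_and_evaluate(model):
    """Stub: would train the model from scratch and return validation accuracy."""
    return 0.0

candidates = ["relu", "swish", "elu"]        # stand-ins for the 2,913 generated functions
benchmark = {}
for arch in ["CNN", "ResNet", "ViT"]:        # Act-Bench-CNN / -ResNet / -ViT
    for fn in candidates:
        benchmark[(arch, fn)] = train_and_evaluate(build_model(arch, fn))
```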
Researcher Affiliation | Collaboration | "Garrett Bingham, The University of Texas at Austin and Cognizant AI Labs, San Francisco, CA 94105, garrett@gjb.ai" and "Risto Miikkulainen, The University of Texas at Austin and Cognizant AI Labs, San Francisco, CA 94105, risto@cs.utexas.edu"
Pseudocode | No | No pseudocode or algorithm blocks are provided in the paper.
Open Source Code | Yes | AQuaSurF code is available at https://github.com/cognizant-ai-labs/aquasurf
Open Datasets | Yes | "First, the benchmark datasets Act-Bench-CNN, Act-Bench-ResNet, and Act-Bench-ViT were created by training convolutional, residual, and vision transformer architectures from scratch with 2,913 systematically generated activation functions." and "The benchmark collections are made available at https://github.com/cognizant-ai-labs/act-bench" and "All-CNN-C on CIFAR-10, ResNet-56 on CIFAR-10, and MobileViTv2-0.5 on Imagenette [22, 24, 31, 41, 51]."
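
A quick way to explore the released benchmarks might look like the sketch below. The file name and column names are hypothetical; consult the act-bench repository for the actual data layout.

```python
# Hypothetical file and column names; see https://github.com/cognizant-ai-labs/act-bench
# for the real layout of the released benchmark data.
import pandas as pd

bench = pd.read_csv("act_bench_cnn.csv")                       # hypothetical filename
top = bench.sort_values("accuracy", ascending=False).head(10)  # hypothetical 'accuracy' column
print(top[["function", "accuracy"]])                           # hypothetical 'function' column
```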
Dataset Splits | Yes | "For CIFAR-10 and CIFAR-100, balanced validation sets were created by sampling 5,000 images from the training set." and "Full training details and hyperparameters are listed in Tables 5 and 6."
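
The balanced validation split quoted above (5,000 images sampled from the training set) can be reproduced with class-stratified sampling. A minimal sketch, assuming 500 images per CIFAR-10 class:

```python
import numpy as np

def balanced_val_indices(labels, per_class=500, seed=0):
    """Sample per_class indices from each class (10 x 500 = 5,000 for CIFAR-10)."""
    rng = np.random.default_rng(seed)
    picks = [rng.choice(np.flatnonzero(labels == c), size=per_class, replace=False)
             for c in np.unique(labels)]
    return np.concatenate(picks)

# Usage with placeholder CIFAR-10 training labels (50,000 images, 5,000 per class):
labels = np.repeat(np.arange(10), 5000)
val_idx = balanced_val_indices(labels)
assert len(val_idx) == 5000
```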
Hardware Specification | Yes | "The experiments in this paper were implemented using an AWS g5.48xlarge instance with eight NVIDIA A10G GPUs."
Software Dependencies | No | "The algorithms were used out of the box with default hyperparameters from the scikit-learn package [47]."
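
The excerpt confirms default scikit-learn hyperparameters but does not name the estimators. In the sketch below, KNeighborsRegressor is an assumed stand-in, used only to show what "out of the box" usage of a performance surrogate looks like.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor  # assumed estimator, for illustration

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))           # placeholder features for 100 activation functions
y = rng.uniform(0.80, 0.95, size=100)   # placeholder benchmark accuracies

surrogate = KNeighborsRegressor()       # out of the box: default hyperparameters, as quoted
surrogate.fit(X, y)
print(surrogate.predict(X[:3]))         # predicted accuracies for the first three functions
```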
Experiment Setup | Yes | "Full training details and hyperparameters are listed in Tables 5 and 6."