Sampling weights of deep neural networks

Authors: Erik L Bolager, Iryna Burak, Chinmay Datar, Qing Sun, Felix Dietrich

NeurIPS 2023

Reproducibility assessment. Each entry gives the variable, the assessed result, and the supporting excerpt from the paper (the LLM response).
Research Type: Experimental
"In numerical experiments, we demonstrate that sampled networks achieve accuracy comparable to iteratively trained ones, but can be constructed orders of magnitude faster. Our test cases involve a classification benchmark from OpenML, sampling of neural operators to represent maps in function spaces, and transfer learning using well-known architectures."

Researcher Affiliation: Academia
"Erik Lien Bolager, Iryna Burak, Chinmay Datar, Qing Sun, Felix Dietrich (Technical University of Munich, School of Computation, Information and Technology; Institute for Advanced Study)"

Pseudocode: Yes
"Algorithm 1: The SWIM algorithm, for activation function ϕ, and norms on the input, the output of the hidden layers, and the output space, ∥·∥_{X_0}, ∥·∥_{X_l}, and ∥·∥_Y respectively. Also, L is a loss function, which in our case is always the L2 loss, and arg min L(·, ·) becomes a linear optimization problem."

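The caption describes the core idea: hidden weights are constructed from pairs of training points, and the final layer is the argmin of an L2 loss, i.e. a linear least-squares problem. Below is a minimal numpy sketch of that idea; the uniform pair sampling, the scale constants `s1` and `s2`, and the function names `sample_dense_layer` / `fit_sampled_network` are simplifying assumptions for illustration, not the authors' implementation (see the linked repositories for that).

```python
import numpy as np

def sample_dense_layer(X, width, s1=1.0, s2=0.0, rng=None):
    """Construct one hidden layer from pairs of data points.

    X:      (n_samples, n_features) inputs to this layer.
    width:  number of neurons to construct.
    s1, s2: activation-dependent scale/shift constants (placeholders here;
            the paper derives specific values per activation function).
    """
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    # Draw pairs of distinct data points uniformly (the paper also discusses
    # a data-dependent sampling distribution over pairs).
    i = rng.integers(0, n, size=width)
    j = rng.integers(0, n, size=width)
    j = np.where(i == j, (j + 1) % n, j)              # avoid identical pairs
    diff = X[j] - X[i]                                # (width, n_features)
    norms = np.maximum(np.sum(diff**2, axis=1), 1e-12)
    W = s1 * diff / norms[:, None]                    # weight of each neuron
    b = -np.einsum("ij,ij->i", W, X[i]) - s2          # pre-activation at X[i] is -s2
    return W, b

def fit_sampled_network(X, Y, widths, activation=np.tanh, rng=None):
    """Sample all hidden layers, then solve the last layer by least squares."""
    H = X
    layers = []
    for width in widths:
        W, b = sample_dense_layer(H, width, rng=rng)
        layers.append((W, b))
        H = activation(H @ W.T + b)
    # Linear read-out: arg min of the L2 loss is a least-squares problem.
    H1 = np.hstack([H, np.ones((H.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(H1, Y, rcond=None)
    return layers, coef

# Toy usage on synthetic data.
X = np.random.rand(200, 2)
Y = np.sin(3.0 * X[:, :1])
layers, coef = fit_sampled_network(X, Y, widths=[64, 64])
```
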
Open Source Code: Yes
"The code to reproduce the experiments from the paper, and an up-to-date code base, can be found at https://gitlab.com/felix.dietrich/swimnetworks-paper, https://gitlab.com/felix.dietrich/swimnetworks."

Open Datasets: Yes
"We use the OpenML-CC18 Curated Classification benchmark [4] with all its 72 tasks to compare our sampling method to the Adam optimizer [36]. ... https://openml.org/search?type=benchmark&sort=tasks_included&study_type=task&id=99. We choose the CIFAR-10 dataset [39], with 50000 training and 10000 test images."

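The id=99 in the linked URL is the OpenML suite id of the CC18 benchmark. As a hedged sketch of how the 72 tasks can be retrieved with the openml Python package (the excerpt does not state which tooling the authors used, so this is an assumption about tooling only):

```python
import openml

# OpenML-CC18 is published as benchmark suite id 99 (matching id=99 in the URL).
suite = openml.study.get_suite(99)
print(len(suite.tasks), "tasks in the suite")   # 72 classification tasks

# Load the data behind one task; the authors' own loading pipeline may differ.
task = openml.tasks.get_task(suite.tasks[0])
X, y = task.get_X_and_y()
print(X.shape, y.shape)
```
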
Dataset Splits: Yes
"We generate 15000 pairs of (u(x, 0), u(x, 1)), and split them into the train (60%), validation (20%), and test sets (20%)."

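The quoted 60/20/20 split can be reproduced, for example, with scikit-learn by splitting twice; the arrays below are placeholders standing in for the 15000 snapshot pairs, and the use of scikit-learn is an assumption, not a statement about the authors' code.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data for the 15000 pairs (u(x, 0), u(x, 1)) on a hypothetical grid.
n_pairs, n_grid = 15000, 128
u0 = np.random.rand(n_pairs, n_grid)
u1 = np.random.rand(n_pairs, n_grid)

# 60% train, then split the remaining 40% evenly into validation and test.
u0_train, u0_rest, u1_train, u1_rest = train_test_split(u0, u1, train_size=0.6, random_state=0)
u0_val, u0_test, u1_val, u1_test = train_test_split(u0_rest, u1_rest, test_size=0.5, random_state=0)
print(len(u0_train), len(u0_val), len(u0_test))   # 9000 3000 3000
```
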
Hardware Specification: Yes
"Our implementation is based on the numpy and scipy Python libraries, and we run all experiments on a machine with 32GB system RAM (256GB in Section 4.3 and Section 4.4) and a 4x GeForce RTX 3080 Turbo GPU with 10GB RAM. The Adam training was done on the 4x GeForce RTX 3080 Turbo GPU with 10GB RAM, while sampling was performed on the Intel Core i7-7700 CPU @ 3.60GHz × 8 with 32GB RAM. ... sampling was performed on the 1x AMD EPYC 7402 @ 2.80GHz × 8 with 256GB RAM."

Software Dependencies: No
"Our implementation is based on the numpy and scipy Python libraries... We use the sacred package... We use the deepxde package... We use the neuralop package... We use the PyTorch framework... We use the TensorFlow framework... Python 3.8 is required to run the computational experiments and install the software."

Experiment Setup: Yes
"Table 3: Network hyperparameters used in the OpenML benchmark study. Table 4: Network hyperparameters used to train deep neural operators. We use a learning rate of 10^-3, batch size of 32, and train for 20 epochs with early-stopping patience of 10 epochs. We store the parameters that yield the lowest loss on test data."

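A hedged PyTorch sketch of the quoted iterative-training baseline follows: Adam with learning rate 10^-3, batch size 32, 20 epochs, early-stopping patience of 10 epochs, keeping the best parameters. The model architecture, the synthetic data, and the monitored split are placeholders, not values from the paper.

```python
import copy
import torch

# Placeholder model and data; only the optimizer/schedule hyperparameters
# (lr 1e-3, batch size 32, 20 epochs, patience 10) come from the excerpt above.
model = torch.nn.Sequential(torch.nn.Linear(128, 256), torch.nn.Tanh(), torch.nn.Linear(256, 128))
train_loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(torch.randn(600, 128), torch.randn(600, 128)),
    batch_size=32, shuffle=True)
val_loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(torch.randn(200, 128), torch.randn(200, 128)),
    batch_size=32)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = torch.nn.MSELoss()

best_loss, best_state, patience, bad_epochs = float("inf"), None, 10, 0
for epoch in range(20):
    model.train()
    for xb, yb in train_loader:
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()

    model.eval()
    with torch.no_grad():
        val_loss = sum(loss_fn(model(xb), yb).item() for xb, yb in val_loader) / len(val_loader)

    if val_loss < best_loss:
        # Keep the parameters with the lowest monitored loss so far.
        best_loss, best_state, bad_epochs = val_loss, copy.deepcopy(model.state_dict()), 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:   # early stopping
            break

model.load_state_dict(best_state)
```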