Sampling weights of deep neural networks
Authors: Erik L Bolager, Iryna Burak, Chinmay Datar, Qing Sun, Felix Dietrich
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In numerical experiments, we demonstrate that sampled networks achieve accuracy comparable to iteratively trained ones, but can be constructed orders of magnitude faster. Our test cases involve a classification benchmark from OpenML, sampling of neural operators to represent maps in function spaces, and transfer learning using well-known architectures. |
| Researcher Affiliation | Academia | Erik Lien Bolager, Iryna Burak, Chinmay Datar, Qing Sun, Felix Dietrich (Technical University of Munich, School of Computation, Information and Technology; Institute for Advanced Study) |
| Pseudocode | Yes | Algorithm 1: The SWIM algorithm, for activation function ϕ, and norms on input, output of the hidden layers, and output space, X0, Xl, and Y respectively. Also, L is a loss function, which in our case is always L2 loss, and arg min L(·, ·) becomes a linear optimization problem. (A hedged sketch of this construction follows the table.) |
| Open Source Code | Yes | The code to reproduce the experiments from the paper, and an up-to-date code base, can be found at https://gitlab.com/felix.dietrich/swimnetworks-paper, https://gitlab.com/felix.dietrich/swimnetworks. |
| Open Datasets | Yes | We use the OpenML-CC18 Curated Classification benchmark [4] with all its 72 tasks to compare our sampling method to the Adam optimizer [36]. ... https://openml.org/search?type=benchmark&sort=tasks_included&study_type=task&id=99. We choose the CIFAR-10 dataset [39], with 50000 training and 10000 test images. (See the loading sketch after the table.) |
| Dataset Splits | Yes | We generate 15000 pairs of (u(x, 0), u(x, 1)), and split them into the train (60%), validation (20%), and test sets (20%). (Split sketch after the table.) |
| Hardware Specification | Yes | Our implementation is based on the numpy and scipy Python libraries, and we run all experiments on a machine with 32GB system RAM (256GB in Section 4.3 and Section 4.4) and a GeForce 4x RTX 3080 Turbo GPU with 10GB RAM. The Adam training was done on the GeForce 4x RTX 3080 Turbo GPU with 10GB RAM, while sampling was performed on the Intel Core i7-7700 CPU @ 3.60GHz × 8 with 32GB RAM. ... sampling was performed on the 1x AMD EPYC 7402 @ 2.80GHz × 8 with 256GB RAM. |
| Software Dependencies | No | Our implementation is based on the numpy and scipy Python libraries... We use the sacred package... We use the deepxde package... We use the neuralop package... We use PyTorch framework... We use the TensorFlow framework... Python 3.8 is required to run the computational experiments and install the software. |
| Experiment Setup | Yes | Table 3: Network hyperparameters used in the Open ML benchmark study. Table 4: Network hyperparameters used to train deep neural operators. We use a learning rate of 10^-3, batch size of 32, and train for 20 epochs with early-stopping patience of 10 epochs. We store the parameters that yield the lowest loss on test data. (Training-loop sketch after the table.) |
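
The Pseudocode row quotes Algorithm 1 (SWIM), which samples hidden weights from pairs of data points and solves the last layer in closed form. Below is a minimal numpy sketch of that idea; the function names, the uniform choice of pairs, and the scale constants `s1`/`s2` are illustrative assumptions (the paper uses a data-dependent pair distribution and activation-specific constants), while the least-squares read-out mirrors the quoted "arg min L(·, ·) becomes a linear optimization problem".

```python
import numpy as np

def sample_swim_layer(X, width, s1=1.0, s2=0.0, rng=None):
    """Sketch of the pair-based weight construction: each neuron is built from
    two data points. s1, s2 are placeholder scale constants, not the paper's
    exact activation-dependent values."""
    rng = np.random.default_rng() if rng is None else rng
    n = X.shape[0]
    # draw a pair of input points for every neuron (uniformly here, as an assumption)
    i = rng.integers(0, n, size=width)
    j = rng.integers(0, n, size=width)
    diff = X[j] - X[i]                                               # (width, d)
    norms = np.maximum(np.linalg.norm(diff, axis=1, keepdims=True) ** 2, 1e-12)
    W = s1 * diff / norms                                            # weights point from x_i to x_j
    b = -np.sum(W * X[i], axis=1) - s2                               # activation transition lies between the pair
    return W, b

def fit_sampled_network(X, Y, widths, activation=np.tanh):
    """Stack sampled hidden layers, then fit the linear read-out by least squares
    (the L2 arg-min of Algorithm 1)."""
    H, layers = X, []
    for width in widths:
        W, b = sample_swim_layer(H, width)
        H = activation(H @ W.T + b)
        layers.append((W, b))
    H_aug = np.hstack([H, np.ones((H.shape[0], 1))])                 # append bias column
    coef, *_ = np.linalg.lstsq(H_aug, Y, rcond=None)                 # closed-form linear solve
    return layers, coef
```

Because no gradient steps are involved, constructing such a network reduces to a few matrix products and one least-squares solve, which is the source of the speed-up claimed in the Research Type row.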
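The Open Datasets row points to the OpenML-CC18 suite (study id 99 in the quoted URL). A hedged sketch of iterating over its 72 tasks with the `openml` Python client follows; the client itself is an assumption, since the quote only gives the benchmark URL.

```python
import openml

# OpenML-CC18 is benchmark suite 99 (matching the study URL quoted above).
suite = openml.study.get_suite(99)
for task_id in suite.tasks:            # 72 curated classification tasks
    task = openml.tasks.get_task(task_id)
    X, y = task.get_X_and_y()          # features and labels for one task
    # ... fit a sampled network and an Adam-trained baseline on (X, y) ...
```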
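For the Dataset Splits row, a minimal sketch of the 60%/20%/20% split over the 15000 (u(x, 0), u(x, 1)) pairs; the shuffling and the fixed seed are assumptions, not stated in the quote.

```python
import numpy as np

def split_indices(n=15000, seed=0):
    """Return train/validation/test index arrays in a 60/20/20 ratio."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    n_train, n_val = int(0.6 * n), int(0.2 * n)
    return perm[:n_train], perm[n_train:n_train + n_val], perm[n_train + n_val:]
```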
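The Experiment Setup row fixes the optimizer schedule for the iteratively trained baselines. Below is a hedged PyTorch sketch using those numbers (Adam, learning rate 10^-3, 20 epochs, early-stopping patience 10, keeping the best parameters); the model, the data loaders (where batch size 32 would be set), and the use of a held-out loader for the stopping criterion are assumptions around the quoted hyperparameters.

```python
import copy
import torch

def train_with_early_stopping(model, train_loader, eval_loader, loss_fn,
                              epochs=20, lr=1e-3, patience=10):
    """Adam training loop with early stopping and best-parameter checkpointing,
    matching the hyperparameters quoted in the Experiment Setup row."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    best_loss, best_state, stale = float("inf"), None, 0
    for epoch in range(epochs):
        model.train()
        for xb, yb in train_loader:
            opt.zero_grad()
            loss = loss_fn(model(xb), yb)
            loss.backward()
            opt.step()
        model.eval()
        with torch.no_grad():
            eval_loss = sum(loss_fn(model(xb), yb).item() for xb, yb in eval_loader)
        if eval_loss < best_loss:                      # store the best parameters seen so far
            best_loss = eval_loss
            best_state = copy.deepcopy(model.state_dict())
            stale = 0
        else:
            stale += 1
            if stale >= patience:                      # early stopping after 10 stale epochs
                break
    if best_state is not None:
        model.load_state_dict(best_state)
    return model
```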