Set-based Neural Network Encoding Without Weight Tying
Authors: Bruno Andreis, Bedionita Soro, Philip Torr, Sung Ju Hwang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present experimental results on INRs, and the standard CNN benchmark model zoos used in Unterthiner et al. [2020], Zhou et al. [2023a], Zhou et al. [2023b], and Navon et al. [2023]. Experimental settings, hyperparameters, model specifications, an ablation of SNE, and a discussion of applying SNE to architectures with branches (e.g., ResNets) are provided in Appendix D. |
| Researcher Affiliation | Collaboration | Bruno Andreis (1), Soro Bedionita (1), Philip H.S. Torr (2), Sung Ju Hwang (1,3); (1) KAIST, South Korea; (2) University of Oxford, United Kingdom; (3) DeepAuto.ai, South Korea |
| Pseudocode | No | The paper does not contain any explicit pseudocode or algorithm blocks labeled as such. |
| Open Source Code | No | Additionally, a reference implementation will be made publicly available. |
| Open Datasets | Yes | We utilize the model zoo of Navon et al. [2023] consisting of INRs [Sitzmann et al., 2020]... We utilize model zoos trained on the MNIST, CIFAR10, and SVHN datasets. We generate a model zoo for these datasets with an architecture consisting of 3 convolutional layers followed by two linear layers and term the resulting model zoo Arch1. Exact architectural specifications are detailed in Appendix G. We generate the model zoos of Arch2 following the routine described in Appendix A.2 of Unterthiner et al. [2020]. We refer to the model zoos of Unterthiner et al. [2020] as Arch2. All model zoos of Arch1 are used for training and those of Arch2 are used for testing and are not seen during training. |
| Dataset Splits | Yes | Each model zoo is split into training, testing, and validation splits. ... Dataset splits for the model zoos of Arch1 are given in Table 13. |
| Hardware Specification | Yes | All experiments are performed with a single GeForce GTX 1080 Ti GPU with 11GB of memory. |
| Software Dependencies | Yes | SNE is implemented using PyTorch [Paszke et al., 2019]. |
| Experiment Setup | Yes | We elaborate all the hyperparameters used for all experiments in Table 14. ... LR: 1e-4; Optimizer: Adam; Scheduler: MultiStep; Batch size: 64; Epochs: 300; Metric: Binary Cross Entropy; SAB hidden size: 64; PMA seed size: 64; # SAB blocks: 2; Chunk size: 32; SAB LayerNorm: False. (A hedged sketch wiring these values together follows the table.) |
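To make the reported configuration concrete, below is a minimal PyTorch sketch that wires the Table 14 hyperparameters together: two SAB blocks with hidden size 64 and no LayerNorm, PMA pooling, chunked weight inputs of size 32, and Adam at LR 1e-4 with a MultiStep schedule and a binary cross-entropy objective. The attention head count, scheduler milestones and gamma, chunk embedding, prediction head, set length, and the synthetic data are assumptions for illustration only and are not taken from the paper.

```python
# Hedged sketch of the reported setup (Table 14). Anything not named in the
# excerpt -- heads, milestones, embedding, head, data -- is an assumption.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

class SAB(nn.Module):
    """Set Attention Block: self-attention + feed-forward with residuals.
    LayerNorm is omitted to match 'SAB LayerNorm: False' in the table."""
    def __init__(self, dim, heads=4):  # heads=4 is an assumption
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x):
        h = x + self.attn(x, x, x, need_weights=False)[0]
        return h + self.ff(h)

class PMA(nn.Module):
    """Pooling by Multihead Attention: learned seed vectors attend over the set."""
    def __init__(self, dim, num_seeds=1, heads=4):
        super().__init__()
        self.seeds = nn.Parameter(torch.randn(1, num_seeds, dim))
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):
        s = self.seeds.expand(x.size(0), -1, -1)
        return self.attn(s, x, x, need_weights=False)[0]

dim = 64    # SAB hidden size 64 (also read here as the PMA seed size)
chunk = 32  # 'Chunk size: 32': flattened weights split into chunks of 32

encoder = nn.Sequential(
    nn.Linear(chunk, dim),  # embed each weight chunk into the hidden space
    SAB(dim), SAB(dim),     # '# SAB blocks: 2'
    PMA(dim),               # pool the set into a fixed-size code
    nn.Flatten(),
    nn.Linear(dim, 1),      # binary prediction head (assumed)
)

# Synthetic model zoo: 512 networks, each flattened into a set of 100 chunks.
data = TensorDataset(torch.randn(512, 100, chunk),
                     torch.randint(0, 2, (512, 1)).float())
loader = DataLoader(data, batch_size=64, shuffle=True)       # Batch size 64

optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-4)  # Adam, LR 1e-4
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[150, 250], gamma=0.1)             # milestones assumed
criterion = nn.BCEWithLogitsLoss()                           # binary cross-entropy

for epoch in range(300):                                     # Epochs 300
    for chunks, target in loader:
        optimizer.zero_grad()
        loss = criterion(encoder(chunks), target)
        loss.backward()
        optimizer.step()
    scheduler.step()
```

`BCEWithLogitsLoss` stands in for a separate sigmoid plus `BCELoss` for numerical stability; the excerpt only names "Binary Cross Entropy", so the exact variant, like the rest of the scaffolding, is an assumption rather than the authors' implementation.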