NeRN: Learning Neural Representations for Neural Networks

Authors: Maor Ashkenazi, Zohar Rimon, Ron Vainshtein, Shir Levi, Elad Richardson, Pinchas Mintz, Eran Treister

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section we evaluate our proposed method on three standard vision classification benchmarks CIFAR-10, CIFAR-100 (Krizhevsky et al., 2009) and ImageNet (Deng et al., 2009). For all benchmarks, we use NeRN to predict the weights of the ResNet (He et al., 2015a) architectures."
Researcher Affiliation | Collaboration | Maor Ashkenazi (1), Zohar Rimon (2), Ron Vainshtein (2), Shir Levi (3), Elad Richardson (3), Pinchas Mintz (3), Eran Treister (1); (1) Ben-Gurion University of the Negev, (2) Technion - Israel Institute of Technology, (3) Penta-AI
Pseudocode | No | The paper describes the NeRN pipeline and training details in text and with a diagram (Figure 1), but does not provide any pseudocode or algorithm blocks; a hedged sketch of the described pipeline appears after this table.
Open Source Code | Yes | "Email: maorash@post.bgu.ac.il. Code available at: https://github.com/maorash/NeRN."
Open Datasets | Yes | "In this section we evaluate our proposed method on three standard vision classification benchmarks CIFAR-10, CIFAR-100 (Krizhevsky et al., 2009) and ImageNet (Deng et al., 2009)."
Dataset Splits | No | The paper uses standard benchmarks (CIFAR-10, CIFAR-100, ImageNet) that ship with predefined train/test splits, but it does not explicitly state the split percentages or sample counts used; see the loading sketch after this table.
Hardware Specification | Yes | "We run our experiments using PyTorch on a single Nvidia RTX 3090."
Software Dependencies | No | The paper mentions using PyTorch and the Ranger optimizer, but it does not specify version numbers for these or any other software dependencies.
Experiment Setup | Yes | "We adopt the Ranger (Wright, 2019) optimizer, using a learning rate of 5·10^-3 and a cosine learning rate decay. The input coordinates are mapped to positional embeddings of size 240. ... We train NeRN for 70k iterations, using a task input batch size of 256. In addition, a batch of 2^12 reconstructed weights for the gradient computation is sampled in each iteration." (A configuration sketch follows the table.)
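Since the paper provides no pseudocode, the following is a minimal, hedged sketch of the coordinate-to-weights pipeline it describes in text: a small MLP maps a positional embedding of a (layer, filter, channel) coordinate to one 3x3 convolution kernel of the original ResNet. The names (`positional_embedding`, `WeightPredictor`), the MLP depth and width, and the NeRF-style embedding construction are all illustrative assumptions, not the authors' code (which is available at the linked repository).

```python
# Sketch of the NeRN idea: predict a 3x3 kernel from a coordinate embedding.
# All names and dimensions below are assumptions for illustration.
import torch
import torch.nn as nn

def positional_embedding(coords: torch.Tensor, dim: int = 240) -> torch.Tensor:
    """Map integer (layer, filter, channel) coordinates to sin/cos embeddings.

    coords: (batch, 3) tensor. Returns a (batch, dim) float tensor, where
    dim = 3 coordinates * 2 functions (sin, cos) * number of frequencies.
    """
    coords = coords.float()
    n_freqs = dim // 6  # 240 // 6 = 40 frequencies
    parts = []
    for i in range(n_freqs):
        parts.append(torch.sin(coords * (2.0 ** i)))
        parts.append(torch.cos(coords * (2.0 ** i)))
    return torch.cat(parts, dim=-1)

class WeightPredictor(nn.Module):
    """Small MLP predicting a flattened 3x3 kernel from a coordinate embedding."""
    def __init__(self, embed_dim: int = 240, hidden: int = 256, kernel: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, kernel * kernel),
        )

    def forward(self, emb: torch.Tensor) -> torch.Tensor:
        return self.net(emb)
```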
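On datasets and splits: all three benchmarks come with standard predefined train/test partitions, which is presumably what the paper relies on. A minimal loading sketch for CIFAR-10 using torchvision's standard split follows; the transform is a placeholder assumption, as the paper does not specify preprocessing here.

```python
# Loading CIFAR-10 with its predefined train/test split via torchvision.
import torchvision
import torchvision.transforms as T

transform = T.ToTensor()  # placeholder; the paper does not specify transforms
train_set = torchvision.datasets.CIFAR10(root="data", train=True, download=True, transform=transform)
test_set = torchvision.datasets.CIFAR10(root="data", train=False, download=True, transform=transform)
print(len(train_set), len(test_set))  # 50000, 10000 -- the standard split
```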
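The reported experiment setup can be read as the skeleton of a training loop. In this sketch, Adam stands in for Ranger (a third-party optimizer, not part of core PyTorch), and the random coordinates, random target kernels, and plain reconstruction loss are placeholders; the paper also trains with distillation-style losses over a task input batch of 256 images, which is elided here.

```python
# Skeleton of the reported setup: lr 5e-3 with cosine decay, 70k iterations,
# 2^12 reconstructed weights sampled per iteration. Adam is a stand-in for
# Ranger; the targets and loss below are placeholders for illustration only.
import torch
import torch.nn.functional as F

model = WeightPredictor()  # from the sketch above
optimizer = torch.optim.Adam(model.parameters(), lr=5e-3)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=70_000)

NUM_ITERS = 70_000
WEIGHT_BATCH = 2 ** 12  # reconstructed kernels sampled each iteration
TASK_BATCH = 256        # image batch for the distillation terms (not shown)

for step in range(NUM_ITERS):
    coords = torch.randint(0, 64, (WEIGHT_BATCH, 3))  # random kernel coordinates
    targets = torch.randn(WEIGHT_BATCH, 9)            # placeholder target kernels
    preds = model(positional_embedding(coords))
    loss = F.mse_loss(preds, targets)  # reconstruction term only
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
```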