Hyper-Representations as Generative Models: Sampling Unseen Neural Network Weights

Authors: Konstantin Schürholt, Boris Knyazev, Xavier Giró-i-Nieto, Damian Borth

NeurIPS 2022

Reproducibility variables, results, and supporting LLM responses:
Research Type: Experimental
"The models generated using our methods are diverse, performant and capable to outperform strong baselines as evaluated on several downstream tasks: initialization, ensemble sampling and transfer learning. Our results indicate the potential of knowledge aggregation from model zoos to new models via hyper-representations thereby paving the avenue for novel research directions."
Researcher Affiliation: Collaboration
Konstantin Schürholt (konstantin.schuerholt@unisg.ch), AIML Lab, School of Computer Science, University of St.Gallen
Boris Knyazev (b.knyazev@samsung.com), Samsung SAIT AI Lab, Montreal
Xavier Giró-i-Nieto (xavier.giro@upc.edu), Institut de Robòtica i Informàtica Industrial, Universitat Politècnica de Catalunya
Damian Borth (damian.borth@unisg.ch), AIML Lab, School of Computer Science, University of St.Gallen
Pseudocode: No
The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code: Yes
"The hyper-representations and code to reproduce our results are available at https://github.com/HSG-AIML/NeurIPS_2022-Generative_Hyper_Representations."
Open Datasets: Yes
"We train and evaluate our approaches on four image classification datasets: MNIST [24], SVHN [32], CIFAR-10 [23], STL-10 [5]. ... We have trained new model zoos and will make them publicly available via an URL."
Dataset Splits: Yes
"Each zoo is split in the train (70%), validation (15%) and test (15%) splits."
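The quoted 70/15/15 zoo split can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code; the function name `split_zoo` and the use of a seeded shuffle are assumptions made here for the example.

```python
import random

def split_zoo(models, train=0.70, val=0.15, seed=42):
    """Shuffle a model zoo and split it into train/validation/test subsets.

    Hypothetical sketch of the 70%/15%/15% split described in the paper;
    `models` stands in for a list of checkpoints in a model zoo.
    """
    models = list(models)
    random.Random(seed).shuffle(models)  # deterministic shuffle for reproducibility
    n_train = int(len(models) * train)
    n_val = int(len(models) * val)
    return (models[:n_train],                       # 70% train
            models[n_train:n_train + n_val],        # 15% validation
            models[n_train + n_val:])               # remaining 15% test

# Example: a zoo of 1000 checkpoints splits into 700 / 150 / 150.
train_set, val_set, test_set = split_zoo(range(1000))
```

Splitting at the level of whole models (rather than individual weights) keeps every checkpoint's parameters in exactly one subset.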
Hardware Specification: No
The paper states that "Full details on training, including infrastructure and compute is detailed in the Appendix B.", and its checklist answers "[Yes] See Appendix B." to the question about the total compute and type of resources used. The provided text of the paper itself, however, contains no specific hardware details such as GPU or CPU models.
Software Dependencies: No
The provided text does not state specific software dependencies with version numbers. The paper points to Appendix B for full training details, which might contain this information, but it is absent from the main text.
Experiment Setup: No
The paper notes that "All the details are specified in the Appendix A and B" and that "all methods share training hyperparameters. Fine-tuning uses the hyperparameters of the target domain." Concrete hyperparameter values and system-level training settings are not provided in the main text.