Graph Structure of Neural Networks
Authors: Jiaxuan You, Jure Leskovec, Kaiming He, Saining Xie
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Here we systematically investigate how the graph structure of neural networks affects their predictive performance. Using standard image classification datasets CIFAR-10 and ImageNet, we conduct a systematic study on how the architecture of neural networks affects their predictive performance. We make several important empirical observations: a sweet spot of relational graphs leads to neural networks with significantly improved performance; |
| Researcher Affiliation | Collaboration | Jiaxuan You¹, Jure Leskovec¹, Kaiming He², Saining Xie². ¹Department of Computer Science, Stanford University; ²Facebook AI Research. Correspondence to: Jiaxuan You <jiaxuan@cs.stanford.edu>, Saining Xie <s9xie@fb.com>. |
| Pseudocode | No | The paper describes methods mathematically and textually but does not include any labeled "Pseudocode" or "Algorithm" blocks. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code or a link to a code repository for the methodology described. |
| Open Datasets | Yes | Using standard image classification datasets CIFAR-10 and ImageNet... CIFAR-10 dataset (Krizhevsky, 2009)... ImageNet classification (Russakovsky et al., 2015)... |
| Dataset Splits | Yes | CIFAR-10 dataset (Krizhevsky, 2009), which has 50K training images and 10K validation images... For ImageNet experiments... 1.28M training images and 50K validation images. (See the dataset-loading sketch after the table.) |
| Hardware Specification | Yes | Training an MLP model roughly takes 5 minutes on an NVIDIA Tesla V100 GPU, and training a ResNet model on ImageNet roughly takes a day on 8 Tesla V100 GPUs with data parallelism. |
| Software Dependencies | No | The paper mentions using a 'cosine learning rate schedule' and a 'BatchNorm layer' but does not specify software dependencies with version numbers (e.g., PyTorch or TensorFlow versions, or versions of other libraries). |
| Experiment Setup | Yes | We train the model for 200 epochs with batch size 128, using a cosine learning rate schedule... with an initial learning rate of 0.1... We train all MLP models with 5 different random seeds... For ImageNet experiments... 100 epochs using a cosine learning rate schedule with an initial learning rate of 0.1. Batch size is 256 for ResNet-family models and 512 for EfficientNet-B0. (See the training-setup sketch after the table.) |
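
The splits quoted in the Dataset Splits row correspond to the standard CIFAR-10 train/test partition. Below is a minimal loading sketch assuming PyTorch/torchvision; the paper does not name its framework, so this is an illustration rather than the authors' code.

```python
# Minimal sketch of the standard CIFAR-10 split cited in the paper:
# 50K training images and 10K validation images (Krizhevsky, 2009).
# torchvision is an assumption; any framework exposing the standard
# split yields the same partition.
import torchvision
import torchvision.transforms as T

transform = T.ToTensor()

train_set = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=transform)
val_set = torchvision.datasets.CIFAR10(
    root="./data", train=False, download=True, transform=transform)

assert len(train_set) == 50_000  # 50K training images
assert len(val_set) == 10_000    # 10K validation images
```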
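
The Experiment Setup row pins down the epoch count, batch size, initial learning rate, and cosine schedule, but not the optimizer or regularization. The sketch below fills in those gaps with illustrative assumptions (SGD with momentum and weight decay are not stated in the excerpt), again using PyTorch as the assumed framework.

```python
# Hedged sketch of the quoted CIFAR-10 recipe: 200 epochs, batch size 128,
# cosine learning-rate schedule with initial learning rate 0.1, repeated
# over 5 random seeds. Optimizer choice and its hyperparameters beyond
# the initial LR are assumptions, not details from the paper.
import torch
from torch.utils.data import DataLoader

def train_one_seed(model, train_set, seed, epochs=200, batch_size=128, lr=0.1):
    torch.manual_seed(seed)  # one of the 5 different random seeds
    loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
    # SGD with momentum is a common choice here but is an assumption;
    # the excerpt only fixes the initial learning rate of 0.1.
    optimizer = torch.optim.SGD(
        model.parameters(), lr=lr, momentum=0.9, weight_decay=5e-4)
    # Cosine annealing over the full training budget, as described.
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
        optimizer, T_max=epochs)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for images, labels in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(images), labels)
            loss.backward()
            optimizer.step()
        scheduler.step()  # anneal the learning rate once per epoch
```

For the ImageNet experiments, the quoted recipe changes to 100 epochs with batch size 256 (ResNet-family models) or 512 (EfficientNet-B0), with the same cosine schedule and initial learning rate of 0.1.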