Graph HyperNetworks for Neural Architecture Search
Authors: Chris Zhang, Mengye Ren, Raquel Urtasun
ICLR 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we use our proposed GHN to search for the best CNN architecture for image classification. First, we evaluate the GHN on the standard CIFAR (Krizhevsky & Hinton, 2009) and ImageNet (Russakovsky et al., 2015) architecture search benchmarks. Next, we apply GHN on an anytime prediction task where we optimize the speed-accuracy tradeoff that is key for many real-time applications. Finally, we benchmark the GHN's predicted-performance correlation and explore various factors in an ablation study. |
| Researcher Affiliation | Collaboration | Chris Zhang¹,², Mengye Ren¹,³ & Raquel Urtasun¹,³ (¹Uber Advanced Technologies Group, ²University of Waterloo, ³University of Toronto) |
| Pseudocode | No | The paper describes algorithms and mathematical formulations but does not include a dedicated pseudocode block or a clearly labeled algorithm. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code or provide a link to a code repository for the described methodology. |
| Open Datasets | Yes | We conduct our initial set of experiments on CIFAR-10 (Krizhevsky & Hinton, 2009), which contains 10 object classes and 50,000 training images and 10,000 test images of size 32×32×3. We also run our GHN algorithm on the ImageNet dataset (Russakovsky et al., 2015), which contains 1.28 million training images. |
| Dataset Splits | Yes | We conduct our initial set of experiments on CIFAR-10 (Krizhevsky & Hinton, 2009), which contains 10 object classes and 50,000 training images and 10,000 test images of size 32×32×3. We use 5,000 images split from the training set as our validation set. (A minimal data-split sketch follows this table.) |
| Hardware Specification | No | The paper mentions 'distributed training across 32 GPUs' but does not specify the exact models of these GPUs (e.g., NVIDIA A100, Tesla V100) or other specific hardware components. |
| Software Dependencies | No | The paper mentions using 'ADAM optimizer' and 'GRU cell' but does not provide specific version numbers for any software libraries, frameworks (e.g., PyTorch, TensorFlow), or programming languages used. |
| Experiment Setup | Yes | For the GNN module, we use a standard GRU cell (Cho et al., 2014) with hidden size 32 and a 2-layer MLP with hidden size 32 as the recurrent cell function U and message function M, respectively. The shared hypernetwork H(·; ϕ) is a 2-layer MLP with hidden size 64. From the results of ablation studies in Section 5.4, the GHN is trained with blocks with N = 7 nodes and T = 5 propagations under the forward-backward scheme, using the ADAM optimizer (Kingma & Ba, 2015). Training details of the final selected architectures are chosen to follow existing works and can be found in the Appendix. [...] the final candidates are trained for 600 epochs using SGD with momentum 0.9, a single-period cosine schedule with l_max = 0.025, and batch size 64. (Minimal configuration sketches follow this table.) |
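
The CIFAR-10 split quoted in the Dataset Splits row (50,000 training images with 5,000 held out for validation, plus the 10,000-image test set) can be reproduced with standard tooling. The snippet below is a minimal sketch assuming PyTorch/torchvision; it is not code released by the authors, and the random seed is illustrative.

```python
# Minimal sketch of the CIFAR-10 split described in the paper: 45,000 training
# images, a 5,000-image validation set split from the training set, and the
# standard 10,000-image test set of 32x32x3 images.
# Assumes PyTorch/torchvision; not the authors' released code.
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

transform = transforms.ToTensor()

full_train = datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
test_set = datasets.CIFAR10(root="./data", train=False, download=True, transform=transform)

# Hold out 5,000 of the 50,000 training images for validation.
train_set, val_set = random_split(
    full_train, [45_000, 5_000], generator=torch.Generator().manual_seed(0)
)

print(len(train_set), len(val_set), len(test_set))  # 45000 5000 10000
```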
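
The GNN/hypernetwork sizes quoted in the Experiment Setup row (GRU cell with hidden size 32, 2-layer MLP message function with hidden size 32, and a 2-layer MLP hypernetwork with hidden size 64) map onto standard modules as in the sketch below. This assumes PyTorch; the module names, the ReLU activations, and the generated-weight dimension `weight_dim` are illustrative placeholders, not details taken from the paper.

```python
# Minimal sketch of the GHN module sizes quoted above. Assumes PyTorch;
# the names, ReLU activations, and `weight_dim` are illustrative assumptions.
import torch.nn as nn

HIDDEN = 32  # node-embedding size used by the GNN

# Recurrent cell function U: a standard GRU cell with hidden size 32.
update_fn_U = nn.GRUCell(input_size=HIDDEN, hidden_size=HIDDEN)

# Message function M: a 2-layer MLP with hidden size 32.
message_fn_M = nn.Sequential(nn.Linear(HIDDEN, 32), nn.ReLU(), nn.Linear(32, HIDDEN))

# Shared hypernetwork H(·; ϕ): a 2-layer MLP with hidden size 64 mapping a node
# embedding to a flattened weight tensor of size `weight_dim` (placeholder).
weight_dim = 128
hypernetwork_H = nn.Sequential(nn.Linear(HIDDEN, 64), nn.ReLU(), nn.Linear(64, weight_dim))
```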
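
The final-architecture training recipe quoted above (600 epochs of SGD with momentum 0.9, a single-period cosine schedule with l_max = 0.025, batch size 64) corresponds to a standard optimizer/scheduler pairing. The sketch below assumes PyTorch; `model` and the training-loop body are placeholders, and this is not the authors' released code.

```python
# Minimal sketch of the final-candidate training setup quoted above:
# 600 epochs, SGD with momentum 0.9, single-period cosine schedule with
# l_max = 0.025, batch size 64. Assumes PyTorch; `model` is a stand-in.
import torch

EPOCHS, BATCH_SIZE, L_MAX = 600, 64, 0.025

model = torch.nn.Linear(3 * 32 * 32, 10)  # placeholder for a selected architecture

optimizer = torch.optim.SGD(model.parameters(), lr=L_MAX, momentum=0.9)
# One cosine period over all 600 epochs, annealing the learning rate from l_max toward 0.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=EPOCHS)

for epoch in range(EPOCHS):
    # ... one pass over the training loader (batch size 64) goes here ...
    scheduler.step()
```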