Measuring the Intrinsic Dimension of Objective Landscapes

Authors: Chunyuan Li, Heerad Farkhoor, Rosanne Liu, Jason Yosinski

ICLR 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We begin in Sec. 2 by defining more precisely the notion of intrinsic dimension as a measure of the difficulty of objective landscapes. In Sec. 3 we measure intrinsic dimension over a variety of network types and datasets, including MNIST, CIFAR-10, ImageNet, and several RL tasks. Based on these measurements, we draw a few insights on network behavior, and we conclude in Sec. 4. (A minimal sketch of the subspace-training procedure behind these measurements appears after the table.)
Researcher Affiliation | Collaboration | Chunyuan Li (Duke University, cl319@duke.edu); Heerad Farkhoor, Rosanne Liu, and Jason Yosinski (Uber AI Labs, {heerad,rosanne,yosinski}@uber.com)
Pseudocode | No | The paper describes methods and procedures in narrative text, but it does not include any structured pseudocode or algorithm blocks with formal labels such as "Pseudocode" or "Algorithm".
Open Source Code | No | The paper does not include any explicit statements about releasing its source code for the described methodology, nor does it provide links to a code repository.
Open Datasets | Yes | We begin in Sec. 2 by defining more precisely the notion of intrinsic dimension as a measure of the difficulty of objective landscapes. In Sec. 3 we measure intrinsic dimension over a variety of network types and datasets, including MNIST, CIFAR-10, ImageNet, and several RL tasks. Based on these measurements, we draw a few insights on network behavior, and we conclude in Sec. 4. ... We scale to larger supervised classification problems by considering CIFAR-10 (Krizhevsky & Hinton, 2009) and ImageNet (Russakovsky et al., 2015). ... DQN on CartPole: We start with a simple classic control game, CartPole-v0, in OpenAI Gym (Brockman et al., 2016).
Dataset Splits | No | The paper mentions "validation accuracy" and discusses its use in performance measurement, stating, "In supervised classification settings, validation accuracy is used as the measure of performance". However, it does not specify exact percentages or counts for the training, validation, and test splits, nor does it reference predefined splits with the details needed for reproduction.
Hardware Specification | No | The paper mentions distributed training across "4 GPUs" for the ImageNet experiments ("The training of each intrinsic dimension takes about 6 to 7 days, distributed across 4 GPUs."), but it does not specify the model or manufacturer of these GPUs, nor does it provide details about CPUs, memory, or other hardware components used.
Software Dependencies | No | The paper mentions the use of "TensorFlow's SparseTensor implementation" but does not provide a specific version number for TensorFlow or any other software libraries, dependencies, or frameworks used in the experiments. (A hedged sketch of such a sparse projection appears after the table.)
Experiment Setup | Yes | We perform a grid sweep of networks with number of hidden layers L chosen from {1, 2, 3, 4, 5} and width W chosen from {50, 100, 200, 400}. ... The same set of experiments are run with both SGD (learning rate 0.1) and ADAM (learning rate 0.001)... Table S3: Hyperparameters used in training RL tasks using ES. σ refers to the parameter perturbation noise used in ES. Default Adam parameters of β₁ = 0.9, β₂ = 0.999, ϵ = 1×10⁻⁷ were used. ... ℓ2 penalty: Various amounts of ℓ2 penalty from {10⁻², 10⁻³, 5×10⁻⁴, 10⁻⁴, 10⁻⁵, 0} are considered. ... Dropout: Various dropout rates from {0.5, 0.4, 0.3, 0.2, 0.1, 0} are considered.
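
For context on the method the Research Type and Experiment Setup rows refer to: the paper measures intrinsic dimension by training a network inside a random low-dimensional subspace of its parameter space, freezing the native parameters at a random initialization θ₀ and optimizing only a d-dimensional vector θ_d that is mapped back through a fixed random projection P. The sketch below is illustrative only; the toy quadratic loss, sizes, and variable names are assumptions for demonstration, not the authors' implementation.

```python
import numpy as np

# Toy sketch of subspace training: only theta_d (dimension d) is trained;
# the full D-dimensional parameters are theta_0 + P @ theta_d for a fixed
# random projection P. The quadratic loss below stands in for a network loss.
rng = np.random.default_rng(0)

D, d = 1000, 20                                  # toy sizes, not from the paper
theta_0 = rng.normal(size=D)                     # frozen random initialization
P = rng.normal(size=(D, d)) / np.sqrt(D)         # fixed projection, roughly unit-norm columns
theta_d = np.zeros(d)                            # the only trainable parameters

theta_star = rng.normal(size=D)                  # toy optimum for L = 0.5*||theta - theta_star||^2

def native(theta_d):
    """Map the low-dimensional trainable vector back to the full parameter space."""
    return theta_0 + P @ theta_d

for _ in range(200):
    grad_full = native(theta_d) - theta_star     # gradient in the native parameter space
    grad_sub = P.T @ grad_full                   # chain rule: gradient w.r.t. theta_d
    theta_d -= 0.5 * grad_sub                    # gradient descent confined to the subspace

print("subspace loss:", 0.5 * np.sum((native(theta_d) - theta_star) ** 2))
```

In the paper, d is swept upward and the smallest value that reaches roughly 90% of the full model's baseline performance is reported as the intrinsic dimension.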
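
The Software Dependencies row notes that the projection is implemented with TensorFlow's SparseTensor. Below is a hedged sketch, in TensorFlow 2, of building a sparse random projection and applying it with a sparse-dense matmul; the sparsity pattern, normalization, and variable names are assumptions rather than the paper's released configuration.

```python
import numpy as np
import tensorflow as tf

D, d = 1000, 20                # toy sizes
nnz_per_row = 3                # assumed sparsity, not taken from the paper

# Build a sparse random projection P of shape (D, d) with a few nonzeros per row.
rng = np.random.default_rng(0)
rows = np.repeat(np.arange(D), nnz_per_row)
cols = np.concatenate([rng.choice(d, size=nnz_per_row, replace=False) for _ in range(D)])
# Rough normalization so that columns of P have approximately unit norm.
vals = (rng.normal(size=D * nnz_per_row) / np.sqrt(D * nnz_per_row / d)).astype(np.float32)
P = tf.sparse.SparseTensor(
    indices=np.stack([rows, cols], axis=1),
    values=vals,
    dense_shape=[D, d],
)
P = tf.sparse.reorder(P)       # sparse ops expect canonically ordered indices

theta_0 = tf.constant(rng.normal(size=(D, 1)).astype(np.float32))
theta_d = tf.Variable(tf.zeros([d, 1]))   # the only trainable tensor

def native_params():
    # theta = theta_0 + P @ theta_d via a sparse-dense matmul.
    return theta_0 + tf.sparse.sparse_dense_matmul(P, theta_d)

print(native_params().shape)   # (1000, 1)
```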