No Wrong Turns: The Simple Geometry Of Neural Networks Optimization Paths

Authors: Charles Guille-Escuret, Hiroki Naganuma, Kilian Fatras, Ioannis Mitliagkas

ICML 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We conduct our experiments on image classification, semantic segmentation and language modeling across different batch sizes, network architectures, datasets, optimizers, and initialization seeds. We discuss the impact of each factor.
Researcher Affiliation	Collaboration	1 Mila, Montreal, Canada; 2 Université de Montréal, Montreal, Canada; 3 University of McGill, Montreal, Canada; 4 Dreamfold; 5 Archimedes Unit, Athena Research Center, Athens.
Pseudocode	Yes	A detailed description of our experimental protocol is provided in Algorithm 1 in Appendix A.1, and we share our code at https://github.com/Hiroki11x/LossLandscapeGeometry.
Open Source Code	Yes	Our code can be found at the link below. https://github.com/Hiroki11x/LossLandscapeGeometry
Open Datasets	Yes	The CIFAR-10 dataset (Krizhevsky et al., 2012), one of the most widely used datasets for machine learning research... ImageNet-1K (Deng et al., 2009)... WikiText-2 dataset (Logan et al., 2019)... The Vaihingen dataset (Rottensteiner et al., 2012)
Dataset Splits	Yes	The dataset is split into two segments: a training set comprising 50,000 images and a test set of 10,000 images. (CIFAR-10) ... The dataset is divided into three segments: a training set with roughly 2.08 million tokens, a validation set with approximately 217,000 tokens, and a test set with about 245,000 tokens. (WikiText-2) ... It is composed of 33 tiles and we use 11 tiles for training, 5 tiles for validation, and the remaining 17 tiles for testing our model. (Vaihingen)
Hardware Specification	Yes	For cluster A, each node is composed of NVIDIA A100 4GPU and AMD Milan 7413 @ 2.65 GHz 128M cache L3 2CPU.
Software Dependencies	Yes	As a software environment, we use Rocky Linux 8.7, gcc 9.3.0, Python 3.10.2, pytorch 1.13.1, torchvision 0.14.1, cuDNN 8.2.0, and CUDA 11.4.
Experiment Setup	Yes	The specifics concerning the batch size and the total number of epochs allocated for each dataset and corresponding model have been exhaustively tabulated in Table 1. ... Further, we present detailed settings of specific ablation experiments in Tables 2, 3, 4, and 5.
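
The Software Dependencies row pins exact versions, so a reader attempting a rerun may want to confirm that their environment matches before launching anything. The following is a minimal sketch of such a check, not part of the authors' repository. The helper names `installed_versions` and `find_mismatches` are hypothetical, and the sketch assumes the PyPI distribution names `torch` and `torchvision`. It covers only the Python-level pins; the gcc, cuDNN, and CUDA versions would need to be checked separately.

```python
import platform
from importlib import metadata

# Pins copied from the Software Dependencies row above (Python-level only).
EXPECTED = {
    "python": "3.10.2",
    "torch": "1.13.1",
    "torchvision": "0.14.1",
}


def installed_versions(names):
    """Return {name: version or None} for the interpreter and named packages."""
    found = {}
    for name in names:
        if name == "python":
            found[name] = platform.python_version()
            continue
        try:
            # Drop local build tags such as "+cu117" before comparing.
            found[name] = metadata.version(name).split("+")[0]
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def find_mismatches(expected, installed):
    """Return {name: (expected, installed)} for every pin that is not met."""
    return {
        name: (want, installed.get(name))
        for name, want in expected.items()
        if installed.get(name) != want
    }


if __name__ == "__main__":
    mismatches = find_mismatches(EXPECTED, installed_versions(EXPECTED))
    for name, (want, got) in mismatches.items():
        print(f"{name}: expected {want}, found {got}")
```

An empty result from `find_mismatches` means every listed pin is satisfied. A package that is not installed is reported with `None` as its found version.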